boston_housing

Python:

from relational_datasets import load
train, test = load("boston_housing", "v0.0.6")

Julia:

using RelationalDatasets
train, test = load("boston_housing", "v0.0.6")

Warning

The Boston Housing dataset is deprecated. It is included here for backwards compatibility and reproducing results in old publications, but should not be used for benchmarking future results.

The dataset contains a variable B which is ethically problematic. The original dataset authors assumed that Black neighbors were undesirable, and that this would affect housing prices. However, this assumption was encoded in a way that makes it impossible to analyze further.

We recommend the "California Housing" dataset instead.

"Boston Housing" is a common benchmark dataset for regression.

Task

Regression: Predict the median value of homes.

crim(+id,#varcrim).
zn(+id,#varzn).
indus(+id,#varindus).
chas(+id,#varchas).
nox(+id,#varnox).
rm(+id,#varrm).
age(+id,#varage).
dis(+id,#vardis).
rad(+id,#varrad).
tax(+id,#vartax).
ptratio(+id,#varptrat).
b(+id,#varb).
lstat(+id,#varlstat).
medv(+id).
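
The declarations above look like the mode syntax used by BoostSRL-style relational learners, where `+` conventionally marks an input variable and `#` marks a constant. Assuming that reading, the sketch below splits a few of the declarations into a predicate name and its typed arguments. `Mode` and `parse_mode` are hypothetical helpers written for this illustration; they are not part of `relational_datasets`.

```python
import re
from typing import NamedTuple, Tuple


class Mode(NamedTuple):
    """One parsed mode declaration (hypothetical helper type)."""
    predicate: str
    # Each argument is a (marker, type_name) pair, e.g. ("+", "id").
    args: Tuple[Tuple[str, str], ...]


# Matches lines shaped like `name(arg1,arg2).`
MODE_PATTERN = re.compile(r"^(\w+)\((.+)\)\.$")


def parse_mode(line: str) -> Mode:
    """Split a declaration such as `rm(+id,#varrm).` into its parts."""
    match = MODE_PATTERN.match(line.strip())
    if match is None:
        raise ValueError(f"not a mode declaration: {line!r}")
    predicate, arg_text = match.groups()
    args = []
    for raw in arg_text.split(","):
        arg = raw.strip()
        # First character is the marker (+ or #), the rest is the type name.
        args.append((arg[0], arg[1:]))
    return Mode(predicate, tuple(args))


# A few declarations copied from the list above.
lines = ["rm(+id,#varrm).", "lstat(+id,#varlstat).", "medv(+id)."]
for mode in map(parse_mode, lines):
    print(mode.predicate, mode.args)
```

Under this reading, every attribute predicate relates a home's `id` to a constant value, while the target `medv` takes only the `id`, which fits the regression task described above.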

Last update: November 9, 2022