boston_housing
Python:
from relational_datasets import load
train, test = load("boston_housing", "v0.0.6")
Julia:
using RelationalDatasets
train, test = load("boston_housing", "v0.0.6")
Warning
The Boston Housing dataset is deprecated. It is included here for backwards compatibility and reproducing results in old publications, but should not be used for benchmarking future results.
The dataset contains a variable B which is ethically problematic. The original dataset authors assumed that Black neighbors were undesirable, and that this would affect housing prices. However, this assumption was encoded in a way that makes it impossible to analyze further.
We recommend the "California Housing" dataset instead.
See also:
- M. Carlisle, "racist data destruction?", Medium.com, retrieved 2022-11-02
- sklearn.datasets.load_boston (archived)
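The warning above says the B variable was encoded in a way that prevents further analysis. A minimal sketch of why: the archived scikit-learn description gives B as 1000(Bk - 0.63)^2, where Bk is the proportion of Black residents by town. That formula is not stated on this page and is taken here as an assumption; the function names are hypothetical. Because the transformation is a parabola, two different proportions can map to the same B value, so Bk cannot always be recovered.

```python
import math


def b_from_bk(bk: float) -> float:
    """Assumed encoding (from the archived scikit-learn description):
    B = 1000 * (Bk - 0.63)^2, with Bk a proportion in [0, 1]."""
    return 1000.0 * (bk - 0.63) ** 2


def bk_candidates(b: float) -> list[float]:
    """Return every proportion in [0, 1] that maps to the given B value."""
    root = math.sqrt(b / 1000.0)
    return [bk for bk in (0.63 - root, 0.63 + root) if 0.0 <= bk <= 1.0]


# Two different proportions produce (numerically) the same B value.
print(b_from_bk(0.53), b_from_bk(0.73))

# Inverting a small B value is ambiguous: two candidates come back.
print(bk_candidates(10.0))

# Larger B values have only one candidate inside [0, 1].
print(bk_candidates(300.0))
```

Under this assumed encoding, only B values above roughly 136.9 (the value at Bk = 1) identify a single proportion; every smaller value is ambiguous.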
"Boston Housing" is a common benchmark dataset for regression.
Task
Regression: Predict the median value of homes.
crim(+id,#varsrim).
zn(+id,#varzn).
indus(+id,#varindus).
chas(+id,#varchas).
nox(+id,#varnox).
rm(+id,#varrm).
age(+id,#varage).
dis(+id,#vardis).
rad(+id,#varrad).
tax(+id,#vartax).
ptratio(+id,#varptrat).
b(+id,#varb).
lstat(+id,#varlstat).
medv(+id).
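The declarations above read like BoostSRL-style modes, where `+` marks an input variable and `#` marks a constant. Assuming that reading, the sketch below splits each declaration into its predicate name and typed arguments, which can help when checking the listing against the loaded data. The helper `parse_mode` and the regular expression are hypothetical and not part of `relational_datasets`.

```python
import re

# Mode declarations copied verbatim from the listing above.
MODES = """\
crim(+id,#varsrim).
zn(+id,#varzn).
indus(+id,#varindus).
chas(+id,#varchas).
nox(+id,#varnox).
rm(+id,#varrm).
age(+id,#varage).
dis(+id,#vardis).
rad(+id,#varrad).
tax(+id,#vartax).
ptratio(+id,#varptrat).
b(+id,#varb).
lstat(+id,#varlstat).
medv(+id).
"""

# One declaration: predicate name, parenthesised argument list, trailing period.
MODE_PATTERN = re.compile(r"^(\w+)\(([^)]*)\)\.$")


def parse_mode(line):
    """Split a mode declaration into (predicate, [(marker, type_name), ...])."""
    match = MODE_PATTERN.match(line.strip())
    if match is None:
        raise ValueError(f"not a mode declaration: {line!r}")
    predicate, raw_args = match.groups()
    # The first character of each argument is the marker; the rest is the type name.
    args = [(arg.strip()[0], arg.strip()[1:]) for arg in raw_args.split(",")]
    return predicate, args


parsed = [parse_mode(line) for line in MODES.splitlines() if line.strip()]
for predicate, args in parsed:
    print(predicate, args)
```

Every feature predicate pairs an example identifier with one constant-valued attribute, while the target `medv` takes only the identifier.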
Last update: November 9, 2022