Loads the Boston Housing dataset, which is taken from the StatLib library
maintained at Carnegie Mellon University.
Each sample contains 13 attributes of houses at a different location around
the Boston suburbs in the late 1970s. Targets are the median values of
the houses at each location, in thousands of dollars (k$).
The attributes themselves are defined on the
StatLib website (http://lib.stat.cmu.edu/datasets/boston).
Args
path: Path where to cache the dataset locally
(relative to ~/.keras/datasets). Defaults to 'boston_housing.npz'.
test_split: Fraction of the data to reserve as the test set. Defaults to 0.2.
seed: Random seed for shuffling the data
before computing the test split. Defaults to 113.
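The interaction between test_split and seed can be illustrated with a minimal NumPy sketch. It assumes the data is permuted once with a seeded generator and the trailing test_split fraction is held out. The helper name shuffle_and_split and the synthetic arrays are hypothetical, and the random generator used here is not guaranteed to reproduce the ordering the library produces for the same seed.

```python
import numpy as np

def shuffle_and_split(x, y, test_split=0.2, seed=113):
    """Shuffle x and y with one shared permutation, then hold out the tail."""
    rng = np.random.default_rng(seed)        # seeded, so the split is repeatable
    order = rng.permutation(len(x))          # one permutation keeps x and y paired
    x, y = x[order], y[order]
    n_train = int(len(x) * (1 - test_split)) # number of samples kept for training
    return (x[:n_train], y[:n_train]), (x[n_train:], y[n_train:])

# Synthetic stand-in data (not the real dataset): 50 samples, 13 features each.
x = np.arange(50 * 13, dtype=float).reshape(50, 13)
y = np.arange(50, dtype=float)

(x_train, y_train), (x_test, y_test) = shuffle_and_split(x, y)
print(x_train.shape, x_test.shape)
```

Because the shuffle happens before the split, changing seed changes which samples land in the test set, while changing test_split only changes how many do.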
Returns
Tuple of NumPy arrays: (x_train, y_train), (x_test, y_test).
x_train, x_test: NumPy arrays with shape (num_samples, 13)
containing either the training samples (for x_train),
or test samples (for x_test).
y_train, y_test: NumPy arrays of shape (num_samples,) containing the
target scalars. The targets are float scalars typically between 10 and
50 that represent the home prices in k$.
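A short sketch of how the returned structure might be unpacked and checked against the shapes described above. The helpers check_split and load_and_check are hypothetical names introduced for illustration. Only load_data and its three arguments come from this page, and the real call needs TensorFlow installed and the dataset file to be available.

```python
import numpy as np

def check_split(train, test, num_features=13):
    """Check two (x, y) pairs against the documented shapes; return sample counts."""
    counts = {}
    for name, (x, y) in (("train", train), ("test", test)):
        x, y = np.asarray(x), np.asarray(y)
        if x.ndim != 2 or x.shape[1] != num_features:
            raise ValueError(
                f"{name}: expected (num_samples, {num_features}), got {x.shape}"
            )
        if y.shape != (x.shape[0],):
            raise ValueError(
                f"{name}: expected targets of shape ({x.shape[0]},), got {y.shape}"
            )
        counts[name] = x.shape[0]
    return counts

def load_and_check():
    """Load the real dataset and validate it (needs TensorFlow to be installed)."""
    import tensorflow as tf  # imported lazily so check_split works without it
    train, test = tf.keras.datasets.boston_housing.load_data(
        path="boston_housing.npz", test_split=0.2, seed=113
    )
    return check_split(train, test)
```

check_split can be exercised on any pair of arrays. load_and_check is left uncalled here so that the block does not depend on TensorFlow being present.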
Signature
tf.keras.datasets.boston_housing.load_data(path='boston_housing.npz', test_split=0.2, seed=113)
Warning
This dataset has an ethical problem: the authors of this dataset included a
variable, "B", that may appear to assume that racial self-segregation
influences house prices. As such, we strongly discourage the use of this
dataset, unless in the context of illustrating ethical issues in data science
and machine learning.
View source on GitHub: https://github.com/keras-team/keras/tree/v3.3.3/keras/src/datasets/boston_housing.py#L7-L70
Last updated 2024-06-07 UTC.