Module: tfds

tensorflow_datasets (tfds) defines a collection of datasets that are ready to use with TensorFlow.

Each dataset is defined as a tfds.core.DatasetBuilder, which encapsulates the logic to download the dataset and construct an input pipeline, and which also holds the dataset documentation (version, splits, number of examples, etc.).
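
The builder workflow described above can be sketched as follows. This is a minimal illustration, not the library's canonical recipe: the function name prepare_and_load and the default dataset name "mnist" are assumptions chosen for the example, and the import is done lazily inside the function so that nothing is downloaded until it is called.

```python
def prepare_and_load(name="mnist", split="train"):
    """Fetch a builder by name, prepare its data, and build an input pipeline.

    `prepare_and_load` and the "mnist" default are illustrative, not part of tfds.
    """
    # Imported lazily so TensorFlow Datasets is only required when this runs.
    import tensorflow_datasets as tfds

    # Look up the DatasetBuilder by its string name.
    builder = tfds.builder(name)
    # Download the source data and write it to disk (may use the network).
    builder.download_and_prepare()
    # Construct the tf.data.Dataset for the requested split.
    dataset = builder.as_dataset(split=split)
    # builder.info carries the documentation: version, splits, example counts.
    return dataset, builder.info
```

For the common case, tfds.load(name, split=split, with_info=True) wraps the same steps in a single call.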

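The main library entrypoints are tfds.builder, which fetches a tfds.core.DatasetBuilder by string name, and tfds.load, which loads the named dataset into a tf.data.Dataset.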

Modules

beam module: Beam utils.

core module: API to define datasets.

dataset_builders module: Dataset builders API.

decode module: Decoder public API.

deprecated module: Deprecated symbols.

download module: tfds.download.DownloadManager API.

features module: API defining dataset features (image, text, scalar,...).

folder_dataset module: Utils to load data coming from third-party sources directly with TFDS.

testing module: Testing utilities.

transform module: Transform API.

typing module: TFDS typing annotations.

visualization module: Visualizer utils.

Classes

class GenerateMode: Enum for how to treat pre-existing downloads and data.

class ImageFolder: Generic image classification dataset created from manual directory.

class ReadConfig: Configures input reading pipeline.

class Split: Enum for dataset splits.

class TranslateFolder: Generic text translation dataset created from manual directory.
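
As a rough sketch of how two of these classes are commonly used together, the snippet below passes a Split member and a ReadConfig to tfds.load. The function name load_shuffled_train, the dataset name "mnist", and the seed value are assumptions made for illustration; shuffle_seed is only one of several ReadConfig options.

```python
def load_shuffled_train(name="mnist", seed=42):
    """Load the train split with a fixed file-shuffling seed.

    `load_shuffled_train`, "mnist", and the seed value are illustrative only.
    """
    # Imported lazily so TensorFlow Datasets is only required when this runs.
    import tensorflow_datasets as tfds

    # ReadConfig configures the input reading pipeline; here only the seed is set.
    read_config = tfds.ReadConfig(shuffle_seed=seed)
    return tfds.load(
        name,
        split=tfds.Split.TRAIN,   # Split enumerates the standard dataset splits
        shuffle_files=True,       # shuffle input files, using the seed above
        read_config=read_config,
    )
```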

Functions

as_dataframe(...): Convert the dataset into a pandas dataframe.

as_numpy(...): Converts a tf.data.Dataset to an iterable of NumPy arrays.

benchmark(...): Benchmarks any iterable (e.g., tf.data.Dataset).

builder(...): Fetches a tfds.core.DatasetBuilder by string name.

builder_cls(...): Fetches a tfds.core.DatasetBuilder class by string name.

builder_from_directories(...): Loads a tfds.core.DatasetBuilder from the given generated dataset paths.

builder_from_directory(...): Loads a tfds.core.DatasetBuilder from the given generated dataset path.

data_source(...): Gets a data source from the named dataset.

dataset_collection(...): Instantiates a DatasetCollectionLoader.

disable_progress_bar(...): Disables Tqdm progress bar.

display_progress_bar(...): Controls whether Tqdm progress bar is enabled/disabled.

enable_progress_bar(...): Enables Tqdm progress bar.

even_splits(...): Generates a list of non-overlapping sub-splits of the same size.

is_dataset_on_gcs(...): Returns whether the dataset is available on the GCS bucket gs://tfds-data/datasets.

list_builders(...): Returns the string names of all tfds.core.DatasetBuilders.

list_dataset_collections(...): Returns the string names of all tfds.core.DatasetCollectionBuilders.

load(...): Loads the named dataset into a tf.data.Dataset.

show_examples(...): Visualize images (and labels) from an image classification dataset.

show_statistics(...): Displays the dataset statistics in a Colab/Jupyter notebook.

split_for_jax_process(...): Returns the subsplit of the data for the process.
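
Several of the functions listed above compose naturally. The sketch below combines even_splits, load, and as_numpy to read a split in equal, non-overlapping parts as NumPy data. The function name load_even_shards, the dataset name "mnist", and the choice of the "train" split are assumptions made for illustration.

```python
def load_even_shards(name="mnist", n=3):
    """Load the train split as `n` non-overlapping sub-splits of NumPy data.

    `load_even_shards` and the "mnist" default are illustrative, not part of tfds.
    """
    # Imported lazily so TensorFlow Datasets is only required when this runs.
    import tensorflow_datasets as tfds

    # Divide "train" into n non-overlapping sub-splits of the same size.
    sub_splits = tfds.even_splits("train", n=n)

    shards = []
    for sub_split in sub_splits:
        # Each sub-split is loaded as its own tf.data.Dataset...
        dataset = tfds.load(name, split=sub_split)
        # ...and wrapped so that iterating yields NumPy arrays instead of tensors.
        shards.append(tfds.as_numpy(dataset))
    return shards
```

Each element of the returned list can then be iterated independently, for example by a separate worker.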

version '4.9.3'