TFDS CLI

TFDS CLI is a command-line tool that provides various commands to easily work with TensorFlow Datasets.

%%capture
%env TF_CPP_MIN_LOG_LEVEL=1  # Disable logs on TF import

Installation

The CLI tool is installed with tensorflow-datasets (or tfds-nightly).

pip install -q tfds-nightly
tfds --version

To list all CLI commands:

tfds --help

The tfds new command helps you get started writing a new Python dataset by creating a <dataset_name>/ directory containing default implementation files.

Usage:

tfds new my_dataset

tfds new my_dataset will create:

ls -1 my_dataset/

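The optional flag --data_format can be used to generate a format-specific dataset builder (e.g., conll). If no data format is given, it generates a template for a standard tfds.core.GeneratorBasedBuilder. Refer to the documentation for details on the available format-specific dataset builders.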
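As a minimal sketch of the --data_format flag, the snippet below requests the conll template instead of the default one. The dataset name my_conll_dataset is a hypothetical placeholder chosen for illustration, and the guard around the command is only there so the snippet does nothing harmful when the tfds CLI is not installed.

```shell
# Hypothetical dataset name, used for illustration only.
DATASET_NAME=my_conll_dataset

# Generate a CoNLL-specific builder template rather than the standard
# tfds.core.GeneratorBasedBuilder template.
# The guard skips the command when the tfds CLI is not on PATH.
if command -v tfds >/dev/null 2>&1; then
  tfds new "$DATASET_NAME" --data_format=conll
  # Inspect the generated files.
  ls -1 "$DATASET_NAME"/
else
  echo "tfds CLI not found; install tensorflow-datasets or tfds-nightly first"
fi
```

Omitting --data_format in the command above should fall back to the standard template, as described in the preceding paragraph.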

See our writing dataset guide for more info.

Available options:

tfds new --help

Use tfds build <my_dataset> to generate a new dataset. <my_dataset> can be:

A path to a dataset/ folder or dataset.py file (empty for the current directory):
- tfds build datasets/my_dataset/
- cd datasets/my_dataset/ && tfds build
- cd datasets/my_dataset/ && tfds build my_dataset
- cd datasets/my_dataset/ && tfds build my_dataset.py
A registered dataset:
- tfds build mnist
- tfds build my_dataset --imports my_project.datasets

Note: tfds build has useful flags to help with prototyping and debugging. See the Debug & tests section below.

Available options:

tfds build --help