View source on GitHub
Creates a Beam pipeline yielding TFDS examples.
tfds.beam.ReadFromTFDS(
    pipeline,
    builder: tfds.core.DatasetBuilder,
    split: str,
    workers_per_shard: int = 1,
    **as_dataset_kwargs
)
Each dataset shard will be processed in parallel.
Usage:
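import apache_beam as beam
import tensorflow_datasets as tfds

builder = tfds.builder('my_dataset')
# `pipeline` is an existing beam.Pipeline; it is applied implicitly via `|`.
_ = (
    pipeline
    | tfds.beam.ReadFromTFDS(builder, split='train')
    | beam.Map(tfds.as_numpy)
    | ...
)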
Use tfds.as_numpy to convert each example from tf.Tensor to NumPy arrays.
The split argument can use subsplits, e.g. 'train[:100]', but only when batch_size=None (passed through as_dataset_kwargs). Note: the order of the examples will differ from the order returned by tfds.load(split='train[:100]'), but the same examples will be used.
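The subsplit case described above can be sketched as follows. This is a minimal illustration, not an official recipe: the helper name `read_subsplit` and the step labels are hypothetical, and it assumes `batch_size=None` is forwarded unchanged to the builder's dataset call through `as_dataset_kwargs`.

```python
def read_subsplit(pipeline, builder, split="train[:100]"):
    """Hypothetical helper: reads a subsplit as a PCollection of NumPy examples."""
    # Imports are local so the helper can be defined before the (heavy)
    # dependencies are loaded.
    import apache_beam as beam
    import tensorflow_datasets as tfds

    return (
        pipeline
        # batch_size=None is required for subsplits such as 'train[:100]';
        # it is passed through **as_dataset_kwargs.
        | "ReadTFDS" >> tfds.beam.ReadFromTFDS(builder, split=split, batch_size=None)
        # Convert each example from tf.Tensor to NumPy.
        | "ToNumpy" >> beam.Map(tfds.as_numpy)
    )
```

A caller would typically invoke it inside a `with beam.Pipeline() as pipeline:` block, passing a builder such as `tfds.builder('my_dataset')`. Because shards are read in parallel, downstream steps should not rely on example order.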
| Returns |
|---|
| The PCollection containing the TFDS examples. |