Creates a Beam pipeline yielding TFDS examples.
tfds.beam.ReadFromTFDS(
    pipeline,
    builder: tfds.core.DatasetBuilder,
    split: str,
    workers_per_shard: int = 1,
    **as_dataset_kwargs
)
Each dataset shard will be processed in parallel.
Usage:
import apache_beam as beam
import tensorflow_datasets as tfds

builder = tfds.builder('my_dataset')
_ = (
    pipeline
    | tfds.beam.ReadFromTFDS(builder, split='train')
    | beam.Map(tfds.as_numpy)
    | ...
)
Use tfds.as_numpy to convert each example from tf.Tensor to NumPy.
The split argument can use subsplits, e.g. 'train[:100]', but only when batch_size=None (passed in as_dataset_kwargs). Note: the order of the examples will differ from the order produced by tfds.load(split='train[:100]'), but the same examples will be used.
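The subsplit behaviour described above could be exercised roughly as follows. This is a minimal sketch, not part of the TFDS API: the helper names `build_subsplit_read` and `same_examples_ignoring_order` and the default dataset name `'my_dataset'` are hypothetical, and the sketch assumes the dataset has already been prepared. Because the Beam read does not preserve order, any comparison against `tfds.load(split='train[:100]')` should ignore ordering.

```python
from collections import Counter


def build_subsplit_read(pipeline, dataset_name='my_dataset', split='train[:100]'):
    """Attach a subsplit read to an existing Beam pipeline (hypothetical helper)."""
    # Imported lazily so this sketch can be loaded without Beam/TFDS installed.
    import apache_beam as beam
    import tensorflow_datasets as tfds

    # Assumption: the dataset has already been downloaded and prepared.
    builder = tfds.builder(dataset_name)
    return (
        pipeline
        # batch_size=None is forwarded through **as_dataset_kwargs;
        # subsplits are only supported for unbatched reads.
        | tfds.beam.ReadFromTFDS(builder, split=split, batch_size=None)
        | beam.Map(tfds.as_numpy)
    )


def same_examples_ignoring_order(keys_a, keys_b):
    """Return True if both iterables hold the same keys with the same counts."""
    # Compare as multisets, since the Beam read may emit examples in any order.
    return Counter(keys_a) == Counter(keys_b)
```

Here `keys_a` and `keys_b` stand for any hashable per-example key (for instance an ID feature, if the dataset provides one) collected from the Beam output and from `tfds.load` respectively.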
Returns:
The PCollection containing the TFDS examples.