tfds.core.SplitInfo

Wraps proto.SplitInfo with an additional property.

tfds.core.SplitInfo(
    name: str,
    shard_lengths: List[int],
    num_bytes: int,
    filename_template: Optional[naming.ShardedFileTemplate] = None,
    statistics: statistics_pb2.DatasetFeatureStatistics = dataclasses.field(default_factory=statistics_pb2.DatasetFeatureStatistics)
)

Attributes
`name`	Name of the split (e.g. `train`, `test`, ...)
`shard_lengths`	List containing the number of examples stored in each shard file.
`filename_template`	The template used to create sharded filenames.
`num_examples`	Total number of examples (`sum(shard_lengths)`)
`num_shards`	Number of files (`len(shard_lengths)`)
`num_bytes`	Size of the files (in bytes)
`statistics`	Additional statistics of the split.
`file_instructions`	Returns the list of dict(filename, take, skip). This allows for creating your own `tf.data.Dataset` using the low-level TFDS values:

    file_instructions = info.splits['train[75%:]'].file_instructions
    instruction_ds = tf.data.Dataset.from_generator(
        lambda: file_instructions,
        output_types={
            'filename': tf.string,
            'take': tf.int64,
            'skip': tf.int64,
        },
    )
    ds = instruction_ds.interleave(
        lambda f: tf.data.TFRecordDataset(
            f['filename']).skip(f['skip']).take(f['take'])
    )

When `skip=0` and `take=-1`, the full shard is read, so the `.skip(...)` and `.take(...)` calls can be omitted.
`filenames`	Returns the list of filenames.
`filepaths`	All the paths for all the files that are part of this split.
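
The relationship between `shard_lengths`, the derived counts, and the `skip`/`take` values in `file_instructions` can be illustrated with a small pure-Python sketch. This is not the TFDS implementation: `SplitSketch`, `sketch_file_instructions`, and the filename pattern are hypothetical names made up for illustration, and the use of `take=-1` for "read to the end of the shard" is an assumption based on the description above.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class SplitSketch:
    """Hypothetical stand-in for SplitInfo; not the TFDS class."""
    name: str
    shard_lengths: List[int]
    num_bytes: int

    @property
    def num_examples(self) -> int:
        # Total number of examples across all shards.
        return sum(self.shard_lengths)

    @property
    def num_shards(self) -> int:
        # One shard per entry in shard_lengths.
        return len(self.shard_lengths)


def sketch_file_instructions(
    split: SplitSketch, start: int, stop: int
) -> List[Dict[str, object]]:
    """Maps the absolute example range [start, stop) onto per-shard skip/take."""
    instructions = []
    offset = 0  # Absolute index of the first example in the current shard.
    for index, length in enumerate(split.shard_lengths):
        lo = max(start, offset)
        hi = min(stop, offset + length)
        if lo < hi:  # The requested range overlaps this shard.
            skip = lo - offset
            take = hi - lo
            instructions.append({
                # Hypothetical filename pattern, for illustration only.
                "filename": f"{split.name}-{index:05d}-of-{split.num_shards:05d}",
                "skip": skip,
                # Assumption: -1 means "read to the end of the shard".
                "take": -1 if skip + take == length else take,
            })
        offset += length
    return instructions


split = SplitSketch(name="train", shard_lengths=[40, 40, 40], num_bytes=4096)
start = split.num_examples * 75 // 100  # Rough analogue of 'train[75%:]'.
instructions = sketch_file_instructions(split, start, split.num_examples)
```

With three shards of 40 examples each, the last 25% of the split falls entirely in the final shard, so the sketch yields a single instruction that skips the first 10 examples of that shard and reads the rest.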

Methods

`from_proto`

@classmethod
from_proto(
    proto: proto_lib.SplitInfo, filename_template: naming.ShardedFileTemplate
) -> 'SplitInfo'

Returns a SplitInfo class instance from a SplitInfo proto.

`replace`

replace(
    **kwargs
) -> 'SplitInfo'

Returns a copy of the SplitInfo with updated attributes.
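
As a rough illustration of copy-with-updates semantics, the sketch below uses the standard library's `dataclasses.replace` on a hypothetical stand-in class. `SplitSketch` is not the TFDS class, and whether `SplitInfo.replace` is implemented this way is an assumption; the sketch only shows the expected behavior: the original object is left unchanged and the copy differs in the given attributes.

```python
import dataclasses
from typing import List


@dataclasses.dataclass(frozen=True)
class SplitSketch:
    """Hypothetical stand-in for SplitInfo; not the TFDS class."""
    name: str
    shard_lengths: List[int]
    num_bytes: int

    def replace(self, **kwargs) -> "SplitSketch":
        # Build a new instance, overriding only the fields passed in kwargs.
        return dataclasses.replace(self, **kwargs)


original = SplitSketch(name="train", shard_lengths=[40, 40, 40], num_bytes=4096)
renamed = original.replace(name="validation")
```

Here `renamed` carries the new `name` while keeping the same `shard_lengths` and `num_bytes`, and `original` still has `name == "train"`.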

`to_proto`

to_proto() -> proto_lib.SplitInfo

Class Variables
filename_template	`None`
