A Dataset of fixed-length records from one or more binary files.

Inherits From: Dataset

The tf.data.FixedLengthRecordDataset reads fixed-length records from binary files and creates a dataset in which each record becomes one element. Each binary file can have a fixed-length header and a fixed-length footer, both of which are skipped.
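The skipping rule described above can be sketched in plain Python. This is only an illustration of the semantics, not how TensorFlow implements the reader; the helper `read_fixed_length_records` and the sample file are hypothetical, and the sketch silently ignores a trailing partial record, which the real dataset may handle differently.

```python
import os
import tempfile


def read_fixed_length_records(path, record_bytes, header_bytes=0, footer_bytes=0):
    """Yield each fixed-length record in `path`, skipping header and footer."""
    with open(path, "rb") as f:
        data = f.read()
    # Drop the first `header_bytes` and the last `footer_bytes` bytes.
    body = data[header_bytes:len(data) - footer_bytes]
    # Only whole records are yielded in this sketch.
    for start in range(0, len(body) - record_bytes + 1, record_bytes):
        yield body[start:start + record_bytes]


# Hypothetical sample file: 6-byte header, three 2-byte records, 6-byte footer.
sample_path = os.path.join(tempfile.mkdtemp(), "sample.bin")
with open(sample_path, "wb") as f:
    f.write(b"HEADER" + b"aabbcc" + b"FOOTER")

records = list(read_fixed_length_records(
    sample_path, record_bytes=2, header_bytes=6, footer_bytes=6))
print(records)  # [b'aa', b'bb', b'cc']
```

Each yielded value corresponds to one dataset element: a byte string of exactly `record_bytes` bytes.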

For example, suppose we have 2 files "fixed_length0.bin" and "fixed_length1.bin" with the following content:

with open('/tmp/fixed_length0.bin', 'wb') as f:
    f.write(b'HEADER012345FOOTER')
with open('/tmp/fixed_length1.bin', 'wb') as f:
    f.write(b'HEADER6789abFOOTER')

We can construct a FixedLengthRecordDataset from them as follows:

dataset1 = tf.data.FixedLengthRecordDataset(
    filenames=['/tmp/fixed_length0.bin', '/tmp/fixed_length1.bin'],
    record_bytes=2, header_bytes=6, footer_bytes=6)

The elements of the dataset are:

for element in dataset1.as_numpy_iterator():
    print(element)

filenames A tf.string tensor or tf.data.Dataset containing one or more filenames.
record_bytes A tf.int64 scalar representing the number of bytes in each record.
header_bytes (Optional.) A tf.int64 scalar representing the number of bytes to skip at the start of a file.
footer_bytes (Optional.) A tf.int64 scalar representing the number of bytes to ignore at the end of a file.
buffer_size (Optional.) A tf.int64 scalar representing the number of bytes to buffer when reading.
compression_type (Optional.) A tf.string scalar evaluating to one of "" (no compression), "ZLIB", or "GZIP".
num_parallel_reads (Optional.) A tf.int64 scalar representing the number of files to read in parallel. If greater than one, the records of the files read in parallel are output in an interleaved order. If your input pipeline is I/O bottlenecked, consider setting this parameter to a value greater than one to parallelize the I/O. If None, files are read sequentially.
name (Optional.) A name for the operation.
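To make the interaction between compression_type and the byte offsets concrete, here is a minimal standard-library sketch of reading fixed-length records from a GZIP file. It assumes that header_bytes, footer_bytes, and record_bytes are counted in decompressed bytes; the helper `read_gzip_records` and the sample payload are hypothetical and do not call TensorFlow.

```python
import gzip
import os
import tempfile

# Hypothetical payload: 4-byte header, three 3-byte records, 2-byte footer.
payload = b"HEAD" + b"r01r02r03" + b"FT"
gz_path = os.path.join(tempfile.mkdtemp(), "sample.bin.gz")
with gzip.open(gz_path, "wb") as f:
    f.write(payload)


def read_gzip_records(path, record_bytes, header_bytes=0, footer_bytes=0):
    """Decompress `path`, then slice the decompressed bytes into records."""
    with gzip.open(path, "rb") as f:
        data = f.read()
    # Offsets are applied after decompression (an assumption of this sketch).
    body = data[header_bytes:len(data) - footer_bytes]
    return [body[i:i + record_bytes]
            for i in range(0, len(body) - record_bytes + 1, record_bytes)]


gz_records = read_gzip_records(
    gz_path, record_bytes=3, header_bytes=4, footer_bytes=2)
print(gz_records)  # [b'r01', b'r02', b'r03']
```

With the real dataset, the equivalent call would pass compression_type="GZIP" along with the same three byte counts.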

element_spec The type specification of an element of this dataset.

dataset = tf.data.Dataset.from_tensor_slices([1, 2, 3])
dataset.element_spec
TensorSpec(shape=(), dtype=tf.int32, name=None)

For more information, read this guide.



Applies a transformation function to this dataset.

apply enables chaining of custom Dataset transformations, which are represented as functions that take one Dataset