# tfr.data.build_ranking_dataset_with_parsing_fn
Builds a ranking `tf.data.Dataset` using the provided `parsing_fn`.
```python
tfr.data.build_ranking_dataset_with_parsing_fn(
    file_pattern,
    parsing_fn,
    batch_size,
    reader=tfr.keras.pipeline.DatasetHparams.dataset_reader,
    reader_args=None,
    num_epochs=None,
    shuffle=True,
    shuffle_buffer_size=10000,
    shuffle_seed=None,
    prefetch_buffer_size=tf.data.experimental.AUTOTUNE,
    reader_num_threads=tf.data.experimental.AUTOTUNE,
    sloppy_ordering=False,
    drop_final_batch=False,
    num_parser_threads=tf.data.experimental.AUTOTUNE,
    from_file_list=False
)
```
| Args | |
|---|---|
| `file_pattern` | (str \| list(str)) List of files or patterns of file paths containing serialized data. See `tf.gfile.Glob` for pattern rules. |
| `parsing_fn` | (function) A function with the signature `parsing_fn(serialized)`, where `serialized` is a batch of serialized records. Users can customize this for their own data formats. |
| `batch_size` | (int) Number of records to combine in a single batch. |
| `reader` | A function or class that can be called with a `filenames` tensor and (optional) `reader_args` and returns a `Dataset`. Defaults to `tf.data.TFRecordDataset`. |
| `reader_args` | (list) Additional arguments to pass to the reader class. |
| `num_epochs` | (int) Number of times to read through the dataset. If `None`, cycles through the dataset forever. Defaults to `None`. |
| `shuffle` | (bool) Indicates whether the input should be shuffled. Defaults to `True`. |
| `shuffle_buffer_size` | (int) Buffer size of the ShuffleDataset. A large capacity ensures better shuffling but increases memory usage and startup time. |
| `shuffle_seed` | (int) Randomization seed to use for shuffling. |
| `prefetch_buffer_size` | (int) Number of feature batches to prefetch to improve performance. The recommended value is the number of batches consumed per training step. Defaults to auto-tune. |
| `reader_num_threads` | (int) Number of threads used to read records. If greater than 1, the results are interleaved. Defaults to auto-tune. |
| `sloppy_ordering` | (bool) If `True`, reading performance is improved at the cost of non-deterministic ordering. If `False`, the order of elements produced is deterministic prior to shuffling (elements are still randomized if `shuffle=True`; note that if the seed is set, the order of elements after shuffling is deterministic). Defaults to `False`. |
| `drop_final_batch` | (bool) If `True` and the batch size does not evenly divide the input dataset size, the final smaller batch is dropped. Defaults to `False`. If `True`, the batch size can be statically inferred. |
| `num_parser_threads` | (int) Optional number of threads to use with `dataset.map()` when invoking `parsing_fn`. Defaults to auto-tune. |
| `from_file_list` | (bool) If `True`, `file_pattern` is taken as a list of filenames instead of a pattern or list of patterns. |
| Returns | |
|---|---|
| A dataset of `dict` elements. Each `dict` maps feature keys to `Tensor` or `SparseTensor` objects. |
Last updated 2023-08-18 UTC.