tfds.split_for_jax_process
Returns the subsplit of the data for the process.
tfds.split_for_jax_process(
    split: str,
    *,
    process_index: tfds.typing.Dim = None,
    process_count: tfds.typing.Dim = None,
    drop_remainder: bool = False
) -> tfds.typing.SplitArg
In a distributed setting, all processes (hosts) should get a non-overlapping,
equally sized slice of the entire data. This function takes a split as input
and extracts the slice for the current process index.
Usage:
tfds.load(..., split=tfds.split_for_jax_process('train'))
This function is an alias for:
tfds.even_splits(split, n=jax.process_count())[jax.process_index()]
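By default (drop_remainder=False), all examples are kept, so when the examples
can't be evenly distributed across processes, some slices contain one more
example than others. Pass drop_remainder=True to drop the extra examples.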
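The usage above can be wrapped in a small helper that passes the process index and count explicitly. This is only a sketch: load_shard_for_process is a hypothetical name, not part of TFDS, and it assumes jax and tensorflow_datasets are installed wherever the helper is called. The arguments it passes are the ones documented on this page.

```python
def load_shard_for_process(name, split="train", drop_remainder=False):
    """Load only the slice of `split` that belongs to the current JAX process.

    Hypothetical convenience wrapper, not part of TFDS.
    """
    # Imported lazily so that defining the helper does not require the packages.
    import jax
    import tensorflow_datasets as tfds

    # Passing the defaults explicitly; omitting both arguments should be
    # equivalent, since they default to jax.process_index() / jax.process_count().
    shard = tfds.split_for_jax_process(
        split,
        process_index=jax.process_index(),
        process_count=jax.process_count(),
        drop_remainder=drop_remainder,
    )
    return tfds.load(name, split=shard)
```

Calling the helper with drop_remainder=True on every process would give each process the same number of examples, which may matter when all processes must run the same number of steps.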
| Args | |
|---|---|
| split | Split to distribute across hosts (e.g. train[75%:], train[:800]+validation[:100]). |
| process_index | Process index in [0, process_count). Defaults to jax.process_index(). |
| process_count | Number of processes. Defaults to jax.process_count(). |
| drop_remainder | Drop examples if the number of examples in the dataset is not evenly divisible by n (here, process_count). If False, examples are distributed evenly across subsplits, starting with the first. For example, if there are 11 examples with n=3, the splits will contain [4, 4, 3] examples respectively. |
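The arithmetic behind the drop_remainder description can be sketched in plain Python. This is an illustration of the documented behavior, not the TFDS implementation; shard_sizes is a hypothetical helper, and the rule that drop_remainder=True leaves every subsplit with num_examples // n examples is an assumption inferred from the description above.

```python
from typing import List


def shard_sizes(num_examples: int, n: int, drop_remainder: bool = False) -> List[int]:
    """Return how many examples each of the n subsplits would receive."""
    if n <= 0:
        raise ValueError("n must be a positive integer")
    base, extra = divmod(num_examples, n)
    if drop_remainder:
        # Assumed: the `extra` leftover examples are dropped, so all sizes match.
        return [base] * n
    # The first `extra` subsplits each receive one additional example.
    return [base + 1 if i < extra else base for i in range(n)]


print(shard_sizes(11, 3))                       # [4, 4, 3]
print(shard_sizes(11, 3, drop_remainder=True))  # [3, 3, 3]
```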
| Returns | |
|---|---|
| subsplit | The sub-split of the given split for the current process_index. |
Last updated 2024-04-26 UTC.