tf.data.experimental.dense_to_ragged_batch
A transformation that batches ragged elements into tf.RaggedTensors.
tf.data.experimental.dense_to_ragged_batch(
batch_size,
drop_remainder=False,
row_splits_dtype=tf.dtypes.int64
)
This transformation combines multiple consecutive elements of the input
dataset into a single element.
Like tf.data.Dataset.batch, the components of the resulting element will have an additional outer dimension, which will be batch_size (or N % batch_size for the last element if batch_size does not divide the number of input elements N evenly and drop_remainder is False). If your program depends on the batches having the same outer dimension, you should set the drop_remainder argument to True to prevent the smaller batch from being produced.
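For illustration, here is a minimal sketch of the drop_remainder behavior; the dataset and variable names are made up for this page, and tensorflow is assumed to be imported as tf in all snippets:

import tensorflow as tf

# Seven scalar elements, so batch_size=2 does not divide the element count evenly.
ds = tf.data.Dataset.range(7)
ds_keep = ds.apply(tf.data.experimental.dense_to_ragged_batch(batch_size=2))
ds_drop = ds.apply(
    tf.data.experimental.dense_to_ragged_batch(batch_size=2, drop_remainder=True))
print(len(list(ds_keep)))  # 4 -- the last batch holds only the leftover element
print(len(list(ds_drop)))  # 3 -- the short final batch is dropped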
Unlike tf.data.Dataset.batch, the input elements to be batched may have different shapes:

- If an input element is a tf.Tensor whose static tf.TensorShape is fully defined, then it is batched as normal (see the dense-shape sketch after the example below).
- If an input element is a tf.Tensor whose static tf.TensorShape contains one or more axes with unknown size (i.e., shape[i]=None), then the output will contain a tf.RaggedTensor that is ragged up to any such dimension.
- If an input element is a tf.RaggedTensor or any other type, then it is batched as normal.
Example:
dataset = tf.data.Dataset.from_tensor_slices(np.arange(6))
dataset = dataset.map(lambda x: tf.range(x))
dataset.element_spec.shape
TensorShape([None])
dataset = dataset.apply(
tf.data.experimental.dense_to_ragged_batch(batch_size=2))
for batch in dataset:
print(batch)
<tf.RaggedTensor [[], [0]]>
<tf.RaggedTensor [[0, 1], [0, 1, 2]]>
<tf.RaggedTensor [[0, 1, 2, 3], [0, 1, 2, 3, 4]]>
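For contrast with the ragged output above, a minimal sketch of the first case in the list: elements whose static shape is fully defined are batched into ordinary dense tensors (the input values here are made up for illustration):

# Four elements, each with the fully defined shape [3].
dataset = tf.data.Dataset.from_tensor_slices(tf.reshape(tf.range(12), [4, 3]))
dataset = dataset.apply(
    tf.data.experimental.dense_to_ragged_batch(batch_size=2))
for batch in dataset:
  print(batch.shape)
# (2, 3)
# (2, 3)
# Each batch is a regular dense tf.Tensor, not a tf.RaggedTensor.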
Args:

batch_size: A tf.int64 scalar tf.Tensor, representing the number of consecutive elements of this dataset to combine in a single batch.

drop_remainder: (Optional.) A tf.bool scalar tf.Tensor, representing whether the last batch should be dropped in the case it has fewer than batch_size elements; the default behavior is not to drop the smaller batch.

row_splits_dtype: The dtype that should be used for the row_splits of any new ragged tensors. Existing tf.RaggedTensor elements do not have their row_splits dtype changed.

Returns:

Dataset: A Dataset.
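The row_splits_dtype argument only affects ragged tensors that this transformation creates. As a minimal sketch, assuming variable-length elements like those in the example above, 32-bit row splits can be requested as follows:

dataset = tf.data.Dataset.range(6).map(lambda x: tf.range(x))
dataset = dataset.apply(
    tf.data.experimental.dense_to_ragged_batch(
        batch_size=2, row_splits_dtype=tf.dtypes.int32))
for batch in dataset.take(1):
  print(batch.row_splits.dtype)  # <dtype: 'int32'>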