tf.io.RaggedFeature
Configuration for passing a RaggedTensor input feature.
tf.io.RaggedFeature(
    dtype, value_key=None, partitions=(), row_splits_dtype=tf.dtypes.int32,
    validate=False
)
value_key specifies the feature key for a variable-length list of values, and partitions specifies zero or more feature keys for partitioning those values into higher dimensions. Each element of partitions must be one of the following:
- tf.io.RaggedFeature.RowSplits(key: string)
- tf.io.RaggedFeature.RowLengths(key: string)
- tf.io.RaggedFeature.RowStarts(key: string)
- tf.io.RaggedFeature.RowLimits(key: string)
- tf.io.RaggedFeature.ValueRowIds(key: string)
- tf.io.RaggedFeature.UniformRowLength(length: int)
Here, key is a feature key whose values are used to partition the values. Partitions are listed from outermost to innermost.
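The partition types above are different encodings of the same information: where each row begins and ends in the flat list of values. The following is a plain-Python sketch of that idea, not TensorFlow's implementation. The helper names are hypothetical, and the encodings are assumed to follow the conventions of the tf.RaggedTensor factory methods with matching names (from_row_splits, from_row_lengths, and so on).

```python
# Plain-Python illustration; every helper name here is hypothetical.

def splits_from_lengths(row_lengths):
    # Row lengths -> cumulative boundaries, starting at 0.
    splits = [0]
    for n in row_lengths:
        splits.append(splits[-1] + n)
    return splits

def splits_from_starts(row_starts, nvals):
    # Row starts lack the final boundary, which is the number of values.
    return list(row_starts) + [nvals]

def splits_from_limits(row_limits):
    # Row limits lack the initial boundary, which is always 0.
    return [0] + list(row_limits)

def splits_from_value_rowids(value_rowids, nrows):
    # value_rowids[i] is the row index of values[i]; count values per row.
    lengths = [0] * nrows
    for row in value_rowids:
        lengths[row] += 1
    return splits_from_lengths(lengths)

def splits_from_uniform_length(length, nvals):
    # Every row has the same length.
    return list(range(0, nvals + 1, length))

def rows_from_splits(values, row_splits):
    # Slice the flat values at consecutive boundaries.
    return [values[a:b] for a, b in zip(row_splits[:-1], row_splits[1:])]

values = [3, 1, 4, 1, 5, 9]
all_splits = {
    "RowSplits": [0, 2, 3, 3, 6],
    "RowLengths": splits_from_lengths([2, 1, 0, 3]),
    "RowStarts": splits_from_starts([0, 2, 3, 3], len(values)),
    "RowLimits": splits_from_limits([2, 3, 3, 6]),
    "ValueRowIds": splits_from_value_rowids([0, 0, 1, 3, 3, 3], nrows=4),
}
for name, splits in all_splits.items():
    # Each encoding yields [[3, 1], [4], [], [1, 5, 9]].
    print(name, rows_from_splits(values, splits))

uniform_rows = rows_from_splits(values, splits_from_uniform_length(2, len(values)))
print("UniformRowLength", uniform_rows)  # [[3, 1], [4, 1], [5, 9]]
```

The first five encodings take their data from a feature key in the Example, while UniformRowLength is a constant supplied in the configuration itself.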
If len(partitions) == 0 (the default), then:
- A feature from a single tf.Example is parsed into a 1D tf.Tensor.
- A feature from a batch of tf.Examples is parsed into a 2D tf.RaggedTensor, where the outer dimension is the batch dimension, and the inner (ragged) dimension is the feature length in each example.
If len(partitions) == 1, then:
- A feature from a single tf.Example is parsed into a 2D tf.RaggedTensor, where the values taken from the value_key are separated into rows using the partition key.
- A feature from a batch of tf.Examples is parsed into a 3D tf.RaggedTensor, where the outer dimension is the batch dimension, and the two inner dimensions are formed by separating the value_key values from each example into rows using that example's partition key.
If len(partitions) > 1, then:
- A feature from a single tf.Example is parsed into a tf.RaggedTensor whose rank is len(partitions)+1, and whose ragged_rank is len(partitions).
- A feature from a batch of tf.Examples is parsed into a tf.RaggedTensor whose rank is len(partitions)+2 and whose ragged_rank is len(partitions)+1, where the outer dimension is the batch dimension.
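To make the outermost-to-innermost ordering and the rank rule concrete, here is a plain-Python sketch (not TensorFlow's implementation) that nests a flat list using several row-splits partitions. The helper names split_rows, apply_partitions and nesting_depth are hypothetical. The innermost partition is applied to the flat values first, and each outer partition then groups the rows produced by the previous step.

```python
# Plain-Python illustration; helper names are hypothetical.

def split_rows(items, row_splits):
    # Group consecutive items into rows at the given boundaries.
    return [items[a:b] for a, b in zip(row_splits[:-1], row_splits[1:])]

def apply_partitions(values, partitions):
    # partitions are listed outermost first, so apply them in reverse.
    result = list(values)
    for row_splits in reversed(partitions):
        result = split_rows(result, row_splits)
    return result

def nesting_depth(obj):
    # Number of list levels along the first branch.
    depth = 0
    while isinstance(obj, list):
        depth += 1
        obj = obj[0] if obj else None
    return depth

values = [3, 1, 4, 1, 5, 9]
outer = [0, 2, 3, 4]     # groups the rows produced by `inner`
inner = [0, 2, 3, 3, 6]  # groups the flat values

nested = apply_partitions(values, [outer, inner])
print(nested)                 # [[[3, 1], [4]], [[]], [[1, 5, 9]]]
print(nesting_depth(nested))  # 3, i.e. len(partitions) + 1
```

Note that the outer splits index into rows, not values: the last entry of outer (4) equals the number of rows produced by inner.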
There is one exception: if the final (i.e., innermost) element(s) of partitions are UniformRowLengths, then the values are simply reshaped (as a higher-dimensional tf.Tensor), rather than being wrapped in a tf.RaggedTensor.
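A minimal sketch of this exception, assuming a single tf.Example with a hypothetical feature key "v" holding six values and an output name "m" chosen for illustration. With one UniformRowLength(2) partition, the parsed result is expected to be a dense tensor of shape (3, 2) rather than a tf.RaggedTensor.

```python
import tensorflow as tf

# Build one serialized tf.Example; the key "v" is an illustrative choice.
example = tf.train.Example(features=tf.train.Features(feature={
    "v": tf.train.Feature(
        int64_list=tf.train.Int64List(value=[1, 2, 3, 4, 5, 6])),
})).SerializeToString()

features = {
    # The only (and therefore innermost) partition is a UniformRowLength.
    "m": tf.io.RaggedFeature(
        value_key="v", dtype=tf.int64,
        partitions=[tf.io.RaggedFeature.UniformRowLength(2)]),
}

parsed = tf.io.parse_single_example(example, features)
m = parsed["m"]
print(isinstance(m, tf.RaggedTensor))  # False: the values were reshaped
print(m.shape)                         # (3, 2)
```

The number of values must be divisible by the uniform row length for the reshape to succeed.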
Examples
import tensorflow as tf
import google.protobuf.text_format as pbtext

# Two serialized tf.Example protos that share the feature keys "v", "s1" and "s2".
example_batch = [
    pbtext.Merge(r'''
        features {
          feature {key: "v" value {int64_list {value: [3, 1, 4, 1, 5, 9]} } }
          feature {key: "s1" value {int64_list {value: [0, 2, 3, 3, 6]} } }
          feature {key: "s2" value {int64_list {value: [0, 2, 3, 4]} } }
        }''', tf.train.Example()).SerializeToString(),
    pbtext.Merge(r'''
        features {
          feature {key: "v" value {int64_list {value: [2, 7, 1, 8, 2, 8, 1]} } }
          feature {key: "s1" value {int64_list {value: [0, 3, 4, 5, 7]} } }
          feature {key: "s2" value {int64_list {value: [0, 1, 1, 4]} } }
        }''', tf.train.Example()).SerializeToString()]
features = {
    # Zero partitions: returns 1D tf.Tensor for each Example.
    'f1': tf.io.RaggedFeature(value_key="v", dtype=tf.int64),
    # One partition: returns 2D tf.RaggedTensor for each Example.
    'f2': tf.io.RaggedFeature(value_key="v", dtype=tf.int64, partitions=[
        tf.io.RaggedFeature.RowSplits("s1")]),
    # Two partitions: returns 3D tf.RaggedTensor for each Example.
    'f3': tf.io.RaggedFeature(value_key="v", dtype=tf.int64, partitions=[
        tf.io.RaggedFeature.RowSplits("s2"),
        tf.io.RaggedFeature.RowSplits("s1")])
}
# Parse a single Example.
feature_dict = tf.io.parse_single_example(example_batch[0], features)
for (name, val) in sorted(feature_dict.items()):
    print('%s: %s' % (name, val))
# f1: tf.Tensor([3 1 4 1 5 9], shape=(6,), dtype=int64)
# f2: <tf.RaggedTensor [[3, 1], [4], [], [1, 5, 9]]>
# f3: <tf.RaggedTensor [[[3, 1], [4]], [[]], [[1, 5, 9]]]>
# Parse the whole batch: each result gains an outer batch dimension.
feature_dict = tf.io.parse_example(example_batch, features)
for (name, val) in sorted(feature_dict.items()):
    print('%s: %s' % (name, val))
# f1: <tf.RaggedTensor [[3, 1, 4, 1, 5, 9],
#                       [2, 7, 1, 8, 2, 8, 1]]>
# f2: <tf.RaggedTensor [[[3, 1], [4], [], [1, 5, 9]],
#                       [[2, 7, 1], [8], [2], [8, 1]]]>
# f3: <tf.RaggedTensor [[[[3, 1], [4]], [[]], [[1, 5, 9]]],
#                       [[[2, 7, 1]], [], [[8], [2], [8, 1]]]]>
Fields:
- dtype: Data type of the RaggedTensor. Must be one of: tf.dtypes.int64, tf.dtypes.float32, tf.dtypes.string.
- value_key: (Optional.) Key for a Feature in the input Example, whose parsed Tensor will be the resulting RaggedTensor.flat_values. If not specified, then it defaults to the key for this RaggedFeature.
- partitions: (Optional.) A list of objects specifying the row-partitioning tensors (from outermost to innermost). Each entry in this list must be one of:
  - tf.io.RaggedFeature.RowSplits(key: string)
  - tf.io.RaggedFeature.RowLengths(key: string)
  - tf.io.RaggedFeature.RowStarts(key: string)
  - tf.io.RaggedFeature.RowLimits(key: string)
  - tf.io.RaggedFeature.ValueRowIds(key: string)
  - tf.io.RaggedFeature.UniformRowLength(length: int)
  Here, key is a key for a Feature in the input Example, whose parsed Tensor will be the resulting row-partitioning tensor.
- row_splits_dtype: (Optional.) Data type for the row-partitioning tensor(s). One of int32 or int64. Defaults to int32.
- validate: (Optional.) Boolean indicating whether or not to validate that the input values form a valid RaggedTensor. Defaults to False.
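The optional fields can be combined as in the following sketch. The feature keys "tokens" and "lens" are hypothetical names chosen for illustration. Because value_key is omitted, the values are read from the feature whose key matches the dictionary key of the RaggedFeature, and row_splits_dtype is set to tf.int64 instead of the default int32.

```python
import tensorflow as tf

# One serialized tf.Example; "tokens" and "lens" are illustrative keys.
example = tf.train.Example(features=tf.train.Features(feature={
    "tokens": tf.train.Feature(
        int64_list=tf.train.Int64List(value=[3, 1, 4, 1, 5, 9])),
    "lens": tf.train.Feature(
        int64_list=tf.train.Int64List(value=[2, 1, 0, 3])),
})).SerializeToString()

features = {
    # value_key is not given, so it defaults to the dictionary key "tokens".
    "tokens": tf.io.RaggedFeature(
        dtype=tf.int64,
        partitions=[tf.io.RaggedFeature.RowLengths("lens")],
        row_splits_dtype=tf.int64,  # partition tensors stored as int64
        validate=True),             # check that "lens" is a valid partition
}

parsed = tf.io.parse_single_example(example, features)
tokens = parsed["tokens"]
print(tokens.to_list())         # [[3, 1], [4], [], [1, 5, 9]]
print(tokens.row_splits.dtype)  # <dtype: 'int64'>
```

The parsed values are available unchanged as tokens.flat_values, and the row lengths are converted into the row_splits of the resulting RaggedTensor.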
Attributes
- dtype
- value_key
- partitions
- row_splits_dtype
- validate
Child Classes
class RowLengths
class RowLimits
class RowSplits
class RowStarts
class UniformRowLength
class ValueRowIds
Last updated 2020-10-01 UTC.