tfr.data.parse_from_sequence_example
Parses SequenceExample to feature maps.
tfr.data.parse_from_sequence_example(
    serialized,
    list_size=None,
    context_feature_spec=None,
    example_feature_spec=None,
    size_feature_name=None,
    mask_feature_name=None,
    shuffle_examples=False,
    seed=None
)
Each FixedLenFeature in example_feature_spec is converted to a FixedLenSequenceFeature in order to parse the feature_list in a SequenceExample. We keep track of the non-trivial default_values (e.g., -1 for labels) of the features in example_feature_spec and use them to replace the parsing defaults of SequenceExample (i.e., 0 for numbers and "" for strings). Because of this complexity, only scalar non-trivial default values are allowed for numbers.
When list_size is None, the 2nd dim of the output Tensors is not fixed and varies from batch to batch. When list_size is specified as a positive integer, truncation or padding is applied so that the 2nd dim of the output Tensors equals the specified list_size.
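The truncation and padding rule can be illustrated with a minimal pure-Python sketch. This is not the library's implementation: `fit_to_list_size` is a hypothetical helper, and it assumes that padded frames are filled with the feature's default_value (for example -1 for labels).

```python
from typing import List, Sequence


def fit_to_list_size(frames: Sequence[float], list_size: int,
                     default_value: float = 0.0) -> List[float]:
    """Truncate or right-pad one example list to exactly `list_size` frames.

    Hypothetical helper for illustration; assumes padding uses `default_value`.
    """
    kept = list(frames[:list_size])                      # truncation
    padding = [default_value] * (list_size - len(kept))  # empty if already full
    return kept + padding


labels = [1.0, 0.0, 2.0]
truncated = fit_to_list_size(labels, list_size=2, default_value=-1.0)
padded = fit_to_list_size(labels, list_size=5, default_value=-1.0)
print(truncated)  # [1.0, 0.0]
print(padded)     # [1.0, 0.0, 2.0, -1.0, -1.0]
```

With list_size=None no such step would apply, so the length of the second dimension would follow the longest list in each batch.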
Example:
serialized = [
  sequence_example {
    context {
      feature {
        key: "query_length"
        value { int64_list { value: 3 } }
      }
    }
    feature_lists {
      feature_list {
        key: "unigrams"
        value {
          feature { bytes_list { value: "tensorflow" } }
          feature { bytes_list { value: ["learning" "to" "rank"] } }
        }
      }
      feature_list {
        key: "utility"
        value {
          feature { float_list { value: 0.0 } }
          feature { float_list { value: 1.0 } }
        }
      }
    }
  }
  sequence_example {
    context {
      feature {
        key: "query_length"
        value { int64_list { value: 2 } }
      }
    }
    feature_lists {
      feature_list {
        key: "unigrams"
        value {
          feature { bytes_list { value: "gbdt" } }
          feature { }
        }
      }
      feature_list {
        key: "utility"
        value {
          feature { float_list { value: 0.0 } }
          feature { float_list { value: 0.0 } }
        }
      }
    }
  }
]
We can use arguments:
context_feature_spec = {
    "query_length": tf.io.FixedLenFeature([1], tf.int64)
}
example_feature_spec = {
    "unigrams": tf.io.VarLenFeature(tf.string),
    "utility": tf.io.FixedLenFeature([1], tf.float32, default_value=[0.])
}
And the expected output is:
{
  "unigrams": SparseTensor(
    indices=array([[0, 0, 0], [0, 1, 0], [0, 1, 1], [0, 1, 2], [1, 0, 0]]),
    values=["tensorflow", "learning", "to", "rank", "gbdt"],
    dense_shape=array([2, 2, 3])),
  "utility": [[[ 0.], [ 1.]], [[ 0.], [ 0.]]],
  "query_length": [[3], [2]],
}
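To see where the sparse "unigrams" entries come from, the following pure-Python sketch derives one [batch, frame, position] index per value from the nested token lists of the example. `to_sparse` is a hypothetical helper written for this illustration and is not part of TF-Ranking.

```python
from typing import List, Tuple


def to_sparse(batch: List[List[List[str]]]) -> Tuple[List[List[int]], List[str], List[int]]:
    """Flatten [batch][frame][token] lists into (indices, values, dense_shape)."""
    indices, values = [], []
    for b, frames in enumerate(batch):
        for f, tokens in enumerate(frames):
            for p, token in enumerate(tokens):
                indices.append([b, f, p])  # exactly one index row per value
                values.append(token)
    dense_shape = [
        len(batch),
        max((len(frames) for frames in batch), default=0),
        max((len(tokens) for frames in batch for tokens in frames), default=0),
    ]
    return indices, values, dense_shape


# The "unigrams" feature lists of the two SequenceExamples; the second
# example's second frame is an empty feature and contributes no values.
unigrams = [
    [["tensorflow"], ["learning", "to", "rank"]],
    [["gbdt"], []],
]
indices, values, dense_shape = to_sparse(unigrams)
print(indices)
print(values)
print(dense_shape)
```

The empty frame still counts toward the second dimension of dense_shape even though it adds no index rows.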
Args

| Name | Description |
|---|---|
| serialized | (Tensor) A string Tensor for a batch of serialized SequenceExample. |
| list_size | (int) The number of frames to keep for a SequenceExample. If specified, truncation or padding may happen. Otherwise, the output Tensors have a dynamic list size. |
| context_feature_spec | (dict) A mapping from feature keys to FixedLenFeature or VarLenFeature values for context. |
| example_feature_spec | (dict) A mapping from feature keys to FixedLenFeature or VarLenFeature values for the list of examples. These features are stored in the feature_lists field in SequenceExample. FixedLenFeature is translated to FixedLenSequenceFeature to parse SequenceExample. Note that a missing value in the middle of a feature_list is not allowed for frames. |
| size_feature_name | (str) Name of the feature for example list sizes. Populates the feature dictionary with a tf.int32 Tensor of shape [batch_size] for this feature name. If None (the default), this feature is not generated. |
| mask_feature_name | (str) Name of the feature for example list masks. Populates the feature dictionary with a tf.bool Tensor of shape [batch_size, list_size] for this feature name. If None (the default), this feature is not generated. |
| shuffle_examples | (bool) Whether examples within a list are shuffled before the list is trimmed down to list_size elements (when the list has more than list_size elements). |
| seed | (int) A seed passed to random_ops.uniform() to shuffle examples. |
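The interplay of shuffle_examples, size_feature_name and mask_feature_name can be sketched for a single list in plain Python. This is only an illustration under stated assumptions: `trim_with_mask` is a hypothetical helper, Python's `random` module stands in for the TensorFlow random op, the reported size is assumed to count the valid frames kept after trimming, and the mask is assumed to be True for valid frames and False for padding.

```python
import random
from typing import List, Optional, Sequence, Tuple


def trim_with_mask(frames: Sequence[str], list_size: int,
                   shuffle: bool = False,
                   seed: Optional[int] = None) -> Tuple[List[str], int, List[bool]]:
    """Optionally shuffle, then trim to `list_size`; return (kept, size, mask).

    Hypothetical helper: the size and mask semantics are assumptions.
    """
    frames = list(frames)
    if shuffle:
        # Shuffle BEFORE trimming so every example can survive the cut.
        random.Random(seed).shuffle(frames)
    kept = frames[:list_size]
    size = len(kept)                             # assumed: valid frames kept
    mask = [i < size for i in range(list_size)]  # True = valid, False = padding
    return kept, size, mask


short_kept, short_size, short_mask = trim_with_mask(["a", "b"], list_size=3)
print(short_kept, short_size, short_mask)

docs = ["d1", "d2", "d3", "d4", "d5"]
first = trim_with_mask(docs, list_size=3, shuffle=True, seed=7)
second = trim_with_mask(docs, list_size=3, shuffle=True, seed=7)
print(first[0] == second[0])  # same seed, same selection
```

Without shuffling, trimming always keeps the leading frames, so examples beyond list_size would never be seen during training; shuffling first gives each example a chance to be kept.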
Returns

A mapping from feature keys to Tensor or SparseTensor.
Last updated 2023-08-18 UTC.