The op extracts fields from a serialized protocol buffers message into tensors.
tf.io.decode_proto( bytes, message_type, field_names, output_types, descriptor_source='local://', message_format='binary', sanitize=False, name=None )
Defined in generated file:
The decode_proto op extracts fields from a serialized protocol buffers message into tensors. The fields in field_names are decoded and converted to the corresponding output_types if possible.
A message_type name must be provided to give context for the field names. The actual message descriptor can be looked up either in the linked-in descriptor pool or in a file provided by the caller using the descriptor_source attribute.
Each output tensor is a dense tensor. This means that it is padded to hold the largest number of repeated elements seen in the input minibatch. (The shape is also padded by one to prevent zero-sized dimensions.) The actual repeat counts for each example in the minibatch can be found in the sizes output. In many cases the output of decode_proto is fed immediately into tf.squeeze if missing values are not a concern. When using tf.squeeze, always pass the squeeze dimension explicitly to avoid surprises.
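To make the role of the sizes output concrete, here is a minimal sketch in plain Python (it does not call TensorFlow) of how a caller might use per-example repeat counts to drop padded entries from one field's dense values. The nested lists, the pad value of 0, and the helper name strip_padding are hypothetical stand-ins for the op's real tensor outputs.

```python
from typing import List


def strip_padding(values: List[List[int]], sizes: List[int]) -> List[List[int]]:
    """Keep only the first sizes[i] entries of each padded row."""
    if len(values) != len(sizes):
        raise ValueError("values and sizes must describe the same minibatch")
    return [row[:count] for row, count in zip(values, sizes)]


# Hypothetical minibatch of three messages with one repeated integer field.
# Rows are padded to the longest repeat count; 0 is an assumed pad value.
padded_values = [
    [7, 8, 9],
    [4, 0, 0],
    [0, 0, 0],
]
repeat_counts = [3, 1, 0]  # what the sizes output would report for this field

ragged = strip_padding(padded_values, repeat_counts)
print(ragged)  # [[7, 8, 9], [4], []]
```

Without the repeat counts, the padded zeros in the second and third rows would be indistinguishable from real field values of 0, which is why squeezing is only safe when missing values are not a concern.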
For the most part, the mapping between Proto field types and TensorFlow dtypes is straightforward. However, there are a few special cases:
A proto field that contains a submessage or group can only be converted to DT_STRING (the serialized submessage). This is to reduce the complexity of the API. The resulting string can be used as input to another instance of the decode_proto op.
TensorFlow lacks support for unsigned integers. The ops represent uint64 types as a DT_INT64 with the same two's-complement bit pattern (the obvious way). Unsigned int32 values can be represented exactly by specifying type DT_INT64, or using two's-complement if the caller specifies DT_INT32 in the output_types attribute.
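The two's-complement reinterpretation described for unsigned fields can be sketched in plain Python. This is only an illustration of the bit-pattern mapping, not code from the op itself, and the helper names are hypothetical.

```python
import struct


def uint64_to_int64_bits(value: int) -> int:
    """Reinterpret an unsigned 64-bit value as signed, keeping the bit pattern."""
    return struct.unpack("<q", struct.pack("<Q", value))[0]


def int64_bits_to_uint64(value: int) -> int:
    """Recover the original unsigned value from its signed reinterpretation."""
    return value & 0xFFFFFFFFFFFFFFFF


def uint32_to_int32_bits(value: int) -> int:
    """Reinterpret an unsigned 32-bit value as signed, keeping the bit pattern."""
    return struct.unpack("<i", struct.pack("<I", value))[0]


# Values below 2**63 are unchanged; larger ones appear as negative numbers.
print(uint64_to_int64_bits(5))                  # 5
print(uint64_to_int64_bits(2**64 - 1))          # -1
print(int64_bits_to_uint64(-1) == 2**64 - 1)    # True
print(uint32_to_int32_bits(4294967295))         # -1
```

A caller that needs the true unsigned value of a uint64 field after decoding would have to apply the inverse mapping, as int64_bits_to_uint64 does here.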
The descriptor_source attribute selects a source of protocol descriptors to consult when looking up message_type. This may be a filename containing a serialized FileDescriptorSet message, or the special value local://, in which case only descriptors linked into the code will be searched; the filename can be on any filesystem accessible to TensorFlow.
You can build a descriptor_source file using the --descriptor_set_out and --include_imports options to the protocol compiler protoc.
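As a rough sketch of that step, the helper below assembles a protoc command line that writes a descriptor set (via protoc's --descriptor_set_out flag) together with its imports. The helper name and the file names my_message.proto and my_message.desc are hypothetical; the block only builds the argument list, so it runs without protoc installed.

```python
from typing import List, Sequence


def protoc_descriptor_set_command(
    proto_files: Sequence[str],
    output_path: str,
    proto_path: str = ".",
) -> List[str]:
    """Build a protoc argv that writes a descriptor set to output_path."""
    if not proto_files:
        raise ValueError("at least one .proto file is required")
    return [
        "protoc",
        f"--proto_path={proto_path}",
        f"--descriptor_set_out={output_path}",
        "--include_imports",  # also embed descriptors of imported .proto files
        *proto_files,
    ]


cmd = protoc_descriptor_set_command(["my_message.proto"], "my_message.desc")
print(" ".join(cmd))

# To actually generate the file (requires protoc on PATH):
#   import subprocess
#   subprocess.run(cmd, check=True)
```

The resulting file path (my_message.desc in this sketch) is what would then be passed as descriptor_source.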
The local:// database only covers descriptors linked into the code via C++ libraries, not Python imports. You can link in a proto descriptor by creating a cc_library target with alwayslink=1.
Both binary and text proto serializations are supported, and can be chosen using the message_format attribute.
bytes: A Tensor of type string. Tensor of serialized protos with shape batch_shape.
message_type: A string. Name of the proto message type to decode.
field_names: A list of strings. List of strings containing proto field names. An extension field can be decoded by using its full name, e.g. EXT_PACKAGE.EXT_FIELD_NAME.
output_types: A list of tf.DTypes. List of TF types to use for the respective field in field_names.
descriptor_source: An optional string. Defaults to "local://". Either the special value local:// or a path to a file containing a serialized FileDescriptorSet.
message_format: An optional string. Defaults to "binary". Either binary or text.
sanitize: An optional bool. Defaults to False. Whether to sanitize the result or not.
name: A name for the operation (optional).
Returns a tuple of Tensor objects (sizes, values).
sizes: A Tensor of type int32.
values: A list of Tensor objects of type output_types.