An Example is a mostly-normalized data format for storing data for training and inference.
It contains a key-value store, `features`, where each key (a string) maps to a
`tf.train.Feature` message. This flexible and compact format allows the
storage of large amounts of typed data, but requires that the data shape
and use be determined by the configuration files and parsers that are used to
read and write this format.
In TensorFlow, `Example`s are read in row-major
format, so any configuration that describes data of rank 2 or above
should keep this in mind. For example, to store an `M x N` matrix of bytes,
the `tf.train.BytesList` must contain `M*N` bytes, with `M` rows of `N` contiguous values
each. That is, the `BytesList` value must store the matrix as:
.... row 0 .... // .... row 1 .... // ........... // ... row M-1 ....
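The row-major layout described above can be sketched in plain Python. This is a minimal illustration, not TensorFlow code: the helper names `flatten_row_major` and `element_at` are hypothetical, and the matrix is modeled as a list of rows of small integers standing in for byte values.

```python
def flatten_row_major(matrix):
    """Concatenate the rows of an M x N matrix into one flat list of M*N values."""
    flat = []
    for row in matrix:
        flat.extend(row)
    return flat


def element_at(flat, n_cols, i, j):
    """Recover element (i, j) from a flat row-major list with n_cols columns."""
    return flat[i * n_cols + j]


# A 2 x 3 matrix of byte values (M = 2 rows, N = 3 columns).
matrix = [
    [1, 2, 3],
    [4, 5, 6],
]
flat = flatten_row_major(matrix)  # row 0 followed by row 1
payload = bytes(flat)             # M*N bytes in row-major order
```

Because the shape is not stored alongside the values, a reader needs `N` from its own configuration to map a flat offset back to a row and column, as `element_at` does here.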
An Example for a movie recommendation application:
features {
  feature {
    key: "age"
    value { float_list {
      value: 29.0
    } }
  }
  feature {
    key: "movie"
    value { bytes_list {
      value: "The Shawshank Redemption"
      value: "Fight Club"
    } }
  }
  feature {
    key: "movie_ratings"
    value { float_list {
      value: 9.0
      value: 9.7
    } }
  }
  feature {
    key: "suggestion"
    value { bytes_list {
      value: "Inception"
    } }
  }
  # Note that this feature exists to be used as a label in training.
  # E.g., if training a logistic regression model to predict purchase
  # probability in our learning tool we would set the label feature to
  # "suggestion_purchased".
  feature {
    key: "suggestion_purchased"
    value { float_list {
      value: 1.0
    } }
  }
  # Similar to "suggestion_purchased" above this feature exists to be used
  # as a label in training.
  # E.g., if training a linear regression model to predict purchase
  # price in our learning tool we would set the label feature to
  # "purchase_price".
  feature {
    key: "purchase_price"
    value { float_list {
      value: 9.99
    } }
  }
}
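A message like the one above could be built in Python roughly as follows. This is a sketch that assumes TensorFlow 2.x is installed and imported as `tf`; the helper names `float_feature` and `bytes_feature` are illustrative and not part of the API. Only three of the features are shown to keep it short.

```python
import tensorflow as tf


def float_feature(values):
    """Wrap a list of floats in a tf.train.Feature (helper name is illustrative)."""
    return tf.train.Feature(float_list=tf.train.FloatList(value=values))


def bytes_feature(values):
    """Wrap a list of strings in a tf.train.Feature, encoded as UTF-8 bytes."""
    encoded = [v.encode("utf-8") for v in values]
    return tf.train.Feature(bytes_list=tf.train.BytesList(value=encoded))


example = tf.train.Example(
    features=tf.train.Features(
        feature={
            "age": float_feature([29.0]),
            "movie": bytes_feature(["The Shawshank Redemption", "Fight Club"]),
            "movie_ratings": float_feature([9.0, 9.7]),
        }
    )
)

# Serialize to the binary protobuf form, then parse it back.
serialized = example.SerializeToString()
restored = tf.train.Example.FromString(serialized)
movies = list(restored.features.feature["movie"].bytes_list.value)
```

Note that `float_list` values are stored as 32-bit floats, so a value such as `9.7` may not compare exactly equal to the Python float after a round trip.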
A conformant Example dataset obeys the following conventions:
- If a Feature `K` exists in one example with data type `T`, it must be of type `T` in all other examples when present. It may be omitted.
- The number of instances of Feature `K` list data may vary across examples, depending on the requirements of the model.
- If a Feature `K` doesn't exist in an example, a `K`-specific default will be used, if configured.
- If a Feature `K` exists in an example but contains no items, the intent is considered to be an empty tensor and no default will be used.
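The conventions above can be illustrated with a toy model in plain Python. This is not the TensorFlow parser: examples are modeled as dictionaries mapping feature names to `(type_name, values)` pairs, and `check_types` and `read_feature` are hypothetical helpers written only for this sketch.

```python
def check_types(examples):
    """Return True if every feature key has one data type across all examples.

    A key may be absent from some examples; only the examples that carry it
    are compared.
    """
    seen = {}
    for example in examples:
        for key, (type_name, _values) in example.items():
            if seen.setdefault(key, type_name) != type_name:
                return False
    return True


def read_feature(example, key, defaults):
    """Read a feature's values, applying a default only when the key is absent."""
    if key not in example:
        # Missing feature: use the key-specific default if one is configured.
        return defaults.get(key)
    # Present feature: return its values as-is, even when the list is empty.
    return example[key][1]


examples = [
    {"age": ("float", [29.0]), "movie": ("bytes", [b"Fight Club", b"Inception"])},
    {"movie": ("bytes", [b"Inception"])},            # "age" omitted, shorter "movie" list
    {"age": ("float", []), "movie": ("bytes", [])},  # present but empty
]
defaults = {"age": [0.0]}
```

In this model, `read_feature(examples[1], "age", defaults)` falls back to the configured default because the key is missing, while `read_feature(examples[2], "age", defaults)` yields an empty list because the key is present with no items.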
| Attributes | |
|---|---|
| `features` | `Features features` |