If users keep data in tf.Example format, they need to call tf.parse_example
with a proper feature spec. There are two main things that this utility helps:
Users need to combine parsing spec of features with labels and weights
(if any) since they are all parsed from same tf.Example instance. This
utility combines these specs.
It is difficult to map expected label by a classifier such as
DNNClassifier to corresponding tf.parse_example spec. This utility encodes
it by getting related information from users (key, dtype).
Example output of parsing spec:
# Define features and transformationsfeature_b=tf.feature_column.numeric_column(...)feature_c_bucketized=tf.feature_column.bucketized_column(tf.feature_column.numeric_column("feature_c"),...)feature_a_x_feature_c=tf.feature_column.crossed_column(columns=["feature_a",feature_c_bucketized],...)feature_columns=[feature_b,feature_c_bucketized,feature_a_x_feature_c]parsing_spec=tf.estimator.classifier_parse_example_spec(feature_columns,label_key='my-label',label_dtype=tf.string)# For the above example, classifier_parse_example_spec would return the dict:assertparsing_spec=={"feature_a":parsing_ops.VarLenFeature(tf.string),"feature_b":parsing_ops.FixedLenFeature([1],dtype=tf.float32),"feature_c":parsing_ops.FixedLenFeature([1],dtype=tf.float32)"my-label":parsing_ops.FixedLenFeature([1],dtype=tf.string)}
Example usage with a classifier:
feature_columns=# define features via tf.feature_columnestimator=DNNClassifier(n_classes=1000,feature_columns=feature_columns,weight_column='example-weight',label_vocabulary=['photos','keep',...],hidden_units=[256,64,16])# This label configuration tells the classifier the following:# * weights are retrieved with key 'example-weight'# * label is string and can be one of the following ['photos', 'keep', ...]# * integer id for label 'photos' is 0, 'keep' is 1, ...# Input buildersdefinput_fn_train():# Returns a tuple of features and labels.features=tf.contrib.learn.read_keyed_batch_features(file_pattern=train_files,batch_size=batch_size,# creates parsing configuration for tf.parse_examplefeatures=tf.estimator.classifier_parse_example_spec(feature_columns,label_key='my-label',label_dtype=tf.string,weight_column='example-weight'),reader=tf.RecordIOReader)labels=features.pop('my-label')returnfeatures,labelsestimator.train(input_fn=input_fn_train)
Args
feature_columns
An iterable containing all feature columns. All items
should be instances of classes derived from FeatureColumn.
label_key
A string identifying the label. It means tf.Example stores labels
with this key.
label_dtype
A tf.dtype identifies the type of labels. By default it is
tf.int64. If user defines a label_vocabulary, this should be set as
tf.string. tf.float32 labels are only supported for binary
classification.
label_default
used as label if label_key does not exist in given
tf.Example. An example usage: let's say label_key is 'clicked' and
tf.Example contains clicked data only for positive examples in following
format key:clicked, value:1. This means that if there is no data with
key 'clicked' it should count as negative example by setting
label_deafault=0. Type of this value should be compatible with
label_dtype.
weight_column
A string or a NumericColumn created by
tf.feature_column.numeric_column defining feature column representing
weights. It is used to down weight or boost examples during training. It
will be multiplied by the loss of the example. If it is a string, it is
used as a key to fetch weight tensor from the features. If it is a
NumericColumn, raw tensor is fetched by key weight_column.key, then
weight_column.normalizer_fn is applied on it to get weight tensor.
Returns
A dict mapping each feature key to a FixedLenFeature or VarLenFeature
value.
Raises
ValueError
If label is used in feature_columns.
ValueError
If weight_column is used in feature_columns.
ValueError
If any of the given feature_columns is not a _FeatureColumn
instance.
[null,null,["Last updated 2023-03-27 UTC."],[],[],null,["# tf.estimator.classifier_parse_example_spec\n\n\u003cbr /\u003e\n\n|-----------------------------------------------------------------------------------------------------------------------------------------------------|\n| [View source on GitHub](https://github.com/tensorflow/estimator/tree/master/tensorflow_estimator/python/estimator/canned/parsing_utils.py#L27-L144) |\n\nGenerates parsing spec for tf.parse_example to be used with classifiers. (deprecated) \n\n tf.estimator.classifier_parse_example_spec(\n feature_columns,\n label_key,\n label_dtype=../../tf/dtypes#int64,\n label_default=None,\n weight_column=None\n )\n\n| **Deprecated:** THIS FUNCTION IS DEPRECATED. It will be removed in a future version. Instructions for updating: Use tf.keras instead.\n\nIf users keep data in tf.Example format, they need to call tf.parse_example\nwith a proper feature spec. There are two main things that this utility helps:\n\n- Users need to combine parsing spec of features with labels and weights (if any) since they are all parsed from same tf.Example instance. This utility combines these specs.\n- It is difficult to map expected label by a classifier such as `DNNClassifier` to corresponding tf.parse_example spec. This utility encodes it by getting related information from users (key, dtype).\n\nExample output of parsing spec: \n\n # Define features and transformations\n feature_b = tf.feature_column.numeric_column(...)\n feature_c_bucketized = tf.feature_column.bucketized_column(\n tf.feature_column.numeric_column(\"feature_c\"), ...)\n feature_a_x_feature_c = tf.feature_column.crossed_column(\n columns=[\"feature_a\", feature_c_bucketized], ...)\n\n feature_columns = [feature_b, feature_c_bucketized, feature_a_x_feature_c]\n parsing_spec = tf.estimator.classifier_parse_example_spec(\n feature_columns, label_key='my-label', label_dtype=tf.string)\n\n # For the above example, classifier_parse_example_spec would return the dict:\n assert parsing_spec == {\n \"feature_a\": parsing_ops.VarLenFeature(tf.string),\n \"feature_b\": parsing_ops.FixedLenFeature([1], dtype=tf.float32),\n \"feature_c\": parsing_ops.FixedLenFeature([1], dtype=tf.float32)\n \"my-label\" : parsing_ops.FixedLenFeature([1], dtype=tf.string)\n }\n\nExample usage with a classifier: \n\n feature_columns = # define features via tf.feature_column\n estimator = DNNClassifier(\n n_classes=1000,\n feature_columns=feature_columns,\n weight_column='example-weight',\n label_vocabulary=['photos', 'keep', ...],\n hidden_units=[256, 64, 16])\n # This label configuration tells the classifier the following:\n # * weights are retrieved with key 'example-weight'\n # * label is string and can be one of the following ['photos', 'keep', ...]\n # * integer id for label 'photos' is 0, 'keep' is 1, ...\n\n\n # Input builders\n def input_fn_train(): # Returns a tuple of features and labels.\n features = tf.contrib.learn.read_keyed_batch_features(\n file_pattern=train_files,\n batch_size=batch_size,\n # creates parsing configuration for tf.parse_example\n features=tf.estimator.classifier_parse_example_spec(\n feature_columns,\n label_key='my-label',\n label_dtype=tf.string,\n weight_column='example-weight'),\n reader=tf.RecordIOReader)\n labels = features.pop('my-label')\n return features, labels\n\n estimator.train(input_fn=input_fn_train)\n\n\u003cbr /\u003e\n\n\u003cbr /\u003e\n\n\u003cbr /\u003e\n\n| Args ---- ||\n|-------------------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|\n| `feature_columns` | An iterable containing all feature columns. All items should be instances of classes derived from `FeatureColumn`. |\n| `label_key` | A string identifying the label. It means tf.Example stores labels with this key. |\n| `label_dtype` | A `tf.dtype` identifies the type of labels. By default it is [`tf.int64`](../../tf#int64). If user defines a `label_vocabulary`, this should be set as [`tf.string`](../../tf#string). [`tf.float32`](../../tf#float32) labels are only supported for binary classification. |\n| `label_default` | used as label if label_key does not exist in given tf.Example. An example usage: let's say `label_key` is 'clicked' and tf.Example contains clicked data only for positive examples in following format `key:clicked, value:1`. This means that if there is no data with key 'clicked' it should count as negative example by setting `label_deafault=0`. Type of this value should be compatible with `label_dtype`. |\n| `weight_column` | A string or a `NumericColumn` created by [`tf.feature_column.numeric_column`](../../tf/feature_column/numeric_column) defining feature column representing weights. It is used to down weight or boost examples during training. It will be multiplied by the loss of the example. If it is a string, it is used as a key to fetch weight tensor from the `features`. If it is a `NumericColumn`, raw tensor is fetched by key `weight_column.key`, then weight_column.normalizer_fn is applied on it to get weight tensor. |\n\n\u003cbr /\u003e\n\n\u003cbr /\u003e\n\n\u003cbr /\u003e\n\n\u003cbr /\u003e\n\n| Returns ------- ||\n|---|---|\n| A dict mapping each feature key to a `FixedLenFeature` or `VarLenFeature` value. ||\n\n\u003cbr /\u003e\n\n\u003cbr /\u003e\n\n\u003cbr /\u003e\n\n\u003cbr /\u003e\n\n| Raises ------ ||\n|--------------|---------------------------------------------------------------------------|\n| `ValueError` | If label is used in `feature_columns`. |\n| `ValueError` | If weight_column is used in `feature_columns`. |\n| `ValueError` | If any of the given `feature_columns` is not a `_FeatureColumn` instance. |\n| `ValueError` | If `weight_column` is not a `NumericColumn` instance. |\n| `ValueError` | if label_key is None. |\n\n\u003cbr /\u003e"]]