tft.TFTransformOutput

A wrapper around the output of tf.Transform.

Args
transform_output_dir The directory containing tf.Transform output.

Attributes
post_transform_statistics_path Returns the path to the post-transform datum statistics.
pre_transform_statistics_path Returns the path to the pre-transform datum statistics.
raw_metadata A DatasetMetadata.
transform_savedmodel_dir A Python str.
transformed_metadata A DatasetMetadata.

Methods

load_transform_graph

Load the transform graph without replacing any placeholders.

This is necessary to ensure that variables in the transform graph are included in the training checkpoint when using tf.Estimator. This should be called in the training input_fn.

num_buckets_for_transformed_feature

Returns the number of buckets for an integerized transformed feature.

raw_domains

Returns domains for the raw features.

Returns
A dict from feature names to schema_pb2.Domain.

raw_feature_spec

Returns a feature_spec for the raw features.

Returns
A dict from feature names to FixedLenFeature/SparseFeature/VarLenFeature.

transform_features_layer

Creates a TransformFeaturesLayer from this transform output.

If a TransformFeaturesLayer has already been created for self, the same one will be returned.

Returns
A TransformFeaturesLayer instance.
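
A minimal usage sketch, assuming tensorflow_transform is installed and that transform_output_dir points at a directory written by a tf.Transform pipeline. The helper name load_transform_layer is hypothetical and only wraps the calls documented on this page:

```python
def load_transform_layer(transform_output_dir):
    """Hypothetical helper: return (layer, raw_feature_spec) for a directory."""
    # Imported lazily so this module can be loaded without TensorFlow installed.
    import tensorflow_transform as tft

    tf_transform_output = tft.TFTransformOutput(transform_output_dir)
    # Per the description above, repeated calls on the same object
    # return the same TransformFeaturesLayer instance.
    layer = tf_transform_output.transform_features_layer()
    # The raw feature spec describes the inputs the layer expects.
    raw_feature_spec = tf_transform_output.raw_feature_spec()
    return layer, raw_feature_spec
```

The returned layer can then be applied to a dict of raw feature tensors, for example inside a serving function.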

transform_raw_features

Takes a dict of tensors representing raw features and transforms them.

Takes a dictionary of Tensors or SparseTensors that represent the raw features, and applies the transformation defined by tf.Transform.

By default, it returns all transformed features defined by tf.Transform. To return only the features transformed from the given raw_features, set drop_unused_features to True.

Args
raw_features A dict whose keys are feature names and values are Tensors or SparseTensors.
drop_unused_features If True, the result will be filtered. Only the features that are transformed from raw_features will be included in the returned result. If a feature is transformed from multiple raw features (e.g., a feature cross), it will only be included if all its base raw features are present in raw_features.

Returns
A dict whose keys are feature names and values are Tensors or SparseTensors representing transformed features.
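
The filtering rule for drop_unused_features can be illustrated in plain Python. This is only a sketch of the rule as described above, not the library's implementation: the dependency map, the feature names, and the helper filter_transformed_features are all hypothetical.

```python
def filter_transformed_features(transformed_to_raw, available_raw_features):
    """Keep transformed features whose raw inputs are all available.

    transformed_to_raw: dict mapping a transformed feature name to the set of
        raw feature names it is derived from (hypothetical bookkeeping).
    available_raw_features: iterable of raw feature names that were passed in.
    """
    available = set(available_raw_features)
    return [
        name
        for name, sources in transformed_to_raw.items()
        if set(sources) <= available  # every base raw feature must be present
    ]


# Hypothetical dependency map: "age_x_city" is a cross of two raw features.
dependencies = {
    "age_scaled": {"age"},
    "city_id": {"city"},
    "age_x_city": {"age", "city"},
}

# Only "age" is supplied, so the cross and "city_id" are dropped.
kept = filter_transformed_features(dependencies, ["age"])
print(kept)
```

Supplying both "age" and "city" would keep all three features, including the cross.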

transformed_domains

Returns domains for the transformed features.

Returns
A dict from feature names to schema_pb2.Domain.

transformed_feature_spec

Returns a feature_spec for the transformed features.

Returns
A dict from feature names to FixedLenFeature/SparseFeature/VarLenFeature.

vocabulary_by_name

Like vocabulary_file_by_name, but returns the vocabulary entries as a list.

vocabulary_file_by_name

Returns the vocabulary file path created in the preprocessing function.

vocab_filename must be the name used as the vocab_filename argument to tft.compute_and_apply_vocabulary or tft.vocabulary. By convention, this should be the name of the feature that the vocab was computed for, where possible.

Args
vocab_filename The relative filename to look up.

vocabulary_size_by_name

Like vocabulary_file_by_name, but returns the size of vocabulary.
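
The three vocabulary methods relate to one another as path, contents, and length. The sketch below mimics that relationship with the standard library only. It assumes a plain-text vocabulary with one entry per line, which may not match every storage format tf.Transform can write, and it does not call the library; the file name and entries are hypothetical.

```python
import os
import tempfile


def read_vocabulary(path):
    """Read a one-entry-per-line text vocabulary into a list of bytes."""
    with open(path, "rb") as f:
        return [line.rstrip(b"\r\n") for line in f]


# Write a small stand-in vocabulary file (hypothetical contents).
with tempfile.TemporaryDirectory() as tmp_dir:
    vocab_path = os.path.join(tmp_dir, "city_vocab")   # analogous to the file path
    with open(vocab_path, "wb") as f:
        f.write(b"london\nparis\ntokyo\n")
    vocabulary = read_vocabulary(vocab_path)           # analogous to the list
    vocabulary_size = len(vocabulary)                  # analogous to the size

print(vocabulary, vocabulary_size)
```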

POST_TRANSFORM_FEATURE_STATS_PATH 'post_transform_feature_stats/FeatureStats.pb'
PRE_TRANSFORM_FEATURE_STATS_PATH 'pre_transform_feature_stats/FeatureStats.pb'
RAW_METADATA_DIR 'metadata'
TRANSFORMED_METADATA_DIR 'transformed_metadata'
TRANSFORM_FN_DIR 'transform_fn'
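
The constants above are paths relative to transform_output_dir. A minimal sketch of the resulting layout follows, assuming each location is formed by joining the directory with the corresponding constant. The helper expected_layout and the example directory are hypothetical and not part of the library.

```python
import posixpath

# Values copied from the class constants listed above.
POST_TRANSFORM_FEATURE_STATS_PATH = "post_transform_feature_stats/FeatureStats.pb"
PRE_TRANSFORM_FEATURE_STATS_PATH = "pre_transform_feature_stats/FeatureStats.pb"
RAW_METADATA_DIR = "metadata"
TRANSFORMED_METADATA_DIR = "transformed_metadata"
TRANSFORM_FN_DIR = "transform_fn"


def expected_layout(transform_output_dir):
    """Hypothetical helper: map each artifact to its assumed location."""
    join = posixpath.join
    return {
        "post_transform_statistics_path": join(
            transform_output_dir, POST_TRANSFORM_FEATURE_STATS_PATH),
        "pre_transform_statistics_path": join(
            transform_output_dir, PRE_TRANSFORM_FEATURE_STATS_PATH),
        "raw_metadata_dir": join(transform_output_dir, RAW_METADATA_DIR),
        "transformed_metadata_dir": join(
            transform_output_dir, TRANSFORMED_METADATA_DIR),
        "transform_savedmodel_dir": join(transform_output_dir, TRANSFORM_FN_DIR),
    }


layout = expected_layout("/tmp/tft_output")  # hypothetical directory
for name, path in sorted(layout.items()):
    print(f"{name}: {path}")
```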