tft.TFTransformOutput

A wrapper around the output of the tf.Transform.

tft.TFTransformOutput(
    transform_output_dir: str
)

Used in the notebooks

Used in the tutorials
Preprocessing data with TensorFlow Transform
Preprocess data with TensorFlow Transform
TFX Estimator Component Tutorial
TFX Keras Component Tutorial
Graph-based Neural Structured Learning in TFX

Args
`transform_output_dir`	The directory containing tf.Transform output.

Attributes
`post_transform_statistics_path`	Returns the path to the post-transform datum statistics. Note: post-transform statistics are not guaranteed to exist in the tf.Transform output, so accessing this attribute may fail if they are absent.
`pre_transform_statistics_path`	Returns the path to the pre-transform datum statistics. Note: pre-transform statistics are not guaranteed to exist in the tf.Transform output, so accessing this attribute may fail if they are absent.
`raw_metadata`	A DatasetMetadata. Note: raw_metadata is not guaranteed to exist in the tf.Transform output, so accessing this attribute may fail if it is absent.
`transform_savedmodel_dir`	A Python str.
`transformed_metadata`	A DatasetMetadata.

Methods

`load_transform_graph`

load_transform_graph()

Load the transform graph without replacing any placeholders.

This is necessary to ensure that variables in the transform graph are included in the training checkpoint when using tf.Estimator. This should be called in the training input_fn.

`num_buckets_for_transformed_feature`

num_buckets_for_transformed_feature(
    name: str
) -> int

Returns the number of buckets for an integerized transformed feature.
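
To make "integerized" concrete, the sketch below assumes the bucket count follows from the feature's integer domain: a domain spanning 0 to max yields max + 1 buckets. `IntDomainStub` and `num_buckets` are hypothetical stand-ins written for this illustration, not part of tf.Transform, and the real method may apply different or additional checks.

```python
from dataclasses import dataclass


@dataclass
class IntDomainStub:
    """Hypothetical stand-in for schema_pb2.IntDomain (min/max fields only)."""
    min: int
    max: int


def num_buckets(domains, name):
    """Bucket count for an integerized feature, assuming a [0, max] domain."""
    if name not in domains:
        raise ValueError(f"no domain recorded for feature {name!r}")
    domain = domains[name]
    if not isinstance(domain, IntDomainStub):
        raise ValueError(f"feature {name!r} is not integerized")
    if domain.min != 0:
        raise ValueError(f"domain of feature {name!r} does not start at 0")
    # Values 0..max inclusive give max + 1 distinct buckets.
    return domain.max + 1


domains = {"age_bucket": IntDomainStub(min=0, max=9)}
print(num_buckets(domains, "age_bucket"))  # 10
```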

`raw_domains`

raw_domains() -> Dict[str, common_types.DomainType]

Returns domains for the raw features.

Returns
A dict from feature names to one of schema_pb2.IntDomain, schema_pb2.StringDomain or schema_pb2.FloatDomain.

`raw_feature_spec`

raw_feature_spec() -> Dict[str, common_types.FeatureSpecType]

Returns a feature_spec for the raw features.

Returns
A dict from feature names to FixedLenFeature/SparseFeature/VarLenFeature.

`transform_features_layer`

transform_features_layer() -> tf_keras.Model

Creates a TransformFeaturesLayer from this transform output.

If a TransformFeaturesLayer has already been created for self, the same one will be returned.

Returns
A `TransformFeaturesLayer` instance.

`transform_raw_features`

transform_raw_features(
    raw_features: Mapping[str, common_types.TensorType],
    drop_unused_features: bool = True
) -> Dict[str, common_types.TensorType]

Takes a dict of tensors representing raw features and transforms them.

Takes a dictionary of Tensor, SparseTensor, or RaggedTensors that represent the raw features, and applies the transformation defined by tf.Transform.

If drop_unused_features is False, all transformed features defined by tf.Transform are returned. To return only the features transformed from the given `raw_features`, set drop_unused_features to True.

Args
`raw_features`	A dict whose keys are feature names and values are `Tensor`s, `SparseTensor`s, or `RaggedTensor`s.
`drop_unused_features`	If True, the result will be filtered: only the features that are transformed from `raw_features` will be included in the returned result. If a feature is transformed from multiple raw features (e.g., a feature cross), it will only be included if all its base raw features are present in `raw_features`.

Returns
A dict whose keys are feature names and values are `Tensor`s, `SparseTensor`s, or `RaggedTensor`s representing transformed features.
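
The filtering rule for `drop_unused_features` can be illustrated in plain Python. This is only a sketch of the rule as described above, not the library's implementation: `filter_unused` and the `sources` mapping (transformed feature name to the raw features it depends on) are hypothetical, and plain numbers stand in for tensors.

```python
def filter_unused(transformed, sources, raw_feature_names):
    """Keep only transformed features whose raw inputs are all present.

    `sources` maps each transformed feature name to the set of raw feature
    names it was computed from (an assumed bookkeeping structure).
    """
    available = set(raw_feature_names)
    return {
        name: value
        for name, value in transformed.items()
        if sources[name] <= available  # every base raw feature was supplied
    }


transformed = {"x_scaled": 0.5, "y_id": 3, "x_cross_y": 7}
sources = {"x_scaled": {"x"}, "y_id": {"y"}, "x_cross_y": {"x", "y"}}

# Only "x" supplied: the cross needs both "x" and "y", so it is dropped.
print(sorted(filter_unused(transformed, sources, ["x"])))
# Both supplied: everything is kept.
print(sorted(filter_unused(transformed, sources, ["x", "y"])))
```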

`transformed_domains`

transformed_domains() -> Dict[str, common_types.DomainType]

Returns domains for the transformed features.

Returns
A dict from feature names to one of schema_pb2.IntDomain, schema_pb2.StringDomain or schema_pb2.FloatDomain.

`transformed_feature_spec`

transformed_feature_spec() -> Dict[str, common_types.FeatureSpecType]

Returns a feature_spec for the transformed features.

Returns
A dict from feature names to FixedLenFeature/SparseFeature/VarLenFeature.

`vocabulary_by_name`

vocabulary_by_name(
    vocab_filename: str
) -> List[bytes]

Like vocabulary_file_by_name, but returns the vocabulary contents as a list.

`vocabulary_file_by_name`

vocabulary_file_by_name(
    vocab_filename: str
) -> Optional[str]

Returns the vocabulary file path created in the preprocessing function.

vocab_filename must either be (i) the name used as the vocab_filename argument to tft.compute_and_apply_vocabulary / tft.vocabulary or (ii) the key used in tft.annotate_asset.

When a mapping has been specified by calls to tft.annotate_asset, it will be checked first for the provided filename. If present, this filename will be used directly to construct a path.

If the mapping does not exist or vocab_filename is not present within it, the method falls back to sanitizing vocab_filename and searching for files matching it within the assets directory.

In either case, if the constructed path does not point to an existing file within the assets subdirectory, None is returned.

Args
`vocab_filename`	The vocabulary name to look up.
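
The lookup order described above (asset mapping first, then a sanitized filename search, with None when no file exists) might look roughly like the following. This is a standalone sketch, not tf.Transform code: `find_vocabulary_file`, the `asset_map` dict, and the sanitization rule are all assumptions made for illustration.

```python
import glob
import os
import re
import tempfile


def find_vocabulary_file(assets_dir, vocab_filename, asset_map=None):
    """Resolve a vocabulary name to a file path, or None if no file exists."""
    asset_map = asset_map or {}
    if vocab_filename in asset_map:
        # A mapped name is used directly to construct the path.
        candidate = os.path.join(assets_dir, asset_map[vocab_filename])
        return candidate if os.path.isfile(candidate) else None
    # Assumed sanitization rule: replace characters outside [\w.-] with "_".
    sanitized = re.sub(r"[^\w.\-]", "_", vocab_filename)
    pattern = glob.escape(os.path.join(assets_dir, sanitized)) + "*"
    matches = sorted(m for m in glob.glob(pattern) if os.path.isfile(m))
    return matches[0] if matches else None


with tempfile.TemporaryDirectory() as assets_dir:
    open(os.path.join(assets_dir, "country_vocab"), "w").close()
    found = find_vocabulary_file(assets_dir, "country_vocab")
    mapped = find_vocabulary_file(
        assets_dir, "countries", {"countries": "country_vocab"})
    missing = find_vocabulary_file(assets_dir, "city_vocab")
    found_name = os.path.basename(found)
    mapped_name = os.path.basename(mapped)

print(found_name, mapped_name, missing)  # country_vocab country_vocab None
```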

`vocabulary_size_by_name`

vocabulary_size_by_name(
    vocab_filename: str
) -> int

Like vocabulary_file_by_name, but returns the size of the vocabulary.
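
As a rough illustration of how the list and size variants relate to the vocabulary file, the sketch below reads a vocabulary stored as plain text with one token per line. That storage format is an assumption here (other formats are not handled), and `read_text_vocabulary` is a hypothetical helper, not a tf.Transform function.

```python
import os
import tempfile


def read_text_vocabulary(path):
    """Return the vocabulary tokens as bytes, one per line of a text file."""
    with open(path, "rb") as f:
        return [line.rstrip(b"\n") for line in f]


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "color_vocab")
    with open(path, "wb") as f:
        f.write(b"red\ngreen\nblue\n")
    tokens = read_text_vocabulary(path)

# The size is simply the number of tokens in the file.
print(tokens, len(tokens))  # [b'red', b'green', b'blue'] 3
```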

Class Variables
ASSET_MAP	`'asset_map'`
POST_TRANSFORM_FEATURE_STATS_PATH	`'post_transform_feature_stats/FeatureStats.pb'`
PRE_TRANSFORM_FEATURE_STATS_PATH	`'pre_transform_feature_stats/FeatureStats.pb'`
RAW_METADATA_DIR	`'metadata'`
TRANSFORMED_METADATA_DIR	`'transformed_metadata'`
TRANSFORM_FN_DIR	`'transform_fn'`
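
The constants above suggest the layout of a transform output directory. The sketch below assumes each value is a path relative to `transform_output_dir`; `expected_layout` is a hypothetical helper, and `ASSET_MAP` is left out because the table does not say which directory it is resolved against.

```python
import posixpath

# Values copied from the Class Variables table.
RAW_METADATA_DIR = "metadata"
TRANSFORMED_METADATA_DIR = "transformed_metadata"
TRANSFORM_FN_DIR = "transform_fn"
PRE_TRANSFORM_FEATURE_STATS_PATH = "pre_transform_feature_stats/FeatureStats.pb"
POST_TRANSFORM_FEATURE_STATS_PATH = "post_transform_feature_stats/FeatureStats.pb"


def expected_layout(transform_output_dir):
    """Map each artifact to its assumed location under the output directory."""
    join = posixpath.join
    return {
        "raw_metadata": join(transform_output_dir, RAW_METADATA_DIR),
        "transformed_metadata": join(transform_output_dir, TRANSFORMED_METADATA_DIR),
        "transform_fn": join(transform_output_dir, TRANSFORM_FN_DIR),
        "pre_transform_stats": join(
            transform_output_dir, PRE_TRANSFORM_FEATURE_STATS_PATH),
        "post_transform_stats": join(
            transform_output_dir, POST_TRANSFORM_FEATURE_STATS_PATH),
    }


layout = expected_layout("/tmp/tft_output")
for name, path in layout.items():
    print(f"{name}: {path}")
```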
