A wrapper around the output of tf.Transform.
tft.TFTransformOutput(
transform_output_dir: str
)
Args | |
---|---|
transform_output_dir
|
The directory containing tf.Transform output. |
Methods
load_transform_graph
load_transform_graph()
Load the transform graph without replacing any placeholders.
This is necessary to ensure that variables in the transform graph are included in the training checkpoint when using tf.Estimator. This should be called in the training input_fn.
num_buckets_for_transformed_feature
num_buckets_for_transformed_feature(
name: str
) -> int
Returns the number of buckets for an integerized transformed feature.
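A typical use is sizing lookup or embedding tables for several integerized features at once. The sketch below is a minimal illustration, not part of the library: `bucket_counts` is a hypothetical helper, and it assumes every name passed in refers to an integerized transformed feature.

```python
from typing import Dict, Iterable


def bucket_counts(tf_transform_output, feature_names: Iterable[str]) -> Dict[str, int]:
    """Collects the bucket count of each named integerized feature.

    `tf_transform_output` is expected to be a tft.TFTransformOutput; only its
    num_buckets_for_transformed_feature method is used here.
    """
    return {
        name: tf_transform_output.num_buckets_for_transformed_feature(name)
        for name in feature_names
    }
```

The returned dict can then be used, for example, as the input dimension of one embedding table per feature.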
raw_domains
raw_domains() -> Dict[str, common_types.DomainType]
Returns domains for the raw features.
Returns | |
---|---|
A dict from feature names to one of schema_pb2.IntDomain, schema_pb2.StringDomain or schema_pb2.FloatDomain. |
raw_feature_spec
raw_feature_spec() -> Dict[str, common_types.FeatureSpecType]
Returns a feature_spec for the raw features.
Returns | |
---|---|
A dict from feature names to FixedLenFeature/SparseFeature/VarLenFeature. |
transform_features_layer
transform_features_layer() -> tf_keras.Model
Creates a TransformFeaturesLayer from this transform output.
If a TransformFeaturesLayer has already been created for self, the same one will be returned.
Returns | |
---|---|
A TransformFeaturesLayer instance.
|
transform_raw_features
transform_raw_features(
raw_features: Mapping[str, common_types.TensorType],
drop_unused_features: bool = True
) -> Dict[str, common_types.TensorType]
Takes a dict of tensors representing raw features and transforms them.
Takes a dictionary of Tensors, SparseTensors, or RaggedTensors that represent the raw features, and applies the transformation defined by tf.Transform.
If drop_unused_features is False, all transformed features defined by tf.Transform are returned. To return only the features transformed from the given raw_features, set drop_unused_features to True.
Args | |
---|---|
raw_features
|
A dict whose keys are feature names and values are
Tensors, SparseTensors, or RaggedTensors.
|
drop_unused_features
|
If True, the result will be filtered: only the
features that are transformed from raw_features will be included in
the returned result. If a feature is transformed from multiple raw
features (e.g., a feature cross), it will only be included if all its base
raw features are present in raw_features.
|
Returns | |
---|---|
A dict whose keys are feature names and values are Tensors,
SparseTensors, or RaggedTensors representing transformed features.
|
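The filtering rule for drop_unused_features can be modeled in plain Python. This is a sketch of the documented behavior only, not the library's implementation: `filter_transformed_features` and the `dependencies` mapping (transformed feature name to the set of raw features it is derived from) are hypothetical names introduced for illustration.

```python
from typing import Dict, Iterable, List, Set


def filter_transformed_features(
    dependencies: Dict[str, Set[str]],
    provided_raw: Iterable[str],
    drop_unused_features: bool = True,
) -> List[str]:
    """Returns the names of the transformed features that would be kept.

    `dependencies` maps each transformed feature to the raw features it is
    computed from (an assumed, illustrative representation).
    """
    if not drop_unused_features:
        # No filtering: every transformed feature is returned.
        return sorted(dependencies)
    provided = set(provided_raw)
    # Keep a feature only if all of its base raw features were supplied.
    return sorted(
        name for name, base in dependencies.items() if base <= provided
    )


deps = {
    "age_scaled": {"age"},
    "city_id": {"city"},
    "age_x_city": {"age", "city"},  # feature cross of two raw features
}
only_age = filter_transformed_features(deps, ["age"])
```

With only `age` supplied, the cross `age_x_city` is dropped because `city` is missing; supplying both raw features keeps all three.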
transformed_domains
transformed_domains() -> Dict[str, common_types.DomainType]
Returns domains for the transformed features.
Returns | |
---|---|
A dict from feature names to one of schema_pb2.IntDomain, schema_pb2.StringDomain or schema_pb2.FloatDomain. |
transformed_feature_spec
transformed_feature_spec() -> Dict[str, common_types.FeatureSpecType]
Returns a feature_spec for the transformed features.
Returns | |
---|---|
A dict from feature names to FixedLenFeature/SparseFeature/VarLenFeature. |
vocabulary_by_name
vocabulary_by_name(
vocab_filename: str
) -> List[bytes]
Like vocabulary_file_by_name, but returns the vocabulary contents as a list instead of a file path.
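Because the entries are bytes, callers often decode them before use. The sketch below builds a token-to-index mapping; `build_token_index` is a hypothetical helper, and it assumes the tokens are UTF-8 encoded and that an entry's position in the list corresponds to its integer id.

```python
from typing import Dict


def build_token_index(tf_transform_output, vocab_filename: str) -> Dict[str, int]:
    """Maps each decoded vocabulary token to its position in the list.

    `tf_transform_output` is expected to be a tft.TFTransformOutput; only its
    vocabulary_by_name method is used here.
    """
    tokens = tf_transform_output.vocabulary_by_name(vocab_filename)
    # Entries are bytes, so decode them (UTF-8 is an assumption).
    return {token.decode("utf-8"): index for index, token in enumerate(tokens)}
```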
vocabulary_file_by_name
vocabulary_file_by_name(
vocab_filename: str
) -> Optional[str]
Returns the vocabulary file path created in the preprocessing function.
vocab_filename must be either (i) the name used as the vocab_filename argument to tft.compute_and_apply_vocabulary / tft.vocabulary or (ii) the key used in tft.annotate_asset.
When a mapping has been specified by calls to tft.annotate_asset, it is checked first for the provided filename. If present, this filename is used directly to construct a path.
If the mapping does not exist or vocab_filename is not present within it, the method falls back to sanitizing vocab_filename and searching for files matching it within the assets directory.
In either case, if the constructed path does not point to an existing file within the assets subdirectory, None is returned.
Args | |
---|---|
vocab_filename
|
The vocabulary name to look up. |
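Since the return type is Optional[str], callers need to handle a missing file explicitly. A minimal sketch of one way to do that follows; `require_vocabulary_file` is a hypothetical helper, and raising FileNotFoundError is a design choice made here, not library behavior.

```python
def require_vocabulary_file(tf_transform_output, vocab_filename: str) -> str:
    """Returns the vocabulary file path, or raises if none was found.

    `tf_transform_output` is expected to be a tft.TFTransformOutput; only its
    vocabulary_file_by_name method is used here.
    """
    path = tf_transform_output.vocabulary_file_by_name(vocab_filename)
    if path is None:
        # The lookup found no existing file for this name.
        raise FileNotFoundError(
            f"No vocabulary file found for {vocab_filename!r}"
        )
    return path
```

Failing early like this surfaces a misspelled vocabulary name at setup time instead of later, when the path is first opened.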
vocabulary_size_by_name
vocabulary_size_by_name(
vocab_filename: str
) -> int
Like vocabulary_file_by_name, but returns the size of the vocabulary.