A StatsGenerator which wraps an arbitrary Beam PTransform.
tfdv.TransformStatsGenerator(
    name: Text,
    ptransform: beam.PTransform,
    schema: Optional[schema_pb2.Schema] = None
) -> None
This class computes statistics using a user-provided Beam PTransform. The
PTransform must accept a Beam PCollection where each element is a tuple
containing a slice key and an Arrow RecordBatch representing a batch of
examples. It must return a PCollection where each element is a tuple
containing a slice key and a DatasetFeatureStatistics proto representing the
statistics of a slice.
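The input and output contract described above can be illustrated with a minimal sketch, not a definitive implementation. It wraps a PTransform that reports only the number of examples in each slice. The names `batch_to_row_count`, `build_example_count_generator`, `CountExamplesPerSlice`, and the generator name `'ExampleCountPerSlice'` are hypothetical and chosen for illustration. The sketch assumes `apache_beam`, `tensorflow_data_validation`, and `tensorflow_metadata` are installed, and it imports them lazily so that the pure-Python helper can be exercised on its own.

```python
from typing import Any, Tuple


def batch_to_row_count(element: Tuple[Any, Any]) -> Tuple[Any, int]:
    """Maps (slice_key, RecordBatch) to (slice_key, number of rows in the batch)."""
    slice_key, record_batch = element
    # pyarrow.RecordBatch exposes its row count as `num_rows`.
    return slice_key, record_batch.num_rows


def build_example_count_generator():
    """Builds a TransformStatsGenerator that counts examples per slice (illustrative)."""
    # Imported lazily (an assumption of this sketch) so that the helper above
    # can be used and tested without Beam or TFDV installed.
    import apache_beam as beam
    import tensorflow_data_validation as tfdv
    from tensorflow_metadata.proto.v0 import statistics_pb2

    def to_stats_proto(element):
        # Wrap the per-slice total in a DatasetFeatureStatistics proto.
        slice_key, num_examples = element
        return slice_key, statistics_pb2.DatasetFeatureStatistics(
            num_examples=num_examples)

    class CountExamplesPerSlice(beam.PTransform):
        """Hypothetical PTransform following the documented contract."""

        def expand(self, pcoll):
            return (
                pcoll
                # (slice_key, RecordBatch) -> (slice_key, row count)
                | 'RowCounts' >> beam.Map(batch_to_row_count)
                # Sum the row counts of all batches that share a slice key.
                | 'SumPerSlice' >> beam.CombinePerKey(sum)
                # (slice_key, total) -> (slice_key, DatasetFeatureStatistics)
                | 'ToProto' >> beam.Map(to_stats_proto)
            )

    return tfdv.TransformStatsGenerator(
        name='ExampleCountPerSlice',
        ptransform=CountExamplesPerSlice())
```

A generator built this way would typically be handed to the statistics pipeline through the options object, for example via the `generators` argument of `tfdv.StatsOptions`. Check that argument against the installed TFDV version before relying on it.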
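Args
name | A unique name associated with the statistics generator.
ptransform | The user-provided Beam PTransform that computes the statistics, subject to the input and output contract described above.
schema | An optional schema for the dataset.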
Attributes
name
ptransform
schema