tfdv.TransformStatsGenerator

A StatsGenerator which wraps an arbitrary Beam PTransform.

This class computes statistics using a user-provided Beam PTransform. The PTransform must accept a Beam PCollection where each element is a tuple containing a slice key and an Arrow RecordBatch representing a batch of examples. It must return a PCollection where each element is a tuple containing a slice key and a DatasetFeatureStatistics proto representing the statistics of a slice.

Args

name A unique name associated with the statistics generator.
ptransform The Beam PTransform used to compute the statistics.
schema An optional schema for the dataset.
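
The contract described above can be illustrated with a minimal sketch. It assumes `apache_beam`, `tensorflow_data_validation`, and `tensorflow_metadata` are installed. The names `count_rows`, `make_generator`, `CountExamples`, and the generator name "ExampleCountPerSlice" are hypothetical and chosen only for illustration. The statistic computed (a per-slice example count) is a toy, and how its output combines with TFDV's built-in generators is not verified here.

```python
def count_rows(element):
    """Maps a (slice_key, RecordBatch) tuple to (slice_key, number of rows)."""
    slice_key, record_batch = element
    return slice_key, record_batch.num_rows


def make_generator(name="ExampleCountPerSlice"):
    """Builds a TransformStatsGenerator around a toy PTransform.

    The imports are local so that count_rows can be read and exercised
    without Beam or TFDV installed.
    """
    import apache_beam as beam
    import tensorflow_data_validation as tfdv
    from tensorflow_metadata.proto.v0 import statistics_pb2

    def to_stats(element):
        # Wrap the per-slice total in a DatasetFeatureStatistics proto.
        slice_key, num_examples = element
        stats = statistics_pb2.DatasetFeatureStatistics(num_examples=num_examples)
        return slice_key, stats

    class CountExamples(beam.PTransform):
        """Hypothetical PTransform: (slice_key, RecordBatch) in, (slice_key, proto) out."""

        def expand(self, pcoll):
            return (
                pcoll
                | "CountRows" >> beam.Map(count_rows)
                | "SumPerSlice" >> beam.CombinePerKey(sum)
                | "ToStatsProto" >> beam.Map(to_stats)
            )

    return tfdv.TransformStatsGenerator(name=name, ptransform=CountExamples())
```

A generator built this way would typically be handed to TFDV through the `generators` argument of `tfdv.StatsOptions`, for example `tfdv.StatsOptions(generators=[make_generator()])`.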

Attributes

name
ptransform
schema