tfdv.CombinerStatsGenerator

Generates statistics using a combiner function.

This object mirrors a beam.CombineFn except for the add_input interface, which subclasses are expected to define.
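The four-method shape described above can be sketched in plain Python. This is only an illustration under stated assumptions: `RowCountSketch` is a hypothetical class that does not subclass tfdv.CombinerStatsGenerator, each batch is a plain list of examples rather than an Arrow RecordBatch, and the output is a dict rather than a proto.

```python
class RowCountSketch:
    """Hypothetical stand-in that mirrors the four combiner methods."""

    def __init__(self, name, schema=None):
        self.name = name        # unique name of the generator
        self.schema = schema    # optional schema, unused in this sketch

    def create_accumulator(self):
        # A fresh, empty accumulator: no examples seen yet.
        return 0

    def add_input(self, accumulator, batch):
        # Fold one batch (here: a list of examples) into the accumulator.
        return accumulator + len(batch)

    def merge_accumulators(self, accumulators):
        # Combine partial results computed on different partitions.
        return sum(accumulators)

    def extract_output(self, accumulator):
        # Convert the final accumulator into the output value
        # (a dict here; the real class returns a proto).
        return {"generator": self.name, "num_examples": accumulator}


gen = RowCountSketch("row_count")

# Two partitions, each a sequence of batches.
partitions = [
    [[{"x": [1]}, {"x": [2]}], [{"x": [3]}]],
    [[{"x": [4]}]],
]

partials = []
for partition in partitions:
    acc = gen.create_accumulator()
    for batch in partition:
        acc = gen.add_input(acc, batch)
    partials.append(acc)

result = gen.extract_output(gen.merge_accumulators(partials))
```

Each partition builds its own accumulator, and only the merged accumulator is converted to an output.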

Args
name A unique name associated with the statistics generator.
schema An optional schema for the dataset.

Attributes
name
schema

Methods

add_input

Returns the result of folding a batch of inputs into the accumulator.

Args
accumulator The current accumulator.
input_record_batch An Arrow RecordBatch whose columns are features and rows are examples. The columns are of type List or Null. If a feature's value is None across all the examples in the batch, its corresponding column is of Null type.

Returns
The accumulator after updating the statistics for the batch of inputs.
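To illustrate how a batch with List and Null columns might be folded into an accumulator, here is a minimal sketch in plain Python. It makes several assumptions: the batch is a dict mapping feature names to columns (standing in for an Arrow RecordBatch), each cell is either a list of values or None, an all-None column plays the role of a Null-type column, and the accumulator layout (present and missing counts per feature) is invented for this example.

```python
def add_input(accumulator, batch):
    """Folds one batch into the accumulator and returns the accumulator."""
    for feature, column in batch.items():
        present, missing = accumulator.get(feature, (0, 0))
        for cell in column:
            if cell is None:
                missing += 1   # example has no value for this feature
            else:
                present += 1   # example has a (possibly empty) value list
        accumulator[feature] = (present, missing)
    return accumulator


# "tags" is None for every example in batch_1, which corresponds to a
# Null-type column in the real RecordBatch.
batch_1 = {"age": [[31], [45], None], "tags": [None, None, None]}
batch_2 = {"age": [[27]], "tags": [["a", "b"]]}

acc = {}
acc = add_input(acc, batch_1)
acc = add_input(acc, batch_2)
```

After both batches, `acc` holds a (present, missing) pair for each feature seen so far.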

create_accumulator

Returns a fresh, empty accumulator.

Returns
An empty accumulator.

extract_output

Returns the result of converting the accumulator into the output value.

Args
accumulator The final accumulator value.

Returns
A proto representing the result of this stats generator.

merge_accumulators

Merges several accumulators into a single accumulator value.

Args
accumulators The accumulators to merge.

Returns
The merged accumulator.
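A merge step of this kind can be sketched with `collections.Counter`. The accumulator layout (a per-feature count) is an assumption made for illustration, not the layout used by any actual generator. Because a combiner framework such as Beam may merge partial accumulators in any grouping, the sketch also checks that regrouping does not change the result.

```python
from collections import Counter


def merge_accumulators(accumulators):
    """Merges several per-feature count accumulators into one."""
    merged = Counter()
    for acc in accumulators:
        merged.update(acc)  # Counter.update adds counts key by key
    return merged


# Partial accumulators, as if computed on two different workers.
worker_a = Counter({"age": 3, "tags": 1})
worker_b = Counter({"age": 2})

merged = merge_accumulators([worker_a, worker_b])

# Merging in a different grouping and order gives the same result.
regrouped = merge_accumulators([merge_accumulators([worker_b]), worker_a])
```

The function builds a new `Counter` rather than mutating one of its inputs, so the partial accumulators remain usable after the merge.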