tft.experimental.CacheablePTransformAnalyzer

A PTransformAnalyzer which enables analyzer cache.

tft.experimental.CacheablePTransformAnalyzer(
    make_accumulators_ptransform,
    merge_accumulators_ptransform,
    extract_output_ptransform,
    cache_coder
)

make_accumulators_ptransform: this is a beam.PTransform which maps data to a more compact mergeable representation (accumulator). Mergeable here means that it is possible to combine multiple representations produced from a partition of the dataset into a representation of the entire dataset.
merge_accumulators_ptransform: this is a beam.PTransform which operates on a collection of accumulators, i.e. the results of both the make_accumulators_ptransform and merge_accumulators_ptransform stages, and produces a single reduced accumulator. This operation must be associative and commutative in order to have reliably reproducible results.
extract_output: this is a beam.PTransform which operates on the result of the merge_accumulators_ptransform stage, and produces the outputs of the analyzer. These outputs must be consistent with the output_dtypes and output_shapes provided to ptransform_analyzer.

This container also holds a cache_coder (PTransformAnalyzerCacheCoder) which can encode outputs and decode the inputs of the merge_accumulators_ptransform stage. In many cases, SimpleJsonPTransformAnalyzerCacheCoder would be sufficient.

To ensure the correctness of this analyzer, the following must hold: merge(make({D1, ..., Dn})) == merge({make(D1), ..., make(Dn)})

Attributes
`make_accumulators_ptransform`	A `namedtuple` alias for field number 0
`merge_accumulators_ptransform`	A `namedtuple` alias for field number 1
`extract_output_ptransform`	A `namedtuple` alias for field number 2
`cache_coder`	A `namedtuple` alias for field number 3

tft.experimental.CacheablePTransformAnalyzer

Attributes