Returns a column which is the input column scaled to have range [0, 1], with the scaling computed separately for each key.

x: A numeric Tensor or SparseTensor.
key: A Tensor or SparseTensor of type string.
elementwise: If true, scale each element of the tensor independently.
key_vocabulary_filename: (Optional) The file name for the per-key file. If None, this combiner will assume the keys fit in memory and will not store the analyzer result in a file. If '', a file name will be chosen based on the current TensorFlow scope. If not '', it should be unique within a given preprocessing function.
name: (Optional) A name for this operation.
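
To make the elementwise argument more concrete, here is a minimal NumPy sketch of min-max scaling on a dense batch. It is not the library's implementation: the helper name scale_to_0_1_sketch is hypothetical, the key argument is left out for clarity, and the sketch assumes that "independently" means the minimum and maximum are taken per position across the batch rather than over all values.

```python
import numpy as np

def scale_to_0_1_sketch(batch, elementwise=False):
    """Min-max scale a dense 2-D batch (rows are instances) to [0, 1].

    Hypothetical helper for illustration only. Assumes max > min for
    every range it computes, so the degenerate case is not handled.
    """
    batch = np.asarray(batch, dtype=np.float64)
    if elementwise:
        # One (min, max) pair per position, reduced over the batch axis.
        lo = batch.min(axis=0)
        hi = batch.max(axis=0)
    else:
        # A single (min, max) pair shared by every value in the batch.
        lo = batch.min()
        hi = batch.max()
    return (batch - lo) / (hi - lo)

batch = [[0.0, 10.0], [2.0, 30.0], [4.0, 20.0]]
global_scaled = scale_to_0_1_sketch(batch)
per_position = scale_to_0_1_sketch(batch, elementwise=True)
```

With elementwise=False only the overall minimum and maximum map to 0 and 1, while with elementwise=True each position spans the full [0, 1] range on its own.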


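import tempfile
import tensorflow as tf
import tensorflow_transform as tft
import tensorflow_transform.beam as tft_beam

def preprocessing_fn(inputs):
  # Scale 'x' to [0, 1] separately for each value of the key column 's'.
  return {
     'scaled': tft.scale_to_0_1_per_key(inputs['x'], inputs['s'])
  }

raw_data = [dict(x=1, s='a'), dict(x=0, s='b'), dict(x=3, s='a')]
feature_spec = dict(
    x=tf.io.FixedLenFeature([], tf.float32),
    s=tf.io.FixedLenFeature([], tf.string))
raw_data_metadata = tft.tf_metadata.dataset_metadata.DatasetMetadata(
    tft.tf_metadata.schema_utils.schema_from_feature_spec(feature_spec))
with tft_beam.Context(temp_dir=tempfile.mkdtemp()):
  transformed_dataset, transform_fn = (
      (raw_data, raw_data_metadata)
      | tft_beam.AnalyzeAndTransformDataset(preprocessing_fn))
transformed_data, transformed_metadata = transformed_dataset
# Expected value of transformed_data:
# [{'scaled': 0.0}, {'scaled': 0.5}, {'scaled': 1.0}]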

A Tensor or SparseTensor containing the input column scaled to [0, 1], per key. If the analysis dataset is empty, contains a single distinct value, or the computed key vocabulary has no entry for key, then x is scaled using a sigmoid function.
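
The per-key behaviour and the sigmoid fallback described above can be sketched in plain Python. This is an illustration under stated assumptions, not the library's code: the function names analyze_min_max_per_key and transform_per_key are hypothetical, inputs are plain lists rather than tensors, and the sigmoid is assumed to be applied to the raw value of x.

```python
import math

def analyze_min_max_per_key(values, keys):
    """Analysis pass: collect a (min, max) pair for every key seen."""
    bounds = {}
    for v, k in zip(values, keys):
        lo, hi = bounds.get(k, (v, v))
        bounds[k] = (min(lo, v), max(hi, v))
    return bounds

def transform_per_key(values, keys, bounds):
    """Transform pass: min-max scale each value with its key's bounds."""
    scaled = []
    for v, k in zip(values, keys):
        # A key missing from the analysis result gets a degenerate range.
        lo, hi = bounds.get(k, (0.0, 0.0))
        if hi > lo:
            scaled.append((v - lo) / (hi - lo))
        else:
            # Single distinct value or unknown key: fall back to a sigmoid
            # (assumed here to be applied to the raw value).
            scaled.append(1.0 / (1.0 + math.exp(-v)))
    return scaled

values = [1.0, 0.0, 3.0]
keys = ['a', 'b', 'a']
bounds = analyze_min_max_per_key(values, keys)
scaled = transform_per_key(values, keys, bounds)
unseen = transform_per_key([0.0], ['z'], bounds)
```

Here key 'a' spans 1 to 3, so its values map to 0.0 and 1.0. Key 'b' has a single value and the key 'z' was never analysed, so both take the sigmoid branch, which yields 0.5 for an input of 0.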