
Feature Engineering using TFX Pipeline and TensorFlow Transform

Transform input data and train a model with a TFX pipeline.

In this notebook-based tutorial, we will create and run a TFX pipeline to ingest raw input data and preprocess it appropriately for ML training. This notebook is based on the TFX pipeline we built in Data validation using TFX Pipeline and TensorFlow Data Validation Tutorial. If you have not read that one yet, you should read it before proceeding with this notebook.

You can increase the predictive quality of your data and/or reduce dimensionality with feature engineering. One of the benefits of using TFX is that you will write your transformation code once, and the resulting transforms will be consistent between training and serving in order to avoid training/serving skew.

We will add a Transform component to the pipeline. The Transform component is implemented using the TensorFlow Transform (tf.Transform) library.

Please see Understanding TFX Pipelines to learn more about various concepts in TFX.

Set Up

We first need to install the TFX Python package and download the dataset which we will use for our model.

Upgrade Pip

To avoid upgrading pip on a local system, we first check that we are running in Colab. Local systems can of course be upgraded separately.

try:
  import colab
  !pip install -q --upgrade pip
except:
  pass

Install TFX

# TODO(b/178712706): Stop using legacy resolver after PIP issue is resolved.
!pip install -q -U --use-deprecated=legacy-resolver tfx

Did you restart the runtime?

If you are using Google Colab, the first time that you run the cell above, you must restart the runtime by clicking the "RESTART RUNTIME" button above or using the "Runtime > Restart runtime ..." menu. This is because of the way that Colab loads packages.

Check the TensorFlow and TFX versions.

import tensorflow as tf
print('TensorFlow version: {}'.format(tf.__version__))
import tfx
print('TFX version: {}'.format(tfx.__version__))
TensorFlow version: 2.4.1
TFX version: 0.29.0

Set up variables

There are some variables used to define a pipeline. You can customize these variables as you want. By default all output from the pipeline will be generated under the current directory.

import os

PIPELINE_NAME = "penguin-transform"

# Output directory to store artifacts generated from the pipeline.
PIPELINE_ROOT = os.path.join('pipelines', PIPELINE_NAME)
# Path to a SQLite DB file to use as an MLMD storage.
METADATA_PATH = os.path.join('metadata', PIPELINE_NAME, 'metadata.db')
# Output directory where created models from the pipeline will be exported.
SERVING_MODEL_DIR = os.path.join('serving_model', PIPELINE_NAME)

from absl import logging
logging.set_verbosity(logging.INFO)  # Set default logging level.

Prepare example data

We will download the example dataset for use in our TFX pipeline. The dataset we are using is the Palmer Penguins dataset.

However, unlike previous tutorials which used an already preprocessed dataset, we will use the raw Palmer Penguins dataset.

Because the TFX ExampleGen component reads inputs from a directory, we need to create a directory and copy the dataset to it.

import urllib.request
import tempfile

DATA_ROOT = tempfile.mkdtemp(prefix='tfx-data')  # Create a temporary directory.
_data_path = 'https://storage.googleapis.com/download.tensorflow.org/data/palmer_penguins/penguins_size.csv'
_data_filepath = os.path.join(DATA_ROOT, "data.csv")
urllib.request.urlretrieve(_data_path, _data_filepath)
('/tmp/tfx-datay3dg5_38/data.csv', <http.client.HTTPMessage at 0x7f6ebf318c18>)

Take a quick look at what the raw data looks like.

!head {_data_filepath}
species,island,culmen_length_mm,culmen_depth_mm,flipper_length_mm,body_mass_g,sex
Adelie,Torgersen,39.1,18.7,181,3750,MALE
Adelie,Torgersen,39.5,17.4,186,3800,FEMALE
Adelie,Torgersen,40.3,18,195,3250,FEMALE
Adelie,Torgersen,NA,NA,NA,NA,NA
Adelie,Torgersen,36.7,19.3,193,3450,FEMALE
Adelie,Torgersen,39.3,20.6,190,3650,MALE
Adelie,Torgersen,38.9,17.8,181,3625,FEMALE
Adelie,Torgersen,39.2,19.6,195,4675,MALE
Adelie,Torgersen,34.1,18.1,193,3475,NA

There are some entries with missing values which are represented as NA. We will just delete those entries in this tutorial.

!sed -i '/\bNA\b/d' {_data_filepath}
!head {_data_filepath}
species,island,culmen_length_mm,culmen_depth_mm,flipper_length_mm,body_mass_g,sex
Adelie,Torgersen,39.1,18.7,181,3750,MALE
Adelie,Torgersen,39.5,17.4,186,3800,FEMALE
Adelie,Torgersen,40.3,18,195,3250,FEMALE
Adelie,Torgersen,36.7,19.3,193,3450,FEMALE
Adelie,Torgersen,39.3,20.6,190,3650,MALE
Adelie,Torgersen,38.9,17.8,181,3625,FEMALE
Adelie,Torgersen,39.2,19.6,195,4675,MALE
Adelie,Torgersen,41.1,17.6,182,3200,FEMALE
Adelie,Torgersen,38.6,21.2,191,3800,MALE
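
The sed command above requires a Unix-like shell. If it is not available in your environment, an equivalent cleanup can be done in pure Python; a minimal sketch (the file easily fits in memory):

# A minimal pure-Python alternative to the sed command above: drop every CSV
# row that contains an 'NA' field and rewrite the file in place.
with open(_data_filepath) as f:
  rows = f.readlines()
with open(_data_filepath, 'w') as f:
  f.writelines(r for r in rows if 'NA' not in r.rstrip('\n').split(','))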

You should be able to see seven features that describe penguins. We will use the same set of features as in the previous tutorials - 'culmen_length_mm', 'culmen_depth_mm', 'flipper_length_mm' and 'body_mass_g' - and will predict the 'species' of a penguin.

The only difference will be that the input data is not preprocessed. Note that we will not use other features like 'island' or 'sex' in this tutorial.
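
Optionally, you can take a quick look at the feature scales with pandas; the wide range of body_mass_g compared to the culmen measurements is one motivation for the z-score normalization we will apply with Transform. A small sketch (pandas is preinstalled in Colab):

import pandas as pd

# Load the cleaned CSV and summarize the numeric features we will use.
df = pd.read_csv(_data_filepath)
df[['culmen_length_mm', 'culmen_depth_mm',
    'flipper_length_mm', 'body_mass_g']].describe()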

Prepare a schema file

As described in Data validation using TFX Pipeline and TensorFlow Data Validation Tutorial, we need a schema file for the dataset. Because the dataset is different from the one in the previous tutorial, we would need to generate the schema again. In this tutorial, we will skip those steps and just use a prepared schema file.

import shutil

SCHEMA_PATH = 'schema'

_schema_uri = 'https://raw.githubusercontent.com/tensorflow/tfx/master/tfx/examples/penguin/schema/raw/schema.pbtxt'
_schema_filename = 'schema.pbtxt'
_schema_filepath = os.path.join(SCHEMA_PATH, _schema_filename)

os.makedirs(SCHEMA_PATH, exist_ok=True)
urllib.request.urlretrieve(_schema_uri, _schema_filepath)
('schema/schema.pbtxt', <http.client.HTTPMessage at 0x7f6ebf3316d8>)

This schema file was created with the same pipeline as in the previous tutorial without any manual changes.
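
The schema is stored as a text-format Schema protocol buffer (schema.pbtxt), so you can inspect it by simply printing the file:

# Print the downloaded schema file (a text-format schema_pb2.Schema proto).
print(open(_schema_filepath).read())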

Create a pipeline

TFX pipelines are defined using Python APIs. We will add a Transform component to the pipeline we created in the Data Validation tutorial.

A Transform component requires input data from an ExampleGen component and a schema from a SchemaGen component (in this pipeline, an imported schema), and produces a "transform graph". The output will be used in the Trainer component. Transform can optionally produce "transformed data" as well, which is the materialized data after transformation. In this tutorial, however, we will transform data on the fly during training without materializing the intermediate transformed data.

One thing to note is that we need to define a Python function, preprocessing_fn, to describe how the input data should be transformed. This is similar to the Trainer component, which also requires user code for the model definition.

Write preprocessing and training code

We need to define two Python functions: one for Transform and one for Trainer.

preprocessing_fn

The Transform component will look for a function named preprocessing_fn in the given module file, just as we did for the Trainer component. You can also specify a specific function using the preprocessing_fn parameter of the Transform component.
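
For example, if the preprocessing function lived in an importable Python module instead of a standalone module file, you could reference it by its fully-qualified path. A sketch with hypothetical names (my_module.my_preprocessing_fn) and the component handles defined in the pipeline below; exactly one of module_file or preprocessing_fn should be supplied:

# A sketch with hypothetical names: reference the preprocessing function by its
# fully-qualified path instead of providing a module_file.
transform = Transform(
    examples=example_gen.outputs['examples'],
    schema=schema_importer.outputs['result'],
    preprocessing_fn='my_module.my_preprocessing_fn')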

In this example, we will apply two kinds of transformation. For continuous numeric features like culmen_length_mm and body_mass_g, we will normalize these values using the tft.scale_to_z_score function, which scales each feature to zero mean and unit variance over the whole dataset. For the label feature, we need to convert string labels into numeric index values; we will use tf.lookup.StaticHashTable for the conversion.

To identify transformed fields easily, we append a _xf suffix to the transformed feature names.
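
As a quick standalone illustration of the label mapping (toy values, runnable outside the pipeline), this is what the lookup table in preprocessing_fn computes:

import tensorflow as tf

# Map the three species names to integer indices; unknown strings map to -1.
initializer = tf.lookup.KeyValueTensorInitializer(
    keys=tf.constant(['Adelie', 'Chinstrap', 'Gentoo']),
    values=tf.constant([0, 1, 2], dtype=tf.int64))
table = tf.lookup.StaticHashTable(initializer, default_value=-1)
print(table.lookup(tf.constant(['Gentoo', 'Adelie', 'Emperor'])).numpy())
# Expected output: [ 2  0 -1]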

run_fn

The model itself is almost the same as in the previous tutorials, but this time we will transform the input data using the transform graph from the Transform component.

One more important difference compared to the previous tutorial is that we now export a model for serving that includes not only the computation graph of the model, but also the transform graph for preprocessing, which is generated by the Transform component. We need to define a separate function that will be used to serve incoming requests. You can see that the same function, _apply_preprocessing, is used for both the training data and the serving requests.

_module_file = 'penguin_utils.py'
%%writefile {_module_file}


from typing import List, Text
from absl import logging
import tensorflow as tf
from tensorflow import keras
from tensorflow_metadata.proto.v0 import schema_pb2
import tensorflow_transform as tft
from tensorflow_transform.tf_metadata import schema_utils

from tfx.components.trainer.executor import TrainerFnArgs
from tfx.components.trainer.fn_args_utils import DataAccessor
from tfx_bsl.tfxio import dataset_options

# Specify features that we will use.
_FEATURE_KEYS = [
    'culmen_length_mm', 'culmen_depth_mm', 'flipper_length_mm', 'body_mass_g'
]
_LABEL_KEY = 'species'

_TRAIN_BATCH_SIZE = 20
_EVAL_BATCH_SIZE = 10


# NEW: Transformed features will have '_xf' suffix.
def _transformed_name(key):
  return key + '_xf'


# NEW: TFX Transform will call this function.
def preprocessing_fn(inputs):
  """tf.transform's callback function for preprocessing inputs.

  Args:
    inputs: map from feature keys to raw not-yet-transformed features.

  Returns:
    Map from string feature key to transformed feature.
  """
  outputs = {}

  # Uses features defined in _FEATURE_KEYS only.
  for key in _FEATURE_KEYS:
    # tft.scale_to_z_score computes the mean and variance of the given feature
    # and scales the output based on the result.
    outputs[_transformed_name(key)] = tft.scale_to_z_score(inputs[key])

  # For the label column we provide the mapping from string to index.
  # We could instead use `tft.compute_and_apply_vocabulary()` in order to
  # compute the vocabulary dynamically and perform a lookup.
  # Since in this example there are only 3 possible values, we use a hard-coded
  # table for simplicity.
  table_keys = ['Adelie', 'Chinstrap', 'Gentoo']
  initializer = tf.lookup.KeyValueTensorInitializer(
      keys=table_keys,
      values=tf.cast(tf.range(len(table_keys)), tf.int64),
      key_dtype=tf.string,
      value_dtype=tf.int64)
  table = tf.lookup.StaticHashTable(initializer, default_value=-1)
  outputs[_transformed_name(_LABEL_KEY)] = table.lookup(inputs[_LABEL_KEY])

  return outputs


# NEW: This function will apply the same transform operation to training data
#      and serving requests.
def _apply_preprocessing(raw_features, tft_layer):
  transformed_features = tft_layer(raw_features)
  if _LABEL_KEY in raw_features:
    transformed_label = transformed_features.pop(_transformed_name(_LABEL_KEY))
    return transformed_features, transformed_label
  else:
    return transformed_features, None


# NEW: This function creates a handler function which gets a serialized
#      tf.Example, preprocesses it and runs inference with it.
def _get_serve_tf_examples_fn(model, tf_transform_output):
  # We must save the tft_layer to the model to ensure its assets are kept and
  # tracked.
  model.tft_layer = tf_transform_output.transform_features_layer()

  @tf.function(input_signature=[
      tf.TensorSpec(shape=[None], dtype=tf.string, name='examples')
  ])
  def serve_tf_examples_fn(serialized_tf_examples):
    # Expected input is a string in serialized tf.Example format.
    feature_spec = tf_transform_output.raw_feature_spec()
    # Because input schema includes unnecessary fields like 'species' and
    # 'island', we filter feature_spec to include required keys only.
    required_feature_spec = {
        k: v for k, v in feature_spec.items() if k in _FEATURE_KEYS
    }
    parsed_features = tf.io.parse_example(serialized_tf_examples,
                                          required_feature_spec)

    # Preprocess parsed input with transform operation defined in
    # preprocessing_fn().
    transformed_features, _ = _apply_preprocessing(parsed_features,
                                                   model.tft_layer)
    # Run inference with ML model.
    return model(transformed_features)

  return serve_tf_examples_fn


def _input_fn(file_pattern: List[Text],
              data_accessor: DataAccessor,
              tf_transform_output: tft.TFTransformOutput,
              batch_size: int = 200) -> tf.data.Dataset:
  """Generates features and label for tuning/training.

  Args:
    file_pattern: List of paths or patterns of input tfrecord files.
    data_accessor: DataAccessor for converting input to RecordBatch.
    tf_transform_output: A TFTransformOutput.
    batch_size: representing the number of consecutive elements of returned
      dataset to combine in a single batch

  Returns:
    A dataset that contains (features, indices) tuple where features is a
      dictionary of Tensors, and indices is a single Tensor of label indices.
  """
  dataset = data_accessor.tf_dataset_factory(
      file_pattern,
      dataset_options.TensorFlowDatasetOptions(batch_size=batch_size),
      schema=tf_transform_output.raw_metadata.schema)

  transform_layer = tf_transform_output.transform_features_layer()
  def apply_transform(raw_features):
    return _apply_preprocessing(raw_features, transform_layer)

  return dataset.map(apply_transform).repeat()


def _build_keras_model() -> tf.keras.Model:
  """Creates a DNN Keras model for classifying penguin data.

  Returns:
    A Keras Model.
  """
  # The model below is built with Functional API, please refer to
  # https://www.tensorflow.org/guide/keras/overview for all API options.
  inputs = [
      keras.layers.Input(shape=(1,), name=_transformed_name(f))
      for f in _FEATURE_KEYS
  ]
  d = keras.layers.concatenate(inputs)
  for _ in range(2):
    d = keras.layers.Dense(8, activation='relu')(d)
  outputs = keras.layers.Dense(3)(d)

  model = keras.Model(inputs=inputs, outputs=outputs)
  model.compile(
      optimizer=keras.optimizers.Adam(1e-2),
      loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
      metrics=[keras.metrics.SparseCategoricalAccuracy()])

  model.summary(print_fn=logging.info)
  return model


# TFX Trainer will call this function.
def run_fn(fn_args: TrainerFnArgs):
  """Train the model based on given args.

  Args:
    fn_args: Holds args used to train the model as name/value pairs.
  """
  tf_transform_output = tft.TFTransformOutput(fn_args.transform_output)

  train_dataset = _input_fn(
      fn_args.train_files,
      fn_args.data_accessor,
      tf_transform_output,
      batch_size=_TRAIN_BATCH_SIZE)
  eval_dataset = _input_fn(
      fn_args.eval_files,
      fn_args.data_accessor,
      tf_transform_output,
      batch_size=_EVAL_BATCH_SIZE)

  model = _build_keras_model()
  model.fit(
      train_dataset,
      steps_per_epoch=fn_args.train_steps,
      validation_data=eval_dataset,
      validation_steps=fn_args.eval_steps)

  # NEW: Save a computation graph including transform layer.
  signatures = {
      'serving_default': _get_serve_tf_examples_fn(model, tf_transform_output),
  }
  model.save(fn_args.serving_model_dir, save_format='tf', signatures=signatures)
Writing penguin_utils.py

Now you have completed all of the preparation steps to build a TFX pipeline.

Write a pipeline definition

We define a function to create a TFX pipeline. A Pipeline object represents a TFX pipeline, which can be run using one of the pipeline orchestration systems that TFX supports.

from typing import List, Optional

from tfx.components import CsvExampleGen
from tfx.components import ExampleValidator
from tfx.components import Pusher
from tfx.components import StatisticsGen
from tfx.components import Trainer
from tfx.components import Transform
from tfx.components.trainer.executor import GenericExecutor
from tfx.dsl.components.base import executor_spec
from tfx.dsl.components.common.importer import Importer
from tfx.orchestration import metadata
from tfx.orchestration import pipeline
from tfx.proto import pusher_pb2
from tfx.proto import trainer_pb2
from tfx.types import standard_artifacts

def _create_pipeline(pipeline_name: str, pipeline_root: str, data_root: str,
                     schema_path: str, module_file: str, serving_model_dir: str,
                     metadata_path: str) -> pipeline.Pipeline:
  """Implements the penguin pipeline with TFX."""
  # Brings data into the pipeline or otherwise joins/converts training data.
  example_gen = CsvExampleGen(input_base=data_root)

  # Computes statistics over data for visualization and example validation.
  statistics_gen = StatisticsGen(examples=example_gen.outputs['examples'])

  # Import the schema.
  schema_importer = Importer(
      instance_name='import_schema',
      source_uri=schema_path,
      artifact_type=standard_artifacts.Schema)

  # Performs anomaly detection based on statistics and data schema.
  example_validator = ExampleValidator(
      statistics=statistics_gen.outputs['statistics'],
      schema=schema_importer.outputs['result'])

  # NEW: Transforms input data using preprocessing_fn in the 'module_file'.
  transform = Transform(
      examples=example_gen.outputs['examples'],
      schema=schema_importer.outputs['result'],
      materialize=False,
      module_file=module_file)

  # Uses user-provided Python function that trains a model.
  trainer = Trainer(
      module_file=module_file,
      custom_executor_spec=executor_spec.ExecutorClassSpec(GenericExecutor),
      examples=example_gen.outputs['examples'],

      # NEW: Pass transform_graph to the trainer.
      transform_graph=transform.outputs['transform_graph'],

      train_args=trainer_pb2.TrainArgs(num_steps=100),
      eval_args=trainer_pb2.EvalArgs(num_steps=5))

  # Pushes the model to a filesystem destination.
  pusher = Pusher(
      model=trainer.outputs['model'],
      push_destination=pusher_pb2.PushDestination(
          filesystem=pusher_pb2.PushDestination.Filesystem(
              base_directory=serving_model_dir)))

  components = [
      example_gen,
      statistics_gen,
      schema_importer,
      example_validator,

      transform,  # NEW: Transform component was added to the pipeline.

      trainer,
      pusher,
  ]

  return pipeline.Pipeline(
      pipeline_name=pipeline_name,
      pipeline_root=pipeline_root,
      metadata_connection_config=metadata.sqlite_metadata_connection_config(
          metadata_path),
      components=components)
INFO:absl:tensorflow_ranking is not available: No module named 'tensorflow_ranking'
INFO:absl:tensorflow_text is not available: No module named 'tensorflow_text'
INFO:absl:tensorflow_text is not available: No module named 'tensorflow_text'
WARNING:absl:RuntimeParameter is only supported on Cloud-based DAG runner currently.

Run the pipeline

We will use LocalDagRunner as in the previous tutorial.

import os
from tfx.orchestration.local import local_dag_runner

local_dag_runner.LocalDagRunner().run(
  _create_pipeline(
      pipeline_name=PIPELINE_NAME,
      pipeline_root=PIPELINE_ROOT,
      data_root=DATA_ROOT,
      schema_path=SCHEMA_PATH,
      module_file=_module_file,
      serving_model_dir=SERVING_MODEL_DIR,
      metadata_path=METADATA_PATH))
INFO:absl:Excluding no splits because exclude_splits is not set.
WARNING:absl:`instance_name` is deprecated, please set the node id directly using `with_id()` or the `.id` setter.
INFO:absl:Excluding no splits because exclude_splits is not set.
INFO:absl:Running pipeline:
 pipeline_info {
  id: "penguin-transform"
}
nodes {
  pipeline_node {
    node_info {
      type {
        name: "tfx.components.example_gen.csv_example_gen.component.CsvExampleGen"
      }
      id: "CsvExampleGen"
    }
    contexts {
      contexts {
        type {
          name: "pipeline"
        }
        name {
          field_value {
            string_value: "penguin-transform"
          }
        }
      }
      contexts {
        type {
          name: "pipeline_run"
        }
        name {
          field_value {
            string_value: "2021-04-30T09:19:53.944707"
          }
        }
      }
      contexts {
        type {
          name: "node"
        }
        name {
          field_value {
            string_value: "penguin-transform.CsvExampleGen"
          }
        }
      }
    }
    outputs {
      outputs {
        key: "examples"
        value {
          artifact_spec {
            type {
              name: "Examples"
              properties {
                key: "span"
                value: INT
              }
              properties {
                key: "split_names"
                value: STRING
              }
              properties {
                key: "version"
                value: INT
              }
            }
          }
        }
      }
    }
    parameters {
      parameters {
        key: "input_base"
        value {
          field_value {
            string_value: "/tmp/tfx-datay3dg5_38"
          }
        }
      }
      parameters {
        key: "input_config"
        value {
          field_value {
            string_value: "{\n  \"splits\": [\n    {\n      \"name\": \"single_split\",\n      \"pattern\": \"*\"\n    }\n  ]\n}"
          }
        }
      }
      parameters {
        key: "output_config"
        value {
          field_value {
            string_value: "{\n  \"split_config\": {\n    \"splits\": [\n      {\n        \"hash_buckets\": 2,\n        \"name\": \"train\"\n      },\n      {\n        \"hash_buckets\": 1,\n        \"name\": \"eval\"\n      }\n    ]\n  }\n}"
          }
        }
      }
      parameters {
        key: "output_data_format"
        value {
          field_value {
            int_value: 6
          }
        }
      }
    }
    downstream_nodes: "StatisticsGen"
    downstream_nodes: "Trainer"
    downstream_nodes: "Transform"
    execution_options {
      caching_options {
      }
    }
  }
}
nodes {
  pipeline_node {
    node_info {
      type {
        name: "tfx.dsl.components.common.importer.Importer"
      }
      id: "Importer.import_schema"
    }
    contexts {
      contexts {
        type {
          name: "pipeline"
        }
        name {
          field_value {
            string_value: "penguin-transform"
          }
        }
      }
      contexts {
        type {
          name: "pipeline_run"
        }
        name {
          field_value {
            string_value: "2021-04-30T09:19:53.944707"
          }
        }
      }
      contexts {
        type {
          name: "node"
        }
        name {
          field_value {
            string_value: "penguin-transform.Importer.import_schema"
          }
        }
      }
    }
    outputs {
      outputs {
        key: "result"
        value {
          artifact_spec {
            type {
              name: "Schema"
            }
          }
        }
      }
    }
    parameters {
      parameters {
        key: "artifact_uri"
        value {
          field_value {
            string_value: "schema"
          }
        }
      }
      parameters {
        key: "reimport"
        value {
          field_value {
            int_value: 0
          }
        }
      }
    }
    downstream_nodes: "ExampleValidator"
    downstream_nodes: "Transform"
    execution_options {
      caching_options {
      }
    }
  }
}
nodes {
  pipeline_node {
    node_info {
      type {
        name: "tfx.components.statistics_gen.component.StatisticsGen"
      }
      id: "StatisticsGen"
    }
    contexts {
      contexts {
        type {
          name: "pipeline"
        }
        name {
          field_value {
            string_value: "penguin-transform"
          }
        }
      }
      contexts {
        type {
          name: "pipeline_run"
        }
        name {
          field_value {
            string_value: "2021-04-30T09:19:53.944707"
          }
        }
      }
      contexts {
        type {
          name: "node"
        }
        name {
          field_value {
            string_value: "penguin-transform.StatisticsGen"
          }
        }
      }
    }
    inputs {
      inputs {
        key: "examples"
        value {
          channels {
            producer_node_query {
              id: "CsvExampleGen"
            }
            context_queries {
              type {
                name: "pipeline"
              }
              name {
                field_value {
                  string_value: "penguin-transform"
                }
              }
            }
            context_queries {
              type {
                name: "pipeline_run"
              }
              name {
                field_value {
                  string_value: "2021-04-30T09:19:53.944707"
                }
              }
            }
            context_queries {
              type {
                name: "node"
              }
              name {
                field_value {
                  string_value: "penguin-transform.CsvExampleGen"
                }
              }
            }
            artifact_query {
              type {
                name: "Examples"
              }
            }
            output_key: "examples"
          }
        }
      }
    }
    outputs {
      outputs {
        key: "statistics"
        value {
          artifact_spec {
            type {
              name: "ExampleStatistics"
              properties {
                key: "span"
                value: INT
              }
              properties {
                key: "split_names"
                value: STRING
              }
            }
          }
        }
      }
    }
    parameters {
      parameters {
        key: "exclude_splits"
        value {
          field_value {
            string_value: "[]"
          }
        }
      }
    }
    upstream_nodes: "CsvExampleGen"
    downstream_nodes: "ExampleValidator"
    execution_options {
      caching_options {
      }
    }
  }
}
nodes {
  pipeline_node {
    node_info {
      type {
        name: "tfx.components.transform.component.Transform"
      }
      id: "Transform"
    }
    contexts {
      contexts {
        type {
          name: "pipeline"
        }
        name {
          field_value {
            string_value: "penguin-transform"
          }
        }
      }
      contexts {
        type {
          name: "pipeline_run"
        }
        name {
          field_value {
            string_value: "2021-04-30T09:19:53.944707"
          }
        }
      }
      contexts {
        type {
          name: "node"
        }
        name {
          field_value {
            string_value: "penguin-transform.Transform"
          }
        }
      }
    }
    inputs {
      inputs {
        key: "examples"
        value {
          channels {
            producer_node_query {
              id: "CsvExampleGen"
            }
            context_queries {
              type {
                name: "pipeline"
              }
              name {
                field_value {
                  string_value: "penguin-transform"
                }
              }
            }
            context_queries {
              type {
                name: "pipeline_run"
              }
              name {
                field_value {
                  string_value: "2021-04-30T09:19:53.944707"
                }
              }
            }
            context_queries {
              type {
                name: "node"
              }
              name {
                field_value {
                  string_value: "penguin-transform.CsvExampleGen"
                }
              }
            }
            artifact_query {
              type {
                name: "Examples"
              }
            }
            output_key: "examples"
          }
        }
      }
      inputs {
        key: "schema"
        value {
          channels {
            producer_node_query {
              id: "Importer.import_schema"
            }
            context_queries {
              type {
                name: "pipeline"
              }
              name {
                field_value {
                  string_value: "penguin-transform"
                }
              }
            }
            context_queries {
              type {
                name: "pipeline_run"
              }
              name {
                field_value {
                  string_value: "2021-04-30T09:19:53.944707"
                }
              }
            }
            context_queries {
              type {
                name: "node"
              }
              name {
                field_value {
                  string_value: "penguin-transform.Importer.import_schema"
                }
              }
            }
            artifact_query {
              type {
                name: "Schema"
              }
            }
            output_key: "result"
          }
        }
      }
    }
    outputs {
      outputs {
        key: "transform_graph"
        value {
          artifact_spec {
            type {
              name: "TransformGraph"
            }
          }
        }
      }
      outputs {
        key: "updated_analyzer_cache"
        value {
          artifact_spec {
            type {
              name: "TransformCache"
            }
          }
        }
      }
    }
    parameters {
      parameters {
        key: "custom_config"
        value {
          field_value {
            string_value: "null"
          }
        }
      }
      parameters {
        key: "force_tf_compat_v1"
        value {
          field_value {
            int_value: 1
          }
        }
      }
      parameters {
        key: "module_file"
        value {
          field_value {
            string_value: "penguin_utils.py"
          }
        }
      }
    }
    upstream_nodes: "CsvExampleGen"
    upstream_nodes: "Importer.import_schema"
    downstream_nodes: "Trainer"
    execution_options {
      caching_options {
      }
    }
  }
}
nodes {
  pipeline_node {
    node_info {
      type {
        name: "tfx.components.example_validator.component.ExampleValidator"
      }
      id: "ExampleValidator"
    }
    contexts {
      contexts {
        type {
          name: "pipeline"
        }
        name {
          field_value {
            string_value: "penguin-transform"
          }
        }
      }
      contexts {
        type {
          name: "pipeline_run"
        }
        name {
          field_value {
            string_value: "2021-04-30T09:19:53.944707"
          }
        }
      }
      contexts {
        type {
          name: "node"
        }
        name {
          field_value {
            string_value: "penguin-transform.ExampleValidator"
          }
        }
      }
    }
    inputs {
      inputs {
        key: "schema"
        value {
          channels {
            producer_node_query {
              id: "Importer.import_schema"
            }
            context_queries {
              type {
                name: "pipeline"
              }
              name {
                field_value {
                  string_value: "penguin-transform"
                }
              }
            }
            context_queries {
              type {
                name: "pipeline_run"
              }
              name {
                field_value {
                  string_value: "2021-04-30T09:19:53.944707"
                }
              }
            }
            context_queries {
              type {
                name: "node"
              }
              name {
                field_value {
                  string_value: "penguin-transform.Importer.import_schema"
                }
              }
            }
            artifact_query {
              type {
                name: "Schema"
              }
            }
            output_key: "result"
          }
        }
      }
      inputs {
        key: "statistics"
        value {
          channels {
            producer_node_query {
              id: "StatisticsGen"
            }
            context_queries {
              type {
                name: "pipeline"
              }
              name {
                field_value {
                  string_value: "penguin-transform"
                }
              }
            }
            context_queries {
              type {
                name: "pipeline_run"
              }
              name {
                field_value {
                  string_value: "2021-04-30T09:19:53.944707"
                }
              }
            }
            context_queries {
              type {
                name: "node"
              }
              name {
                field_value {
                  string_value: "penguin-transform.StatisticsGen"
                }
              }
            }
            artifact_query {
              type {
                name: "ExampleStatistics"
              }
            }
            output_key: "statistics"
          }
        }
      }
    }
    outputs {
      outputs {
        key: "anomalies"
        value {
          artifact_spec {
            type {
              name: "ExampleAnomalies"
              properties {
                key: "span"
                value: INT
              }
              properties {
                key: "split_names"
                value: STRING
              }
            }
          }
        }
      }
    }
    parameters {
      parameters {
        key: "exclude_splits"
        value {
          field_value {
            string_value: "[]"
          }
        }
      }
    }
    upstream_nodes: "Importer.import_schema"
    upstream_nodes: "StatisticsGen"
    execution_options {
      caching_options {
      }
    }
  }
}
nodes {
  pipeline_node {
    node_info {
      type {
        name: "tfx.components.trainer.component.Trainer"
      }
      id: "Trainer"
    }
    contexts {
      contexts {
        type {
          name: "pipeline"
        }
        name {
          field_value {
            string_value: "penguin-transform"
          }
        }
      }
      contexts {
        type {
          name: "pipeline_run"
        }
        name {
          field_value {
            string_value: "2021-04-30T09:19:53.944707"
          }
        }
      }
      contexts {
        type {
          name: "node"
        }
        name {
          field_value {
            string_value: "penguin-transform.Trainer"
          }
        }
      }
    }
    inputs {
      inputs {
        key: "examples"
        value {
          channels {
            producer_node_query {
              id: "CsvExampleGen"
            }
            context_queries {
              type {
                name: "pipeline"
              }
              name {
                field_value {
                  string_value: "penguin-transform"
                }
              }
            }
            context_queries {
              type {
                name: "pipeline_run"
              }
              name {
                field_value {
                  string_value: "2021-04-30T09:19:53.944707"
                }
              }
            }
            context_queries {
              type {
                name: "node"
              }
              name {
                field_value {
                  string_value: "penguin-transform.CsvExampleGen"
                }
              }
            }
            artifact_query {
              type {
                name: "Examples"
              }
            }
            output_key: "examples"
          }
        }
      }
      inputs {
        key: "transform_graph"
        value {
          channels {
            producer_node_query {
              id: "Transform"
            }
            context_queries {
              type {
                name: "pipeline"
              }
              name {
                field_value {
                  string_value: "penguin-transform"
                }
              }
            }
            context_queries {
              type {
                name: "pipeline_run"
              }
              name {
                field_value {
                  string_value: "2021-04-30T09:19:53.944707"
                }
              }
            }
            context_queries {
              type {
                name: "node"
              }
              name {
                field_value {
                  string_value: "penguin-transform.Transform"
                }
              }
            }
            artifact_query {
              type {
                name: "TransformGraph"
              }
            }
            output_key: "transform_graph"
          }
        }
      }
    }
    outputs {
      outputs {
        key: "model"
        value {
          artifact_spec {
            type {
              name: "Model"
            }
          }
        }
      }
      outputs {
        key: "model_run"
        value {
          artifact_spec {
            type {
              name: "ModelRun"
            }
          }
        }
      }
    }
    parameters {
      parameters {
        key: "custom_config"
        value {
          field_value {
            string_value: "null"
          }
        }
      }
      parameters {
        key: "eval_args"
        value {
          field_value {
            string_value: "{\n  \"num_steps\": 5\n}"
          }
        }
      }
      parameters {
        key: "module_file"
        value {
          field_value {
            string_value: "penguin_utils.py"
          }
        }
      }
      parameters {
        key: "train_args"
        value {
          field_value {
            string_value: "{\n  \"num_steps\": 100\n}"
          }
        }
      }
    }
    upstream_nodes: "CsvExampleGen"
    upstream_nodes: "Transform"
    downstream_nodes: "Pusher"
    execution_options {
      caching_options {
      }
    }
  }
}
nodes {
  pipeline_node {
    node_info {
      type {
        name: "tfx.components.pusher.component.Pusher"
      }
      id: "Pusher"
    }
    contexts {
      contexts {
        type {
          name: "pipeline"
        }
        name {
          field_value {
            string_value: "penguin-transform"
          }
        }
      }
      contexts {
        type {
          name: "pipeline_run"
        }
        name {
          field_value {
            string_value: "2021-04-30T09:19:53.944707"
          }
        }
      }
      contexts {
        type {
          name: "node"
        }
        name {
          field_value {
            string_value: "penguin-transform.Pusher"
          }
        }
      }
    }
    inputs {
      inputs {
        key: "model"
        value {
          channels {
            producer_node_query {
              id: "Trainer"
            }
            context_queries {
              type {
                name: "pipeline"
              }
              name {
                field_value {
                  string_value: "penguin-transform"
                }
              }
            }
            context_queries {
              type {
                name: "pipeline_run"
              }
              name {
                field_value {
                  string_value: "2021-04-30T09:19:53.944707"
                }
              }
            }
            context_queries {
              type {
                name: "node"
              }
              name {
                field_value {
                  string_value: "penguin-transform.Trainer"
                }
              }
            }
            artifact_query {
              type {
                name: "Model"
              }
            }
            output_key: "model"
          }
        }
      }
    }
    outputs {
      outputs {
        key: "pushed_model"
        value {
          artifact_spec {
            type {
              name: "PushedModel"
            }
          }
        }
      }
    }
    parameters {
      parameters {
        key: "custom_config"
        value {
          field_value {
            string_value: "null"
          }
        }
      }
      parameters {
        key: "push_destination"
        value {
          field_value {
            string_value: "{\n  \"filesystem\": {\n    \"base_directory\": \"serving_model/penguin-transform\"\n  }\n}"
          }
        }
      }
    }
    upstream_nodes: "Trainer"
    execution_options {
      caching_options {
      }
    }
  }
}
runtime_spec {
  pipeline_root {
    field_value {
      string_value: "pipelines/penguin-transform"
    }
  }
  pipeline_run_id {
    field_value {
      string_value: "2021-04-30T09:19:53.944707"
    }
  }
}
execution_mode: SYNC
deployment_config {
  type_url: "type.googleapis.com/tfx.orchestration.IntermediateDeploymentConfig"
  value: "\n\234\001\n\020ExampleValidator\022\207\001\nOtype.googleapis.com/tfx.orchestration.executable_spec.PythonClassExecutableSpec\0224\n2tfx.components.example_validator.executor.Executor\n\220\001\n\007Trainer\022\204\001\nOtype.googleapis.com/tfx.orchestration.executable_spec.PythonClassExecutableSpec\0221\n/tfx.components.trainer.executor.GenericExecutor\n\206\001\n\006Pusher\022|\nOtype.googleapis.com/tfx.orchestration.executable_spec.PythonClassExecutableSpec\022)\n\'tfx.components.pusher.executor.Executor\n\214\001\n\tTransform\022\177\nOtype.googleapis.com/tfx.orchestration.executable_spec.PythonClassExecutableSpec\022,\n*tfx.components.transform.executor.Executor\n\243\001\n\rCsvExampleGen\022\221\001\nOtype.googleapis.com/tfx.orchestration.executable_spec.PythonClassExecutableSpec\022>\n<tfx.components.example_gen.csv_example_gen.executor.Executor\n\226\001\n\rStatisticsGen\022\204\001\nOtype.googleapis.com/tfx.orchestration.executable_spec.PythonClassExecutableSpec\0221\n/tfx.components.statistics_gen.executor.Executor\022\230\001\n\rCsvExampleGen\022\206\001\nOtype.googleapis.com/tfx.orchestration.executable_spec.PythonClassExecutableSpec\0223\n1tfx.components.example_gen.driver.FileBasedDriver*`\n0type.googleapis.com/ml_metadata.ConnectionConfig\022,\032*\n&metadata/penguin-transform/metadata.db\020\003"
}

INFO:absl:Using deployment config:
 executor_specs {
  key: "CsvExampleGen"
  value {
    python_class_executable_spec {
      class_path: "tfx.components.example_gen.csv_example_gen.executor.Executor"
    }
  }
}
executor_specs {
  key: "ExampleValidator"
  value {
    python_class_executable_spec {
      class_path: "tfx.components.example_validator.executor.Executor"
    }
  }
}
executor_specs {
  key: "Pusher"
  value {
    python_class_executable_spec {
      class_path: "tfx.components.pusher.executor.Executor"
    }
  }
}
executor_specs {
  key: "StatisticsGen"
  value {
    python_class_executable_spec {
      class_path: "tfx.components.statistics_gen.executor.Executor"
    }
  }
}
executor_specs {
  key: "Trainer"
  value {
    python_class_executable_spec {
      class_path: "tfx.components.trainer.executor.GenericExecutor"
    }
  }
}
executor_specs {
  key: "Transform"
  value {
    python_class_executable_spec {
      class_path: "tfx.components.transform.executor.Executor"
    }
  }
}
custom_driver_specs {
  key: "CsvExampleGen"
  value {
    python_class_executable_spec {
      class_path: "tfx.components.example_gen.driver.FileBasedDriver"
    }
  }
}
metadata_connection_config {
  sqlite {
    filename_uri: "metadata/penguin-transform/metadata.db"
    connection_mode: READWRITE_OPENCREATE
  }
}

INFO:absl:Using connection config:
 sqlite {
  filename_uri: "metadata/penguin-transform/metadata.db"
  connection_mode: READWRITE_OPENCREATE
}

INFO:absl:Component CsvExampleGen is running.
INFO:absl:Running launcher for node_info {
  type {
    name: "tfx.components.example_gen.csv_example_gen.component.CsvExampleGen"
  }
  id: "CsvExampleGen"
}
contexts {
  contexts {
    type {
      name: "pipeline"
    }
    name {
      field_value {
        string_value: "penguin-transform"
      }
    }
  }
  contexts {
    type {
      name: "pipeline_run"
    }
    name {
      field_value {
        string_value: "2021-04-30T09:19:53.944707"
      }
    }
  }
  contexts {
    type {
      name: "node"
    }
    name {
      field_value {
        string_value: "penguin-transform.CsvExampleGen"
      }
    }
  }
}
outputs {
  outputs {
    key: "examples"
    value {
      artifact_spec {
        type {
          name: "Examples"
          properties {
            key: "span"
            value: INT
          }
          properties {
            key: "split_names"
            value: STRING
          }
          properties {
            key: "version"
            value: INT
          }
        }
      }
    }
  }
}
parameters {
  parameters {
    key: "input_base"
    value {
      field_value {
        string_value: "/tmp/tfx-datay3dg5_38"
      }
    }
  }
  parameters {
    key: "input_config"
    value {
      field_value {
        string_value: "{\n  \"splits\": [\n    {\n      \"name\": \"single_split\",\n      \"pattern\": \"*\"\n    }\n  ]\n}"
      }
    }
  }
  parameters {
    key: "output_config"
    value {
      field_value {
        string_value: "{\n  \"split_config\": {\n    \"splits\": [\n      {\n        \"hash_buckets\": 2,\n        \"name\": \"train\"\n      },\n      {\n        \"hash_buckets\": 1,\n        \"name\": \"eval\"\n      }\n    ]\n  }\n}"
      }
    }
  }
  parameters {
    key: "output_data_format"
    value {
      field_value {
        int_value: 6
      }
    }
  }
}
downstream_nodes: "StatisticsGen"
downstream_nodes: "Trainer"
downstream_nodes: "Transform"
execution_options {
  caching_options {
  }
}

INFO:absl:MetadataStore with DB connection initialized
INFO:absl:select span and version = (0, None)
INFO:absl:latest span and version = (0, None)
INFO:absl:MetadataStore with DB connection initialized
INFO:absl:Going to run a new execution 1
INFO:absl:Going to run a new execution: ExecutionInfo(execution_id=1, input_dict={}, output_dict=defaultdict(<class 'list'>, {'examples': [Artifact(artifact: uri: "pipelines/penguin-transform/CsvExampleGen/examples/1"
custom_properties {
  key: "input_fingerprint"
  value {
    string_value: "split:single_split,num_files:1,total_bytes:13161,xor_checksum:1619774391,sum_checksum:1619774391"
  }
}
custom_properties {
  key: "name"
  value {
    string_value: "penguin-transform:2021-04-30T09:19:53.944707:CsvExampleGen:examples:0"
  }
}
custom_properties {
  key: "span"
  value {
    string_value: "0"
  }
}
, artifact_type: name: "Examples"
properties {
  key: "span"
  value: INT
}
properties {
  key: "split_names"
  value: STRING
}
properties {
  key: "version"
  value: INT
}
)]}), exec_properties={'output_data_format': 6, 'output_config': '{\n  "split_config": {\n    "splits": [\n      {\n        "hash_buckets": 2,\n        "name": "train"\n      },\n      {\n        "hash_buckets": 1,\n        "name": "eval"\n      }\n    ]\n  }\n}', 'input_config': '{\n  "splits": [\n    {\n      "name": "single_split",\n      "pattern": "*"\n    }\n  ]\n}', 'input_base': '/tmp/tfx-datay3dg5_38', 'span': 0, 'version': None, 'input_fingerprint': 'split:single_split,num_files:1,total_bytes:13161,xor_checksum:1619774391,sum_checksum:1619774391'}, execution_output_uri='pipelines/penguin-transform/CsvExampleGen/.system/executor_execution/1/executor_output.pb', stateful_working_dir='pipelines/penguin-transform/CsvExampleGen/.system/stateful_working_dir/2021-04-30T09:19:53.944707', tmp_dir='pipelines/penguin-transform/CsvExampleGen/.system/executor_execution/1/.temp/', pipeline_node=node_info {
  type {
    name: "tfx.components.example_gen.csv_example_gen.component.CsvExampleGen"
  }
  id: "CsvExampleGen"
}
contexts {
  contexts {
    type {
      name: "pipeline"
    }
    name {
      field_value {
        string_value: "penguin-transform"
      }
    }
  }
  contexts {
    type {
      name: "pipeline_run"
    }
    name {
      field_value {
        string_value: "2021-04-30T09:19:53.944707"
      }
    }
  }
  contexts {
    type {
      name: "node"
    }
    name {
      field_value {
        string_value: "penguin-transform.CsvExampleGen"
      }
    }
  }
}
outputs {
  outputs {
    key: "examples"
    value {
      artifact_spec {
        type {
          name: "Examples"
          properties {
            key: "span"
            value: INT
          }
          properties {
            key: "split_names"
            value: STRING
          }
          properties {
            key: "version"
            value: INT
          }
        }
      }
    }
  }
}
parameters {
  parameters {
    key: "input_base"
    value {
      field_value {
        string_value: "/tmp/tfx-datay3dg5_38"
      }
    }
  }
  parameters {
    key: "input_config"
    value {
      field_value {
        string_value: "{\n  \"splits\": [\n    {\n      \"name\": \"single_split\",\n      \"pattern\": \"*\"\n    }\n  ]\n}"
      }
    }
  }
  parameters {
    key: "output_config"
    value {
      field_value {
        string_value: "{\n  \"split_config\": {\n    \"splits\": [\n      {\n        \"hash_buckets\": 2,\n        \"name\": \"train\"\n      },\n      {\n        \"hash_buckets\": 1,\n        \"name\": \"eval\"\n      }\n    ]\n  }\n}"
      }
    }
  }
  parameters {
    key: "output_data_format"
    value {
      field_value {
        int_value: 6
      }
    }
  }
}
downstream_nodes: "StatisticsGen"
downstream_nodes: "Trainer"
downstream_nodes: "Transform"
execution_options {
  caching_options {
  }
}
, pipeline_info=id: "penguin-transform"
, pipeline_run_id='2021-04-30T09:19:53.944707')
WARNING:apache_beam.options.pipeline_options:Discarding unparseable args: ['-f', '/tmp/tmpdlzuirfc.json', '--HistoryManager.hist_file=:memory:']
INFO:absl:Attempting to infer TFX Python dependency for beam
INFO:absl:Copying all content from install dir /tmpfs/src/tf_docs_env/lib/python3.6/site-packages/tfx to temp dir /tmp/tmprtbgefme/build/tfx
INFO:absl:Generating a temp setup file at /tmp/tmprtbgefme/build/tfx/setup.py
INFO:absl:Creating temporary sdist package, logs available at /tmp/tmprtbgefme/build/tfx/setup.log
INFO:absl:Added --extra_package=/tmp/tmprtbgefme/build/tfx/dist/tfx_ephemeral-0.29.0.tar.gz to beam args
INFO:absl:Generating examples.
WARNING:apache_beam.runners.interactive.interactive_environment:Dependencies required for Interactive Beam PCollection visualization are not available, please use: `pip install apache-beam[interactive]` to install necessary dependencies to enable all data visualization features.
INFO:absl:Processing input csv data /tmp/tfx-datay3dg5_38/* to TFExample.
WARNING:apache_beam.options.pipeline_options:Discarding unparseable args: ['-f', '/tmp/tmpdlzuirfc.json', '--HistoryManager.hist_file=:memory:']
WARNING:apache_beam.io.tfrecordio:Couldn't find python-snappy so the implementation of _TFRecordUtil._masked_crc32c is not as fast as it could be.
INFO:absl:Examples generated.
INFO:absl:Cleaning up stateless execution info.
INFO:absl:Execution 1 succeeded.
INFO:absl:Cleaning up stateful execution info.
INFO:absl:Publishing output artifacts defaultdict(<class 'list'>, {'examples': [Artifact(artifact: uri: "pipelines/penguin-transform/CsvExampleGen/examples/1"
custom_properties {
  key: "input_fingerprint"
  value {
    string_value: "split:single_split,num_files:1,total_bytes:13161,xor_checksum:1619774391,sum_checksum:1619774391"
  }
}
custom_properties {
  key: "name"
  value {
    string_value: "penguin-transform:2021-04-30T09:19:53.944707:CsvExampleGen:examples:0"
  }
}
custom_properties {
  key: "span"
  value {
    string_value: "0"
  }
}
custom_properties {
  key: "tfx_version"
  value {
    string_value: "0.29.0"
  }
}
, artifact_type: name: "Examples"
properties {
  key: "span"
  value: INT
}
properties {
  key: "split_names"
  value: STRING
}
properties {
  key: "version"
  value: INT
}
)]}) for execution 1
INFO:absl:MetadataStore with DB connection initialized
INFO:absl:Component CsvExampleGen is finished.
INFO:absl:Component Importer.import_schema is running.
INFO:absl:Running launcher for node_info {
  type {
    name: "tfx.dsl.components.common.importer.Importer"
  }
  id: "Importer.import_schema"
}
contexts {
  contexts {
    type {
      name: "pipeline"
    }
    name {
      field_value {
        string_value: "penguin-transform"
      }
    }
  }
  contexts {
    type {
      name: "pipeline_run"
    }
    name {
      field_value {
        string_value: "2021-04-30T09:19:53.944707"
      }
    }
  }
  contexts {
    type {
      name: "node"
    }
    name {
      field_value {
        string_value: "penguin-transform.Importer.import_schema"
      }
    }
  }
}
outputs {
  outputs {
    key: "result"
    value {
      artifact_spec {
        type {
          name: "Schema"
        }
      }
    }
  }
}
parameters {
  parameters {
    key: "artifact_uri"
    value {
      field_value {
        string_value: "schema"
      }
    }
  }
  parameters {
    key: "reimport"
    value {
      field_value {
        int_value: 0
      }
    }
  }
}
downstream_nodes: "ExampleValidator"
downstream_nodes: "Transform"
execution_options {
  caching_options {
  }
}

INFO:absl:Running as an importer node.
INFO:absl:MetadataStore with DB connection initialized
INFO:absl:Processing source uri: schema, properties: {}, custom_properties: {}
INFO:absl:Component Importer.import_schema is finished.
INFO:absl:Component StatisticsGen is running.
INFO:absl:Running launcher for node_info {
  type {
    name: "tfx.components.statistics_gen.component.StatisticsGen"
  }
  id: "StatisticsGen"
}
contexts {
  contexts {
    type {
      name: "pipeline"
    }
    name {
      field_value {
        string_value: "penguin-transform"
      }
    }
  }
  contexts {
    type {
      name: "pipeline_run"
    }
    name {
      field_value {
        string_value: "2021-04-30T09:19:53.944707"
      }
    }
  }
  contexts {
    type {
      name: "node"
    }
    name {
      field_value {
        string_value: "penguin-transform.StatisticsGen"
      }
    }
  }
}
inputs {
  inputs {
    key: "examples"
    value {
      channels {
        producer_node_query {
          id: "CsvExampleGen"
        }
        context_queries {
          type {
            name: "pipeline"
          }
          name {
            field_value {
              string_value: "penguin-transform"
            }
          }
        }
        context_queries {
          type {
            name: "pipeline_run"
          }
          name {
            field_value {
              string_value: "2021-04-30T09:19:53.944707"
            }
          }
        }
        context_queries {
          type {
            name: "node"
          }
          name {
            field_value {
              string_value: "penguin-transform.CsvExampleGen"
            }
          }
        }
        artifact_query {
          type {
            name: "Examples"
          }
        }
        output_key: "examples"
      }
    }
  }
}
outputs {
  outputs {
    key: "statistics"
    value {
      artifact_spec {
        type {
          name: "ExampleStatistics"
          properties {
            key: "span"
            value: INT
          }
          properties {
            key: "split_names"
            value: STRING
          }
        }
      }
    }
  }
}
parameters {
  parameters {
    key: "exclude_splits"
    value {
      field_value {
        string_value: "[]"
      }
    }
  }
}
upstream_nodes: "CsvExampleGen"
downstream_nodes: "ExampleValidator"
execution_options {
  caching_options {
  }
}

INFO:absl:MetadataStore with DB connection initialized
INFO:absl:MetadataStore with DB connection initialized
INFO:absl:Going to run a new execution 3
INFO:absl:Going to run a new execution: ExecutionInfo(execution_id=3, input_dict={'examples': [Artifact(artifact: id: 1
type_id: 6
uri: "pipelines/penguin-transform/CsvExampleGen/examples/1"
properties {
  key: "split_names"
  value {
    string_value: "[\"train\", \"eval\"]"
  }
}
custom_properties {
  key: "input_fingerprint"
  value {
    string_value: "split:single_split,num_files:1,total_bytes:13161,xor_checksum:1619774391,sum_checksum:1619774391"
  }
}
custom_properties {
  key: "name"
  value {
    string_value: "penguin-transform:2021-04-30T09:19:53.944707:CsvExampleGen:examples:0"
  }
}
custom_properties {
  key: "payload_format"
  value {
    string_value: "FORMAT_TF_EXAMPLE"
  }
}
custom_properties {
  key: "span"
  value {
    string_value: "0"
  }
}
custom_properties {
  key: "tfx_version"
  value {
    string_value: "0.29.0"
  }
}
state: LIVE
create_time_since_epoch: 1619774399009
last_update_time_since_epoch: 1619774399009
, artifact_type: id: 6
name: "Examples"
properties {
  key: "span"
  value: INT
}
properties {
  key: "split_names"
  value: STRING
}
properties {
  key: "version"
  value: INT
}
)]}, output_dict=defaultdict(<class 'list'>, {'statistics': [Artifact(artifact: uri: "pipelines/penguin-transform/StatisticsGen/statistics/3"
custom_properties {
  key: "name"
  value {
    string_value: "penguin-transform:2021-04-30T09:19:53.944707:StatisticsGen:statistics:0"
  }
}
, artifact_type: name: "ExampleStatistics"
properties {
  key: "span"
  value: INT
}
properties {
  key: "split_names"
  value: STRING
}
)]}), exec_properties={'exclude_splits': '[]'}, execution_output_uri='pipelines/penguin-transform/StatisticsGen/.system/executor_execution/3/executor_output.pb', stateful_working_dir='pipelines/penguin-transform/StatisticsGen/.system/stateful_working_dir/2021-04-30T09:19:53.944707', tmp_dir='pipelines/penguin-transform/StatisticsGen/.system/executor_execution/3/.temp/', pipeline_node=node_info {
  type {
    name: "tfx.components.statistics_gen.component.StatisticsGen"
  }
  id: "StatisticsGen"
}
contexts {
  contexts {
    type {
      name: "pipeline"
    }
    name {
      field_value {
        string_value: "penguin-transform"
      }
    }
  }
  contexts {
    type {
      name: "pipeline_run"
    }
    name {
      field_value {
        string_value: "2021-04-30T09:19:53.944707"
      }
    }
  }
  contexts {
    type {
      name: "node"
    }
    name {
      field_value {
        string_value: "penguin-transform.StatisticsGen"
      }
    }
  }
}
inputs {
  inputs {
    key: "examples"
    value {
      channels {
        producer_node_query {
          id: "CsvExampleGen"
        }
        context_queries {
          type {
            name: "pipeline"
          }
          name {
            field_value {
              string_value: "penguin-transform"
            }
          }
        }
        context_queries {
          type {
            name: "pipeline_run"
          }
          name {
            field_value {
              string_value: "2021-04-30T09:19:53.944707"
            }
          }
        }
        context_queries {
          type {
            name: "node"
          }
          name {
            field_value {
              string_value: "penguin-transform.CsvExampleGen"
            }
          }
        }
        artifact_query {
          type {
            name: "Examples"
          }
        }
        output_key: "examples"
      }
    }
  }
}
outputs {
  outputs {
    key: "statistics"
    value {
      artifact_spec {
        type {
          name: "ExampleStatistics"
          properties {
            key: "span"
            value: INT
          }
          properties {
            key: "split_names"
            value: STRING
          }
        }
      }
    }
  }
}
parameters {
  parameters {
    key: "exclude_splits"
    value {
      field_value {
        string_value: "[]"
      }
    }
  }
}
upstream_nodes: "CsvExampleGen"
downstream_nodes: "ExampleValidator"
execution_options {
  caching_options {
  }
}
, pipeline_info=id: "penguin-transform"
, pipeline_run_id='2021-04-30T09:19:53.944707')
WARNING:apache_beam.options.pipeline_options:Discarding unparseable args: ['-f', '/tmp/tmpdlzuirfc.json', '--HistoryManager.hist_file=:memory:']
INFO:absl:Attempting to infer TFX Python dependency for beam
INFO:absl:Copying all content from install dir /tmpfs/src/tf_docs_env/lib/python3.6/site-packages/tfx to temp dir /tmp/tmpzau0zpp4/build/tfx
INFO:absl:Generating a temp setup file at /tmp/tmpzau0zpp4/build/tfx/setup.py
INFO:absl:Creating temporary sdist package, logs available at /tmp/tmpzau0zpp4/build/tfx/setup.log
INFO:absl:Added --extra_package=/tmp/tmpzau0zpp4/build/tfx/dist/tfx_ephemeral-0.29.0.tar.gz to beam args
INFO:absl:Generating statistics for split train.
INFO:absl:Statistics for split train written to pipelines/penguin-transform/StatisticsGen/statistics/3/Split-train.
INFO:absl:Generating statistics for split eval.
INFO:absl:Statistics for split eval written to pipelines/penguin-transform/StatisticsGen/statistics/3/Split-eval.
WARNING:apache_beam.options.pipeline_options:Discarding unparseable args: ['-f', '/tmp/tmpdlzuirfc.json', '--HistoryManager.hist_file=:memory:']
INFO:absl:Cleaning up stateless execution info.
INFO:absl:Execution 3 succeeded.
INFO:absl:Cleaning up stateful execution info.
INFO:absl:Publishing output artifacts defaultdict(<class 'list'>, {'statistics': [Artifact(artifact: uri: "pipelines/penguin-transform/StatisticsGen/statistics/3"
custom_properties {
  key: "name"
  value {
    string_value: "penguin-transform:2021-04-30T09:19:53.944707:StatisticsGen:statistics:0"
  }
}
custom_properties {
  key: "tfx_version"
  value {
    string_value: "0.29.0"
  }
}
, artifact_type: name: "ExampleStatistics"
properties {
  key: "span"
  value: INT
}
properties {
  key: "split_names"
  value: STRING
}
)]}) for execution 3
INFO:absl:MetadataStore with DB connection initialized
INFO:absl:Component StatisticsGen is finished.
INFO:absl:Component Transform is running.
INFO:absl:Running launcher for node_info {
  type {
    name: "tfx.components.transform.component.Transform"
  }
  id: "Transform"
}
contexts {
  contexts {
    type {
      name: "pipeline"
    }
    name {
      field_value {
        string_value: "penguin-transform"
      }
    }
  }
  contexts {
    type {
      name: "pipeline_run"
    }
    name {
      field_value {
        string_value: "2021-04-30T09:19:53.944707"
      }
    }
  }
  contexts {
    type {
      name: "node"
    }
    name {
      field_value {
        string_value: "penguin-transform.Transform"
      }
    }
  }
}
inputs {
  inputs {
    key: "examples"
    value {
      channels {
        producer_node_query {
          id: "CsvExampleGen"
        }
        context_queries {
          type {
            name: "pipeline"
          }
          name {
            field_value {
              string_value: "penguin-transform"
            }
          }
        }
        context_queries {
          type {
            name: "pipeline_run"
          }
          name {
            field_value {
              string_value: "2021-04-30T09:19:53.944707"
            }
          }
        }
        context_queries {
          type {
            name: "node"
          }
          name {
            field_value {
              string_value: "penguin-transform.CsvExampleGen"
            }
          }
        }
        artifact_query {
          type {
            name: "Examples"
          }
        }
        output_key: "examples"
      }
    }
  }
  inputs {
    key: "schema"
    value {
      channels {
        producer_node_query {
          id: "Importer.import_schema"
        }
        context_queries {
          type {
            name: "pipeline"
          }
          name {
            field_value {
              string_value: "penguin-transform"
            }
          }
        }
        context_queries {
          type {
            name: "pipeline_run"
          }
          name {
            field_value {
              string_value: "2021-04-30T09:19:53.944707"
            }
          }
        }
        context_queries {
          type {
            name: "node"
          }
          name {
            field_value {
              string_value: "penguin-transform.Importer.import_schema"
            }
          }
        }
        artifact_query {
          type {
            name: "Schema"
          }
        }
        output_key: "result"
      }
    }
  }
}
outputs {
  outputs {
    key: "transform_graph"
    value {
      artifact_spec {
        type {
          name: "TransformGraph"
        }
      }
    }
  }
  outputs {
    key: "updated_analyzer_cache"
    value {
      artifact_spec {
        type {
          name: "TransformCache"
        }
      }
    }
  }
}
parameters {
  parameters {
    key: "custom_config"
    value {
      field_value {
        string_value: "null"
      }
    }
  }
  parameters {
    key: "force_tf_compat_v1"
    value {
      field_value {
        int_value: 1
      }
    }
  }
  parameters {
    key: "module_file"
    value {
      field_value {
        string_value: "penguin_utils.py"
      }
    }
  }
}
upstream_nodes: "CsvExampleGen"
upstream_nodes: "Importer.import_schema"
downstream_nodes: "Trainer"
execution_options {
  caching_options {
  }
}

INFO:absl:MetadataStore with DB connection initialized
INFO:absl:MetadataStore with DB connection initialized
INFO:absl:Going to run a new execution 4
INFO:absl:Going to run a new execution: ExecutionInfo(execution_id=4, input_dict={'schema': [Artifact(artifact: id: 2
type_id: 8
uri: "schema"
state: LIVE
create_time_since_epoch: 1619774399033
last_update_time_since_epoch: 1619774399033
, artifact_type: id: 8
name: "Schema"
)], 'examples': [Artifact(artifact: id: 1
type_id: 6
uri: "pipelines/penguin-transform/CsvExampleGen/examples/1"
properties {
  key: "split_names"
  value {
    string_value: "[\"train\", \"eval\"]"
  }
}
custom_properties {
  key: "input_fingerprint"
  value {
    string_value: "split:single_split,num_files:1,total_bytes:13161,xor_checksum:1619774391,sum_checksum:1619774391"
  }
}
custom_properties {
  key: "name"
  value {
    string_value: "penguin-transform:2021-04-30T09:19:53.944707:CsvExampleGen:examples:0"
  }
}
custom_properties {
  key: "payload_format"
  value {
    string_value: "FORMAT_TF_EXAMPLE"
  }
}
custom_properties {
  key: "span"
  value {
    string_value: "0"
  }
}
custom_properties {
  key: "tfx_version"
  value {
    string_value: "0.29.0"
  }
}
state: LIVE
create_time_since_epoch: 1619774399009
last_update_time_since_epoch: 1619774399009
, artifact_type: id: 6
name: "Examples"
properties {
  key: "span"
  value: INT
}
properties {
  key: "split_names"
  value: STRING
}
properties {
  key: "version"
  value: INT
}
)]}, output_dict=defaultdict(<class 'list'>, {'updated_analyzer_cache': [Artifact(artifact: uri: "pipelines/penguin-transform/Transform/updated_analyzer_cache/4"
custom_properties {
  key: "name"
  value {
    string_value: "penguin-transform:2021-04-30T09:19:53.944707:Transform:updated_analyzer_cache:0"
  }
}
, artifact_type: name: "TransformCache"
)], 'transform_graph': [Artifact(artifact: uri: "pipelines/penguin-transform/Transform/transform_graph/4"
custom_properties {
  key: "name"
  value {
    string_value: "penguin-transform:2021-04-30T09:19:53.944707:Transform:transform_graph:0"
  }
}
, artifact_type: name: "TransformGraph"
)]}), exec_properties={'custom_config': 'null', 'force_tf_compat_v1': 1, 'module_file': 'penguin_utils.py'}, execution_output_uri='pipelines/penguin-transform/Transform/.system/executor_execution/4/executor_output.pb', stateful_working_dir='pipelines/penguin-transform/Transform/.system/stateful_working_dir/2021-04-30T09:19:53.944707', tmp_dir='pipelines/penguin-transform/Transform/.system/executor_execution/4/.temp/', pipeline_node=node_info {
  type {
    name: "tfx.components.transform.component.Transform"
  }
  id: "Transform"
}
contexts {
  contexts {
    type {
      name: "pipeline"
    }
    name {
      field_value {
        string_value: "penguin-transform"
      }
    }
  }
  contexts {
    type {
      name: "pipeline_run"
    }
    name {
      field_value {
        string_value: "2021-04-30T09:19:53.944707"
      }
    }
  }
  contexts {
    type {
      name: "node"
    }
    name {
      field_value {
        string_value: "penguin-transform.Transform"
      }
    }
  }
}
inputs {
  inputs {
    key: "examples"
    value {
      channels {
        producer_node_query {
          id: "CsvExampleGen"
        }
        context_queries {
          type {
            name: "pipeline"
          }
          name {
            field_value {
              string_value: "penguin-transform"
            }
          }
        }
        context_queries {
          type {
            name: "pipeline_run"
          }
          name {
            field_value {
              string_value: "2021-04-30T09:19:53.944707"
            }
          }
        }
        context_queries {
          type {
            name: "node"
          }
          name {
            field_value {
              string_value: "penguin-transform.CsvExampleGen"
            }
          }
        }
        artifact_query {
          type {
            name: "Examples"
          }
        }
        output_key: "examples"
      }
    }
  }
  inputs {
    key: "schema"
    value {
      channels {
        producer_node_query {
          id: "Importer.import_schema"
        }
        context_queries {
          type {
            name: "pipeline"
          }
          name {
            field_value {
              string_value: "penguin-transform"
            }
          }
        }
        context_queries {
          type {
            name: "pipeline_run"
          }
          name {
            field_value {
              string_value: "2021-04-30T09:19:53.944707"
            }
          }
        }
        context_queries {
          type {
            name: "node"
          }
          name {
            field_value {
              string_value: "penguin-transform.Importer.import_schema"
            }
          }
        }
        artifact_query {
          type {
            name: "Schema"
          }
        }
        output_key: "result"
      }
    }
  }
}
outputs {
  outputs {
    key: "transform_graph"
    value {
      artifact_spec {
        type {
          name: "TransformGraph"
        }
      }
    }
  }
  outputs {
    key: "updated_analyzer_cache"
    value {
      artifact_spec {
        type {
          name: "TransformCache"
        }
      }
    }
  }
}
parameters {
  parameters {
    key: "custom_config"
    value {
      field_value {
        string_value: "null"
      }
    }
  }
  parameters {
    key: "force_tf_compat_v1"
    value {
      field_value {
        int_value: 1
      }
    }
  }
  parameters {
    key: "module_file"
    value {
      field_value {
        string_value: "penguin_utils.py"
      }
    }
  }
}
upstream_nodes: "CsvExampleGen"
upstream_nodes: "Importer.import_schema"
downstream_nodes: "Trainer"
execution_options {
  caching_options {
  }
}
, pipeline_info=id: "penguin-transform"
, pipeline_run_id='2021-04-30T09:19:53.944707')
WARNING:apache_beam.options.pipeline_options:Discarding unparseable args: ['-f', '/tmp/tmpdlzuirfc.json', '--HistoryManager.hist_file=:memory:']
INFO:absl:Attempting to infer TFX Python dependency for beam
INFO:absl:Copying all content from install dir /tmpfs/src/tf_docs_env/lib/python3.6/site-packages/tfx to temp dir /tmp/tmpdnbz610h/build/tfx
INFO:absl:Generating a temp setup file at /tmp/tmpdnbz610h/build/tfx/setup.py
INFO:absl:Creating temporary sdist package, logs available at /tmp/tmpdnbz610h/build/tfx/setup.log
INFO:absl:Added --extra_package=/tmp/tmpdnbz610h/build/tfx/dist/tfx_ephemeral-0.29.0.tar.gz to beam args
INFO:absl:Analyze the 'train' split and transform all splits when splits_config is not set.
WARNING:absl:The default value of `force_tf_compat_v1` will change in a future release from `True` to `False`. Since this pipeline has TF 2 behaviors enabled, Transform will use native TF 2 at that point. You can test this behavior now by passing `force_tf_compat_v1=False` or disable it by explicitly setting `force_tf_compat_v1=True` in the Transform component.
INFO:absl:Loading source_path penguin_utils.py as name user_module_0 because it has not been loaded before.
INFO:absl:penguin_utils.py is already loaded, reloading
INFO:absl:Feature body_mass_g has a shape dim {
  size: 1
}
. Setting to DenseTensor.
INFO:absl:Feature culmen_depth_mm has a shape dim {
  size: 1
}
. Setting to DenseTensor.
INFO:absl:Feature culmen_length_mm has a shape dim {
  size: 1
}
. Setting to DenseTensor.
INFO:absl:Feature flipper_length_mm has a shape dim {
  size: 1
}
. Setting to DenseTensor.
INFO:absl:Feature island has a shape dim {
  size: 1
}
. Setting to DenseTensor.
INFO:absl:Feature sex has a shape dim {
  size: 1
}
. Setting to DenseTensor.
INFO:absl:Feature species has a shape dim {
  size: 1
}
. Setting to DenseTensor.
WARNING:tensorflow:From /tmpfs/src/tf_docs_env/lib/python3.6/site-packages/tensorflow_transform/tf_utils.py:266: Tensor.experimental_ref (from tensorflow.python.framework.ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use ref() instead.
WARNING:tensorflow:From /tmpfs/src/tf_docs_env/lib/python3.6/site-packages/tensorflow_transform/tf_utils.py:266: Tensor.experimental_ref (from tensorflow.python.framework.ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use ref() instead.
WARNING:absl:Not using the in-place Transform because the following features require analyzing: ('body_mass_g', 'culmen_depth_mm', 'culmen_length_mm', 'flipper_length_mm')
INFO:absl:Feature body_mass_g has a shape dim {
  size: 1
}
. Setting to DenseTensor.
INFO:absl:Feature culmen_depth_mm has a shape dim {
  size: 1
}
. Setting to DenseTensor.
INFO:absl:Feature culmen_length_mm has a shape dim {
  size: 1
}
. Setting to DenseTensor.
INFO:absl:Feature flipper_length_mm has a shape dim {
  size: 1
}
. Setting to DenseTensor.
INFO:absl:Feature island has a shape dim {
  size: 1
}
. Setting to DenseTensor.
INFO:absl:Feature sex has a shape dim {
  size: 1
}
. Setting to DenseTensor.
INFO:absl:Feature species has a shape dim {
  size: 1
}
. Setting to DenseTensor.
INFO:absl:Feature body_mass_g has a shape dim {
  size: 1
}
. Setting to DenseTensor.
INFO:absl:Feature culmen_depth_mm has a shape dim {
  size: 1
}
. Setting to DenseTensor.
INFO:absl:Feature culmen_length_mm has a shape dim {
  size: 1
}
. Setting to DenseTensor.
INFO:absl:Feature flipper_length_mm has a shape dim {
  size: 1
}
. Setting to DenseTensor.
INFO:absl:Feature island has a shape dim {
  size: 1
}
. Setting to DenseTensor.
INFO:absl:Feature sex has a shape dim {
  size: 1
}
. Setting to DenseTensor.
INFO:absl:Feature species has a shape dim {
  size: 1
}
. Setting to DenseTensor.
INFO:absl:Feature body_mass_g has a shape dim {
  size: 1
}
. Setting to DenseTensor.
INFO:absl:Feature culmen_depth_mm has a shape dim {
  size: 1
}
. Setting to DenseTensor.
INFO:absl:Feature culmen_length_mm has a shape dim {
  size: 1
}
. Setting to DenseTensor.
INFO:absl:Feature flipper_length_mm has a shape dim {
  size: 1
}
. Setting to DenseTensor.
INFO:absl:Feature island has a shape dim {
  size: 1
}
. Setting to DenseTensor.
INFO:absl:Feature sex has a shape dim {
  size: 1
}
. Setting to DenseTensor.
INFO:absl:Feature species has a shape dim {
  size: 1
}
. Setting to DenseTensor.
WARNING:root:This output type hint will be ignored and not used for type-checking purposes. Typically, output type hints for a PTransform are single (or nested) types wrapped by a PCollection, PDone, or None. Got: Tuple[Dict[str, Union[NoneType, _Dataset]], Union[Dict[str, Dict[str, PCollection]], NoneType]] instead.
WARNING:root:This output type hint will be ignored and not used for type-checking purposes. Typically, output type hints for a PTransform are single (or nested) types wrapped by a PCollection, PDone, or None. Got: Tuple[Dict[str, Union[NoneType, _Dataset]], Union[Dict[str, Dict[str, PCollection]], NoneType]] instead.
WARNING:tensorflow:Tensorflow version (2.4.1) found. Note that Tensorflow Transform support for TF 2.0 is currently in beta, and features such as tf.function may not work as intended.
WARNING:tensorflow:Tensorflow version (2.4.1) found. Note that Tensorflow Transform support for TF 2.0 is currently in beta, and features such as tf.function may not work as intended.
WARNING:tensorflow:From /tmpfs/src/tf_docs_env/lib/python3.6/site-packages/tensorflow/python/saved_model/signature_def_utils_impl.py:201: build_tensor_info (from tensorflow.python.saved_model.utils_impl) is deprecated and will be removed in a future version.
Instructions for updating:
This function will only be available through the v1 compatibility library as tf.compat.v1.saved_model.utils.build_tensor_info or tf.compat.v1.saved_model.build_tensor_info.
WARNING:tensorflow:From /tmpfs/src/tf_docs_env/lib/python3.6/site-packages/tensorflow/python/saved_model/signature_def_utils_impl.py:201: build_tensor_info (from tensorflow.python.saved_model.utils_impl) is deprecated and will be removed in a future version.
Instructions for updating:
This function will only be available through the v1 compatibility library as tf.compat.v1.saved_model.utils.build_tensor_info or tf.compat.v1.saved_model.build_tensor_info.
INFO:tensorflow:Assets added to graph.
INFO:tensorflow:Assets added to graph.
INFO:tensorflow:No assets to write.
INFO:tensorflow:No assets to write.
WARNING:tensorflow:Issue encountered when serializing tft_mapper_use.
Type is unsupported, or the types of the items don't match field type in CollectionDef. Note this is a warning and probably safe to ignore.
'Counter' object has no attribute 'name'
WARNING:tensorflow:Issue encountered when serializing tft_mapper_use.
Type is unsupported, or the types of the items don't match field type in CollectionDef. Note this is a warning and probably safe to ignore.
'Counter' object has no attribute 'name'
INFO:tensorflow:SavedModel written to: pipelines/penguin-transform/Transform/transform_graph/4/.temp_path/tftransform_tmp/24431b25f08249f1a4eea541cb0d444d/saved_model.pb
INFO:tensorflow:SavedModel written to: pipelines/penguin-transform/Transform/transform_graph/4/.temp_path/tftransform_tmp/24431b25f08249f1a4eea541cb0d444d/saved_model.pb
INFO:tensorflow:Assets added to graph.
INFO:tensorflow:Assets added to graph.
INFO:tensorflow:No assets to write.
INFO:tensorflow:No assets to write.
WARNING:tensorflow:Issue encountered when serializing tft_mapper_use.
Type is unsupported, or the types of the items don't match field type in CollectionDef. Note this is a warning and probably safe to ignore.
'Counter' object has no attribute 'name'
WARNING:tensorflow:Issue encountered when serializing tft_mapper_use.
Type is unsupported, or the types of the items don't match field type in CollectionDef. Note this is a warning and probably safe to ignore.
'Counter' object has no attribute 'name'
INFO:tensorflow:SavedModel written to: pipelines/penguin-transform/Transform/transform_graph/4/.temp_path/tftransform_tmp/18dd5e0a80444e9ca3095e0760e14b6d/saved_model.pb
INFO:tensorflow:SavedModel written to: pipelines/penguin-transform/Transform/transform_graph/4/.temp_path/tftransform_tmp/18dd5e0a80444e9ca3095e0760e14b6d/saved_model.pb
WARNING:apache_beam.options.pipeline_options:Discarding unparseable args: ['-f', '/tmp/tmpdlzuirfc.json', '--HistoryManager.hist_file=:memory:']
INFO:tensorflow:Saver not created because there are no variables in the graph to restore
INFO:tensorflow:Saver not created because there are no variables in the graph to restore
INFO:tensorflow:Saver not created because there are no variables in the graph to restore
INFO:tensorflow:Saver not created because there are no variables in the graph to restore
INFO:tensorflow:Assets added to graph.
INFO:tensorflow:Assets added to graph.
INFO:tensorflow:No assets to write.
INFO:tensorflow:No assets to write.
INFO:tensorflow:SavedModel written to: pipelines/penguin-transform/Transform/transform_graph/4/.temp_path/tftransform_tmp/ccbe8bbd5ff3478586f7a72150e18f21/saved_model.pb
INFO:tensorflow:SavedModel written to: pipelines/penguin-transform/Transform/transform_graph/4/.temp_path/tftransform_tmp/ccbe8bbd5ff3478586f7a72150e18f21/saved_model.pb
INFO:tensorflow:Saver not created because there are no variables in the graph to restore
INFO:tensorflow:Saver not created because there are no variables in the graph to restore
INFO:absl:Cleaning up stateless execution info.
INFO:absl:Execution 4 succeeded.
INFO:absl:Cleaning up stateful execution info.
INFO:absl:Publishing output artifacts defaultdict(<class 'list'>, {'updated_analyzer_cache': [Artifact(artifact: uri: "pipelines/penguin-transform/Transform/updated_analyzer_cache/4"
custom_properties {
  key: "name"
  value {
    string_value: "penguin-transform:2021-04-30T09:19:53.944707:Transform:updated_analyzer_cache:0"
  }
}
custom_properties {
  key: "tfx_version"
  value {
    string_value: "0.29.0"
  }
}
, artifact_type: name: "TransformCache"
)], 'transform_graph': [Artifact(artifact: uri: "pipelines/penguin-transform/Transform/transform_graph/4"
custom_properties {
  key: "name"
  value {
    string_value: "penguin-transform:2021-04-30T09:19:53.944707:Transform:transform_graph:0"
  }
}
custom_properties {
  key: "tfx_version"
  value {
    string_value: "0.29.0"
  }
}
, artifact_type: name: "TransformGraph"
)]}) for execution 4
INFO:absl:MetadataStore with DB connection initialized
INFO:absl:Component Transform is finished.
INFO:absl:Component ExampleValidator is running.
INFO:absl:Running launcher for node_info {
  type {
    name: "tfx.components.example_validator.component.ExampleValidator"
  }
  id: "ExampleValidator"
}
contexts {
  contexts {
    type {
      name: "pipeline"
    }
    name {
      field_value {
        string_value: "penguin-transform"
      }
    }
  }
  contexts {
    type {
      name: "pipeline_run"
    }
    name {
      field_value {
        string_value: "2021-04-30T09:19:53.944707"
      }
    }
  }
  contexts {
    type {
      name: "node"
    }
    name {
      field_value {
        string_value: "penguin-transform.ExampleValidator"
      }
    }
  }
}
inputs {
  inputs {
    key: "schema"
    value {
      channels {
        producer_node_query {
          id: "Importer.import_schema"
        }
        context_queries {
          type {
            name: "pipeline"
          }
          name {
            field_value {
              string_value: "penguin-transform"
            }
          }
        }
        context_queries {
          type {
            name: "pipeline_run"
          }
          name {
            field_value {
              string_value: "2021-04-30T09:19:53.944707"
            }
          }
        }
        context_queries {
          type {
            name: "node"
          }
          name {
            field_value {
              string_value: "penguin-transform.Importer.import_schema"
            }
          }
        }
        artifact_query {
          type {
            name: "Schema"
          }
        }
        output_key: "result"
      }
    }
  }
  inputs {
    key: "statistics"
    value {
      channels {
        producer_node_query {
          id: "StatisticsGen"
        }
        context_queries {
          type {
            name: "pipeline"
          }
          name {
            field_value {
              string_value: "penguin-transform"
            }
          }
        }
        context_queries {
          type {
            name: "pipeline_run"
          }
          name {
            field_value {
              string_value: "2021-04-30T09:19:53.944707"
            }
          }
        }
        context_queries {
          type {
            name: "node"
          }
          name {
            field_value {
              string_value: "penguin-transform.StatisticsGen"
            }
          }
        }
        artifact_query {
          type {
            name: "ExampleStatistics"
          }
        }
        output_key: "statistics"
      }
    }
  }
}
outputs {
  outputs {
    key: "anomalies"
    value {
      artifact_spec {
        type {
          name: "ExampleAnomalies"
          properties {
            key: "span"
            value: INT
          }
          properties {
            key: "split_names"
            value: STRING
          }
        }
      }
    }
  }
}
parameters {
  parameters {
    key: "exclude_splits"
    value {
      field_value {
        string_value: "[]"
      }
    }
  }
}
upstream_nodes: "Importer.import_schema"
upstream_nodes: "StatisticsGen"
execution_options {
  caching_options {
  }
}

INFO:absl:MetadataStore with DB connection initialized
INFO:absl:MetadataStore with DB connection initialized
INFO:absl:Going to run a new execution 5
INFO:absl:Going to run a new execution: ExecutionInfo(execution_id=5, input_dict={'statistics': [Artifact(artifact: id: 3
type_id: 10
uri: "pipelines/penguin-transform/StatisticsGen/statistics/3"
properties {
  key: "split_names"
  value {
    string_value: "[\"train\", \"eval\"]"
  }
}
custom_properties {
  key: "name"
  value {
    string_value: "penguin-transform:2021-04-30T09:19:53.944707:StatisticsGen:statistics:0"
  }
}
custom_properties {
  key: "tfx_version"
  value {
    string_value: "0.29.0"
  }
}
state: LIVE
create_time_since_epoch: 1619774402214
last_update_time_since_epoch: 1619774402214
, artifact_type: id: 10
name: "ExampleStatistics"
properties {
  key: "span"
  value: INT
}
properties {
  key: "split_names"
  value: STRING
}
)], 'schema': [Artifact(artifact: id: 2
type_id: 8
uri: "schema"
state: LIVE
create_time_since_epoch: 1619774399033
last_update_time_since_epoch: 1619774399033
, artifact_type: id: 8
name: "Schema"
)]}, output_dict=defaultdict(<class 'list'>, {'anomalies': [Artifact(artifact: uri: "pipelines/penguin-transform/ExampleValidator/anomalies/5"
custom_properties {
  key: "name"
  value {
    string_value: "penguin-transform:2021-04-30T09:19:53.944707:ExampleValidator:anomalies:0"
  }
}
, artifact_type: name: "ExampleAnomalies"
properties {
  key: "span"
  value: INT
}
properties {
  key: "split_names"
  value: STRING
}
)]}), exec_properties={'exclude_splits': '[]'}, execution_output_uri='pipelines/penguin-transform/ExampleValidator/.system/executor_execution/5/executor_output.pb', stateful_working_dir='pipelines/penguin-transform/ExampleValidator/.system/stateful_working_dir/2021-04-30T09:19:53.944707', tmp_dir='pipelines/penguin-transform/ExampleValidator/.system/executor_execution/5/.temp/', pipeline_node=node_info {
  type {
    name: "tfx.components.example_validator.component.ExampleValidator"
  }
  id: "ExampleValidator"
}
contexts {
  contexts {
    type {
      name: "pipeline"
    }
    name {
      field_value {
        string_value: "penguin-transform"
      }
    }
  }
  contexts {
    type {
      name: "pipeline_run"
    }
    name {
      field_value {
        string_value: "2021-04-30T09:19:53.944707"
      }
    }
  }
  contexts {
    type {
      name: "node"
    }
    name {
      field_value {
        string_value: "penguin-transform.ExampleValidator"
      }
    }
  }
}
inputs {
  inputs {
    key: "schema"
    value {
      channels {
        producer_node_query {
          id: "Importer.import_schema"
        }
        context_queries {
          type {
            name: "pipeline"
          }
          name {
            field_value {
              string_value: "penguin-transform"
            }
          }
        }
        context_queries {
          type {
            name: "pipeline_run"
          }
          name {
            field_value {
              string_value: "2021-04-30T09:19:53.944707"
            }
          }
        }
        context_queries {
          type {
            name: "node"
          }
          name {
            field_value {
              string_value: "penguin-transform.Importer.import_schema"
            }
          }
        }
        artifact_query {
          type {
            name: "Schema"
          }
        }
        output_key: "result"
      }
    }
  }
  inputs {
    key: "statistics"
    value {
      channels {
        producer_node_query {
          id: "StatisticsGen"
        }
        context_queries {
          type {
            name: "pipeline"
          }
          name {
            field_value {
              string_value: "penguin-transform"
            }
          }
        }
        context_queries {
          type {
            name: "pipeline_run"
          }
          name {
            field_value {
              string_value: "2021-04-30T09:19:53.944707"
            }
          }
        }
        context_queries {
          type {
            name: "node"
          }
          name {
            field_value {
              string_value: "penguin-transform.StatisticsGen"
            }
          }
        }
        artifact_query {
          type {
            name: "ExampleStatistics"
          }
        }
        output_key: "statistics"
      }
    }
  }
}
outputs {
  outputs {
    key: "anomalies"
    value {
      artifact_spec {
        type {
          name: "ExampleAnomalies"
          properties {
            key: "span"
            value: INT
          }
          properties {
            key: "split_names"
            value: STRING
          }
        }
      }
    }
  }
}
parameters {
  parameters {
    key: "exclude_splits"
    value {
      field_value {
        string_value: "[]"
      }
    }
  }
}
upstream_nodes: "Importer.import_schema"
upstream_nodes: "StatisticsGen"
execution_options {
  caching_options {
  }
}
, pipeline_info=id: "penguin-transform"
, pipeline_run_id='2021-04-30T09:19:53.944707')
WARNING:apache_beam.options.pipeline_options:Discarding unparseable args: ['-f', '/tmp/tmpdlzuirfc.json', '--HistoryManager.hist_file=:memory:']
INFO:absl:Attempting to infer TFX Python dependency for beam
INFO:absl:Copying all content from install dir /tmpfs/src/tf_docs_env/lib/python3.6/site-packages/tfx to temp dir /tmp/tmpapy2aryk/build/tfx
INFO:absl:Generating a temp setup file at /tmp/tmpapy2aryk/build/tfx/setup.py
INFO:absl:Creating temporary sdist package, logs available at /tmp/tmpapy2aryk/build/tfx/setup.log
INFO:absl:Added --extra_package=/tmp/tmpapy2aryk/build/tfx/dist/tfx_ephemeral-0.29.0.tar.gz to beam args
INFO:absl:Validating schema against the computed statistics for split train.
INFO:absl:Validation complete for split train. Anomalies written to pipelines/penguin-transform/ExampleValidator/anomalies/5/Split-train.
INFO:absl:Validating schema against the computed statistics for split eval.
INFO:absl:Validation complete for split eval. Anomalies written to pipelines/penguin-transform/ExampleValidator/anomalies/5/Split-eval.
INFO:absl:Cleaning up stateless execution info.
INFO:absl:Execution 5 succeeded.
INFO:absl:Cleaning up stateful execution info.
INFO:absl:Publishing output artifacts defaultdict(<class 'list'>, {'anomalies': [Artifact(artifact: uri: "pipelines/penguin-transform/ExampleValidator/anomalies/5"
custom_properties {
  key: "name"
  value {
    string_value: "penguin-transform:2021-04-30T09:19:53.944707:ExampleValidator:anomalies:0"
  }
}
custom_properties {
  key: "tfx_version"
  value {
    string_value: "0.29.0"
  }
}
, artifact_type: name: "ExampleAnomalies"
properties {
  key: "span"
  value: INT
}
properties {
  key: "split_names"
  value: STRING
}
)]}) for execution 5
INFO:absl:MetadataStore with DB connection initialized
INFO:absl:Component ExampleValidator is finished.
INFO:absl:Component Trainer is running.
INFO:absl:Running launcher for node_info {
  type {
    name: "tfx.components.trainer.component.Trainer"
  }
  id: "Trainer"
}
contexts {
  contexts {
    type {
      name: "pipeline"
    }
    name {
      field_value {
        string_value: "penguin-transform"
      }
    }
  }
  contexts {
    type {
      name: "pipeline_run"
    }
    name {
      field_value {
        string_value: "2021-04-30T09:19:53.944707"
      }
    }
  }
  contexts {
    type {
      name: "node"
    }
    name {
      field_value {
        string_value: "penguin-transform.Trainer"
      }
    }
  }
}
inputs {
  inputs {
    key: "examples"
    value {
      channels {
        producer_node_query {
          id: "CsvExampleGen"
        }
        context_queries {
          type {
            name: "pipeline"
          }
          name {
            field_value {
              string_value: "penguin-transform"
            }
          }
        }
        context_queries {
          type {
            name: "pipeline_run"
          }
          name {
            field_value {
              string_value: "2021-04-30T09:19:53.944707"
            }
          }
        }
        context_queries {
          type {
            name: "node"
          }
          name {
            field_value {
              string_value: "penguin-transform.CsvExampleGen"
            }
          }
        }
        artifact_query {
          type {
            name: "Examples"
          }
        }
        output_key: "examples"
      }
    }
  }
  inputs {
    key: "transform_graph"
    value {
      channels {
        producer_node_query {
          id: "Transform"
        }
        context_queries {
          type {
            name: "pipeline"
          }
          name {
            field_value {
              string_value: "penguin-transform"
            }
          }
        }
        context_queries {
          type {
            name: "pipeline_run"
          }
          name {
            field_value {
              string_value: "2021-04-30T09:19:53.944707"
            }
          }
        }
        context_queries {
          type {
            name: "node"
          }
          name {
            field_value {
              string_value: "penguin-transform.Transform"
            }
          }
        }
        artifact_query {
          type {
            name: "TransformGraph"
          }
        }
        output_key: "transform_graph"
      }
    }
  }
}
outputs {
  outputs {
    key: "model"
    value {
      artifact_spec {
        type {
          name: "Model"
        }
      }
    }
  }
  outputs {
    key: "model_run"
    value {
      artifact_spec {
        type {
          name: "ModelRun"
        }
      }
    }
  }
}
parameters {
  parameters {
    key: "custom_config"
    value {
      field_value {
        string_value: "null"
      }
    }
  }
  parameters {
    key: "eval_args"
    value {
      field_value {
        string_value: "{\n  \"num_steps\": 5\n}"
      }
    }
  }
  parameters {
    key: "module_file"
    value {
      field_value {
        string_value: "penguin_utils.py"
      }
    }
  }
  parameters {
    key: "train_args"
    value {
      field_value {
        string_value: "{\n  \"num_steps\": 100\n}"
      }
    }
  }
}
upstream_nodes: "CsvExampleGen"
upstream_nodes: "Transform"
downstream_nodes: "Pusher"
execution_options {
  caching_options {
  }
}

INFO:absl:MetadataStore with DB connection initialized
INFO:absl:MetadataStore with DB connection initialized
INFO:absl:Going to run a new execution 6
INFO:absl:Going to run a new execution: ExecutionInfo(execution_id=6, input_dict={'examples': [Artifact(artifact: id: 1
type_id: 6
uri: "pipelines/penguin-transform/CsvExampleGen/examples/1"
properties {
  key: "split_names"
  value {
    string_value: "[\"train\", \"eval\"]"
  }
}
custom_properties {
  key: "input_fingerprint"
  value {
    string_value: "split:single_split,num_files:1,total_bytes:13161,xor_checksum:1619774391,sum_checksum:1619774391"
  }
}
custom_properties {
  key: "name"
  value {
    string_value: "penguin-transform:2021-04-30T09:19:53.944707:CsvExampleGen:examples:0"
  }
}
custom_properties {
  key: "payload_format"
  value {
    string_value: "FORMAT_TF_EXAMPLE"
  }
}
custom_properties {
  key: "span"
  value {
    string_value: "0"
  }
}
custom_properties {
  key: "tfx_version"
  value {
    string_value: "0.29.0"
  }
}
state: LIVE
create_time_since_epoch: 1619774399009
last_update_time_since_epoch: 1619774399009
, artifact_type: id: 6
name: "Examples"
properties {
  key: "span"
  value: INT
}
properties {
  key: "split_names"
  value: STRING
}
properties {
  key: "version"
  value: INT
}
)], 'transform_graph': [Artifact(artifact: id: 5
type_id: 13
uri: "pipelines/penguin-transform/Transform/transform_graph/4"
custom_properties {
  key: "name"
  value {
    string_value: "penguin-transform:2021-04-30T09:19:53.944707:Transform:transform_graph:0"
  }
}
custom_properties {
  key: "tfx_version"
  value {
    string_value: "0.29.0"
  }
}
state: LIVE
create_time_since_epoch: 1619774408460
last_update_time_since_epoch: 1619774408460
, artifact_type: id: 13
name: "TransformGraph"
)]}, output_dict=defaultdict(<class 'list'>, {'model_run': [Artifact(artifact: uri: "pipelines/penguin-transform/Trainer/model_run/6"
custom_properties {
  key: "name"
  value {
    string_value: "penguin-transform:2021-04-30T09:19:53.944707:Trainer:model_run:0"
  }
}
, artifact_type: name: "ModelRun"
)], 'model': [Artifact(artifact: uri: "pipelines/penguin-transform/Trainer/model/6"
custom_properties {
  key: "name"
  value {
    string_value: "penguin-transform:2021-04-30T09:19:53.944707:Trainer:model:0"
  }
}
, artifact_type: name: "Model"
)]}), exec_properties={'train_args': '{\n  "num_steps": 100\n}', 'eval_args': '{\n  "num_steps": 5\n}', 'module_file': 'penguin_utils.py', 'custom_config': 'null'}, execution_output_uri='pipelines/penguin-transform/Trainer/.system/executor_execution/6/executor_output.pb', stateful_working_dir='pipelines/penguin-transform/Trainer/.system/stateful_working_dir/2021-04-30T09:19:53.944707', tmp_dir='pipelines/penguin-transform/Trainer/.system/executor_execution/6/.temp/', pipeline_node=node_info {
  type {
    name: "tfx.components.trainer.component.Trainer"
  }
  id: "Trainer"
}
contexts {
  contexts {
    type {
      name: "pipeline"
    }
    name {
      field_value {
        string_value: "penguin-transform"
      }
    }
  }
  contexts {
    type {
      name: "pipeline_run"
    }
    name {
      field_value {
        string_value: "2021-04-30T09:19:53.944707"
      }
    }
  }
  contexts {
    type {
      name: "node"
    }
    name {
      field_value {
        string_value: "penguin-transform.Trainer"
      }
    }
  }
}
inputs {
  inputs {
    key: "examples"
    value {
      channels {
        producer_node_query {
          id: "CsvExampleGen"
        }
        context_queries {
          type {
            name: "pipeline"
          }
          name {
            field_value {
              string_value: "penguin-transform"
            }
          }
        }
        context_queries {
          type {
            name: "pipeline_run"
          }
          name {
            field_value {
              string_value: "2021-04-30T09:19:53.944707"
            }
          }
        }
        context_queries {
          type {
            name: "node"
          }
          name {
            field_value {
              string_value: "penguin-transform.CsvExampleGen"
            }
          }
        }
        artifact_query {
          type {
            name: "Examples"
          }
        }
        output_key: "examples"
      }
    }
  }
  inputs {
    key: "transform_graph"
    value {
      channels {
        producer_node_query {
          id: "Transform"
        }
        context_queries {
          type {
            name: "pipeline"
          }
          name {
            field_value {
              string_value: "penguin-transform"
            }
          }
        }
        context_queries {
          type {
            name: "pipeline_run"
          }
          name {
            field_value {
              string_value: "2021-04-30T09:19:53.944707"
            }
          }
        }
        context_queries {
          type {
            name: "node"
          }
          name {
            field_value {
              string_value: "penguin-transform.Transform"
            }
          }
        }
        artifact_query {
          type {
            name: "TransformGraph"
          }
        }
        output_key: "transform_graph"
      }
    }
  }
}
outputs {
  outputs {
    key: "model"
    value {
      artifact_spec {
        type {
          name: "Model"
        }
      }
    }
  }
  outputs {
    key: "model_run"
    value {
      artifact_spec {
        type {
          name: "ModelRun"
        }
      }
    }
  }
}
parameters {
  parameters {
    key: "custom_config"
    value {
      field_value {
        string_value: "null"
      }
    }
  }
  parameters {
    key: "eval_args"
    value {
      field_value {
        string_value: "{\n  \"num_steps\": 5\n}"
      }
    }
  }
  parameters {
    key: "module_file"
    value {
      field_value {
        string_value: "penguin_utils.py"
      }
    }
  }
  parameters {
    key: "train_args"
    value {
      field_value {
        string_value: "{\n  \"num_steps\": 100\n}"
      }
    }
  }
}
upstream_nodes: "CsvExampleGen"
upstream_nodes: "Transform"
downstream_nodes: "Pusher"
execution_options {
  caching_options {
  }
}
, pipeline_info=id: "penguin-transform"
, pipeline_run_id='2021-04-30T09:19:53.944707')
WARNING:apache_beam.options.pipeline_options:Discarding unparseable args: ['-f', '/tmp/tmpdlzuirfc.json', '--HistoryManager.hist_file=:memory:']
INFO:absl:Attempting to infer TFX Python dependency for beam
INFO:absl:Copying all content from install dir /tmpfs/src/tf_docs_env/lib/python3.6/site-packages/tfx to temp dir /tmp/tmp6yy7b02e/build/tfx
INFO:absl:Generating a temp setup file at /tmp/tmp6yy7b02e/build/tfx/setup.py
INFO:absl:Creating temporary sdist package, logs available at /tmp/tmp6yy7b02e/build/tfx/setup.log
INFO:absl:Added --extra_package=/tmp/tmp6yy7b02e/build/tfx/dist/tfx_ephemeral-0.29.0.tar.gz to beam args
INFO:absl:Train on the 'train' split when train_args.splits is not set.
INFO:absl:Evaluate on the 'eval' split when eval_args.splits is not set.
INFO:absl:penguin_utils.py is already loaded, reloading
INFO:absl:Training model.
INFO:absl:Feature body_mass_g has a shape dim {
  size: 1
}
. Setting to DenseTensor.
INFO:absl:Feature culmen_depth_mm has a shape dim {
  size: 1
}
. Setting to DenseTensor.
INFO:absl:Feature culmen_length_mm has a shape dim {
  size: 1
}
. Setting to DenseTensor.
INFO:absl:Feature flipper_length_mm has a shape dim {
  size: 1
}
. Setting to DenseTensor.
INFO:absl:Feature island has a shape dim {
  size: 1
}
. Setting to DenseTensor.
INFO:absl:Feature sex has a shape dim {
  size: 1
}
. Setting to DenseTensor.
INFO:absl:Feature species has a shape dim {
  size: 1
}
. Setting to DenseTensor.
INFO:tensorflow:Saver not created because there are no variables in the graph to restore
INFO:tensorflow:Saver not created because there are no variables in the graph to restore
INFO:absl:Feature body_mass_g has a shape dim {
  size: 1
}
. Setting to DenseTensor.
INFO:absl:Feature culmen_depth_mm has a shape dim {
  size: 1
}
. Setting to DenseTensor.
INFO:absl:Feature culmen_length_mm has a shape dim {
  size: 1
}
. Setting to DenseTensor.
INFO:absl:Feature flipper_length_mm has a shape dim {
  size: 1
}
. Setting to DenseTensor.
INFO:absl:Feature island has a shape dim {
  size: 1
}
. Setting to DenseTensor.
INFO:absl:Feature sex has a shape dim {
  size: 1
}
. Setting to DenseTensor.
INFO:absl:Feature species has a shape dim {
  size: 1
}
. Setting to DenseTensor.
INFO:absl:Model: "model"
INFO:absl:__________________________________________________________________________________________________
INFO:absl:Layer (type)                    Output Shape         Param #     Connected to                     
INFO:absl:==================================================================================================
INFO:absl:culmen_length_mm_xf (InputLayer [(None, 1)]          0                                            
INFO:absl:__________________________________________________________________________________________________
INFO:absl:culmen_depth_mm_xf (InputLayer) [(None, 1)]          0                                            
INFO:absl:__________________________________________________________________________________________________
INFO:absl:flipper_length_mm_xf (InputLaye [(None, 1)]          0                                            
INFO:absl:__________________________________________________________________________________________________
INFO:absl:body_mass_g_xf (InputLayer)     [(None, 1)]          0                                            
INFO:absl:__________________________________________________________________________________________________
INFO:absl:concatenate (Concatenate)       (None, 4)            0           culmen_length_mm_xf[0][0]        
INFO:absl:                                                                 culmen_depth_mm_xf[0][0]         
INFO:absl:                                                                 flipper_length_mm_xf[0][0]       
INFO:absl:                                                                 body_mass_g_xf[0][0]             
INFO:absl:__________________________________________________________________________________________________
INFO:absl:dense (Dense)                   (None, 8)            40          concatenate[0][0]                
INFO:absl:__________________________________________________________________________________________________
INFO:absl:dense_1 (Dense)                 (None, 8)            72          dense[0][0]                      
INFO:absl:__________________________________________________________________________________________________
INFO:absl:dense_2 (Dense)                 (None, 3)            27          dense_1[0][0]                    
INFO:absl:==================================================================================================
INFO:absl:Total params: 139
INFO:absl:Trainable params: 139
INFO:absl:Non-trainable params: 0
INFO:absl:__________________________________________________________________________________________________
100/100 [==============================] - 1s 7ms/step - loss: 0.6757 - sparse_categorical_accuracy: 0.6825 - val_loss: 0.0270 - val_sparse_categorical_accuracy: 1.0000
INFO:tensorflow:Assets written to: pipelines/penguin-transform/Trainer/model/6/Format-Serving/assets
INFO:tensorflow:Assets written to: pipelines/penguin-transform/Trainer/model/6/Format-Serving/assets
INFO:absl:Training complete. Model written to pipelines/penguin-transform/Trainer/model/6/Format-Serving. ModelRun written to pipelines/penguin-transform/Trainer/model_run/6
INFO:absl:Cleaning up stateless execution info.
INFO:absl:Execution 6 succeeded.
INFO:absl:Cleaning up stateful execution info.
INFO:absl:Publishing output artifacts defaultdict(<class 'list'>, {'model_run': [Artifact(artifact: uri: "pipelines/penguin-transform/Trainer/model_run/6"
custom_properties {
  key: "name"
  value {
    string_value: "penguin-transform:2021-04-30T09:19:53.944707:Trainer:model_run:0"
  }
}
custom_properties {
  key: "tfx_version"
  value {
    string_value: "0.29.0"
  }
}
, artifact_type: name: "ModelRun"
)], 'model': [Artifact(artifact: uri: "pipelines/penguin-transform/Trainer/model/6"
custom_properties {
  key: "name"
  value {
    string_value: "penguin-transform:2021-04-30T09:19:53.944707:Trainer:model:0"
  }
}
custom_properties {
  key: "tfx_version"
  value {
    string_value: "0.29.0"
  }
}
, artifact_type: name: "Model"
)]}) for execution 6
INFO:absl:MetadataStore with DB connection initialized
INFO:absl:Component Trainer is finished.
INFO:absl:Component Pusher is running.
INFO:absl:Running launcher for node_info {
  type {
    name: "tfx.components.pusher.component.Pusher"
  }
  id: "Pusher"
}
contexts {
  contexts {
    type {
      name: "pipeline"
    }
    name {
      field_value {
        string_value: "penguin-transform"
      }
    }
  }
  contexts {
    type {
      name: "pipeline_run"
    }
    name {
      field_value {
        string_value: "2021-04-30T09:19:53.944707"
      }
    }
  }
  contexts {
    type {
      name: "node"
    }
    name {
      field_value {
        string_value: "penguin-transform.Pusher"
      }
    }
  }
}
inputs {
  inputs {
    key: "model"
    value {
      channels {
        producer_node_query {
          id: "Trainer"
        }
        context_queries {
          type {
            name: "pipeline"
          }
          name {
            field_value {
              string_value: "penguin-transform"
            }
          }
        }
        context_queries {
          type {
            name: "pipeline_run"
          }
          name {
            field_value {
              string_value: "2021-04-30T09:19:53.944707"
            }
          }
        }
        context_queries {
          type {
            name: "node"
          }
          name {
            field_value {
              string_value: "penguin-transform.Trainer"
            }
          }
        }
        artifact_query {
          type {
            name: "Model"
          }
        }
        output_key: "model"
      }
    }
  }
}
outputs {
  outputs {
    key: "pushed_model"
    value {
      artifact_spec {
        type {
          name: "PushedModel"
        }
      }
    }
  }
}
parameters {
  parameters {
    key: "custom_config"
    value {
      field_value {
        string_value: "null"
      }
    }
  }
  parameters {
    key: "push_destination"
    value {
      field_value {
        string_value: "{\n  \"filesystem\": {\n    \"base_directory\": \"serving_model/penguin-transform\"\n  }\n}"
      }
    }
  }
}
upstream_nodes: "Trainer"
execution_options {
  caching_options {
  }
}

INFO:absl:MetadataStore with DB connection initialized
INFO:absl:MetadataStore with DB connection initialized
INFO:absl:Going to run a new execution 7
INFO:absl:Going to run a new execution: ExecutionInfo(execution_id=7, input_dict={'model': [Artifact(artifact: id: 8
type_id: 18
uri: "pipelines/penguin-transform/Trainer/model/6"
custom_properties {
  key: "name"
  value {
    string_value: "penguin-transform:2021-04-30T09:19:53.944707:Trainer:model:0"
  }
}
custom_properties {
  key: "tfx_version"
  value {
    string_value: "0.29.0"
  }
}
state: LIVE
create_time_since_epoch: 1619774415072
last_update_time_since_epoch: 1619774415072
, artifact_type: id: 18
name: "Model"
)]}, output_dict=defaultdict(<class 'list'>, {'pushed_model': [Artifact(artifact: uri: "pipelines/penguin-transform/Pusher/pushed_model/7"
custom_properties {
  key: "name"
  value {
    string_value: "penguin-transform:2021-04-30T09:19:53.944707:Pusher:pushed_model:0"
  }
}
, artifact_type: name: "PushedModel"
)]}), exec_properties={'push_destination': '{\n  "filesystem": {\n    "base_directory": "serving_model/penguin-transform"\n  }\n}', 'custom_config': 'null'}, execution_output_uri='pipelines/penguin-transform/Pusher/.system/executor_execution/7/executor_output.pb', stateful_working_dir='pipelines/penguin-transform/Pusher/.system/stateful_working_dir/2021-04-30T09:19:53.944707', tmp_dir='pipelines/penguin-transform/Pusher/.system/executor_execution/7/.temp/', pipeline_node=node_info {
  type {
    name: "tfx.components.pusher.component.Pusher"
  }
  id: "Pusher"
}
contexts {
  contexts {
    type {
      name: "pipeline"
    }
    name {
      field_value {
        string_value: "penguin-transform"
      }
    }
  }
  contexts {
    type {
      name: "pipeline_run"
    }
    name {
      field_value {
        string_value: "2021-04-30T09:19:53.944707"
      }
    }
  }
  contexts {
    type {
      name: "node"
    }
    name {
      field_value {
        string_value: "penguin-transform.Pusher"
      }
    }
  }
}
inputs {
  inputs {
    key: "model"
    value {
      channels {
        producer_node_query {
          id: "Trainer"
        }
        context_queries {
          type {
            name: "pipeline"
          }
          name {
            field_value {
              string_value: "penguin-transform"
            }
          }
        }
        context_queries {
          type {
            name: "pipeline_run"
          }
          name {
            field_value {
              string_value: "2021-04-30T09:19:53.944707"
            }
          }
        }
        context_queries {
          type {
            name: "node"
          }
          name {
            field_value {
              string_value: "penguin-transform.Trainer"
            }
          }
        }
        artifact_query {
          type {
            name: "Model"
          }
        }
        output_key: "model"
      }
    }
  }
}
outputs {
  outputs {
    key: "pushed_model"
    value {
      artifact_spec {
        type {
          name: "PushedModel"
        }
      }
    }
  }
}
parameters {
  parameters {
    key: "custom_config"
    value {
      field_value {
        string_value: "null"
      }
    }
  }
  parameters {
    key: "push_destination"
    value {
      field_value {
        string_value: "{\n  \"filesystem\": {\n    \"base_directory\": \"serving_model/penguin-transform\"\n  }\n}"
      }
    }
  }
}
upstream_nodes: "Trainer"
execution_options {
  caching_options {
  }
}
, pipeline_info=id: "penguin-transform"
, pipeline_run_id='2021-04-30T09:19:53.944707')
WARNING:apache_beam.options.pipeline_options:Discarding unparseable args: ['-f', '/tmp/tmpdlzuirfc.json', '--HistoryManager.hist_file=:memory:']
INFO:absl:Attempting to infer TFX Python dependency for beam
INFO:absl:Copying all content from install dir /tmpfs/src/tf_docs_env/lib/python3.6/site-packages/tfx to temp dir /tmp/tmpnwz57ica/build/tfx
INFO:absl:Generating a temp setup file at /tmp/tmpnwz57ica/build/tfx/setup.py
INFO:absl:Creating temporary sdist package, logs available at /tmp/tmpnwz57ica/build/tfx/setup.log
INFO:absl:Added --extra_package=/tmp/tmpnwz57ica/build/tfx/dist/tfx_ephemeral-0.29.0.tar.gz to beam args
WARNING:absl:Pusher is going to push the model without validation. Consider using Evaluator or InfraValidator in your pipeline.
INFO:absl:Model version: 1619774416
INFO:absl:Model written to serving path serving_model/penguin-transform/1619774416.
INFO:absl:Model pushed to pipelines/penguin-transform/Pusher/pushed_model/7.
INFO:absl:Cleaning up stateless execution info.
INFO:absl:Execution 7 succeeded.
INFO:absl:Cleaning up stateful execution info.
INFO:absl:Publishing output artifacts defaultdict(<class 'list'>, {'pushed_model': [Artifact(artifact: uri: "pipelines/penguin-transform/Pusher/pushed_model/7"
custom_properties {
  key: "name"
  value {
    string_value: "penguin-transform:2021-04-30T09:19:53.944707:Pusher:pushed_model:0"
  }
}
custom_properties {
  key: "tfx_version"
  value {
    string_value: "0.29.0"
  }
}
, artifact_type: name: "PushedModel"
)]}) for execution 7
INFO:absl:MetadataStore with DB connection initialized
INFO:absl:Component Pusher is finished.

You should see "INFO:absl:Component Pusher is finished." if the pipeline finished successfully.

The pusher component pushes the trained model to the SERVING_MODEL_DIR, which is the serving_model/penguin-transform directory if you did not change the variables in the previous steps. You can see the result in the file browser in the left-side panel in Colab, or by using the following command:

# List files in created model directory.
find {SERVING_MODEL_DIR}
serving_model/penguin-transform
serving_model/penguin-transform/1619774416
serving_model/penguin-transform/1619774416/assets
serving_model/penguin-transform/1619774416/variables
serving_model/penguin-transform/1619774416/variables/variables.data-00000-of-00001
serving_model/penguin-transform/1619774416/variables/variables.index
serving_model/penguin-transform/1619774416/saved_model.pb

You can also check the signature of the generated model using the saved_model_cli tool.

saved_model_cli show --dir {SERVING_MODEL_DIR}/$(ls -1 {SERVING_MODEL_DIR} | sort -nr | head -1) --tag_set serve --signature_def serving_default
2021-04-30 09:20:16.594948: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.11.0
The given SavedModel SignatureDef contains the following input(s):
  inputs['examples'] tensor_info:
      dtype: DT_STRING
      shape: (-1)
      name: serving_default_examples:0
The given SavedModel SignatureDef contains the following output(s):
  outputs['output_0'] tensor_info:
      dtype: DT_FLOAT
      shape: (-1, 3)
      name: StatefulPartitionedCall_1:0
Method name is: tensorflow/serving/predict

Because we defined serving_default with our own serve_tf_examples_fn function, the signature shows that it takes a single string tensor. This string is a serialized tf.Example proto and will be parsed with the tf.io.parse_example() function, as we defined earlier.
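
For reference, the serving signature is built along these lines. This is a minimal re-sketch, not the exact code: the real definition lives in the module file we wrote earlier, and names such as _get_serve_tf_examples_fn, _LABEL_KEY, and tf_transform_output follow that module and may differ in detail.

import tensorflow as tf
import tensorflow_transform as tft

def _get_serve_tf_examples_fn(model, tf_transform_output):
  # Attach the Transform graph so it is exported together with the model.
  model.tft_layer = tf_transform_output.transform_features_layer()

  @tf.function
  def serve_tf_examples_fn(serialized_tf_examples):
    # Parse the serialized tf.Example protos with the *raw* feature spec,
    # dropping the label, which is not present at serving time.
    feature_spec = tf_transform_output.raw_feature_spec()
    feature_spec.pop(_LABEL_KEY)
    parsed_features = tf.io.parse_example(serialized_tf_examples, feature_spec)

    # Apply the same preprocessing that was used during training.
    transformed_features = model.tft_layer(parsed_features)
    return model(transformed_features)

  return serve_tf_examples_fn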

We can load the exported model and try inference with a few examples.

# Find a model with the latest timestamp.
model_dirs = (item for item in os.scandir(SERVING_MODEL_DIR) if item.is_dir())
model_path = max(model_dirs, key=lambda i: int(i.name)).path

loaded_model = tf.keras.models.load_model(model_path)
inference_fn = loaded_model.signatures['serving_default']
WARNING:tensorflow:Inconsistent references when loading the checkpoint into this object graph. Either the Trackable object references in the Python program have changed in an incompatible way, or the checkpoint was generated in an incompatible program.

Two checkpoint references resolved to different objects (<tensorflow.python.keras.saving.saved_model.load.TensorFlowTransform>TransformFeaturesLayer object at 0x7f6e902c24a8> and <tensorflow.python.keras.engine.input_layer.InputLayer object at 0x7f6eaab0dc88>).
WARNING:tensorflow:Inconsistent references when loading the checkpoint into this object graph. Either the Trackable object references in the Python program have changed in an incompatible way, or the checkpoint was generated in an incompatible program.

Two checkpoint references resolved to different objects (<tensorflow.python.keras.saving.saved_model.load.TensorFlowTransform>TransformFeaturesLayer object at 0x7f6e902c24a8> and <tensorflow.python.keras.engine.input_layer.InputLayer object at 0x7f6eaab0dc88>).
# Prepare an example and run inference.
features = {
  'culmen_length_mm': tf.train.Feature(float_list=tf.train.FloatList(value=[49.9])),
  'culmen_depth_mm': tf.train.Feature(float_list=tf.train.FloatList(value=[16.1])),
  'flipper_length_mm': tf.train.Feature(int64_list=tf.train.Int64List(value=[213])),
  'body_mass_g': tf.train.Feature(int64_list=tf.train.Int64List(value=[5400])),
}
example_proto = tf.train.Example(features=tf.train.Features(feature=features))
examples = example_proto.SerializeToString()

result = inference_fn(examples=tf.constant([examples]))
print(result['output_0'].numpy())
[[-6.2080894 -2.9893324  4.0824165]]

The third element, which corresponds to the 'Gentoo' species, is expected to be the largest of the three.
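
If you want a human-readable prediction, you can take the argmax over the three outputs and map it back to a species name. The index-to-name mapping below is an assumption for illustration (it only relies on 'Gentoo' being at index 2, as noted above); verify it against the label encoding in your preprocessing_fn before relying on it.

import numpy as np

# Hypothetical label ordering; index 2 is 'Gentoo' as noted above.
_SPECIES = ['Adelie', 'Chinstrap', 'Gentoo']

logits = result['output_0'].numpy()
predicted = np.argmax(logits, axis=-1)
print([_SPECIES[i] for i in predicted])  # Expected: ['Gentoo']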

Next steps

If you want to learn more about the Transform component, see the Transform Component guide. You can find more resources at https://www.tensorflow.org/tfx/tutorials.

Please see Understanding TFX Pipelines to learn more about various concepts in TFX.