tfx.v1.extensions.google_cloud_big_query.BigQueryExampleGen

Cloud BigQueryExampleGen component.

Inherits From: BaseComponent, BaseNode

The BigQuery ExampleGen component takes a BigQuery SQL query and generates train and eval examples for downstream components.

Component outputs contain:

examples Channel of type standard_artifacts.Examples for the output train and eval examples.

Args
query BigQuery SQL string. The query result is treated as a single split. Can be overridden by input_config.
input_config An example_gen_pb2.Input instance with Split.pattern set to a BigQuery SQL string. If set, it overrides the 'query' arg and allows a different query per split. If any field is provided as a RuntimeParameter, input_config should be constructed as a dict with the same field names as the Input proto message.
output_config An example_gen_pb2.Output instance providing the output configuration. If unset, the default splits are 'train' and 'eval' with a size ratio of 2:1. If any field is provided as a RuntimeParameter, output_config should be constructed as a dict with the same field names as the Output proto message.
range_config An optional range_config_pb2.RangeConfig instance specifying the range of span values to consider.
custom_executor_spec Optional custom executor spec that overrides the default executor spec specified in the component attribute.
custom_config An example_gen_pb2.CustomConfig instance providing custom configuration for ExampleGen.
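
The dict forms of input_config and output_config mentioned above can be sketched in plain Python. This is a minimal illustration, not part of the TFX API: the helper names (input_config_dict, output_config_dict, split_fractions) and the table name my_dataset.my_table are hypothetical. The dict keys follow the Input and Output proto field names (splits, name, pattern, split_config, hash_buckets). The last helper shows how hash bucket counts of 2 and 1 correspond to the default 2:1 train/eval ratio.

```python
from typing import Dict


def input_config_dict(split_queries: Dict[str, str]) -> dict:
    """Dict mirroring example_gen_pb2.Input: one split (name, pattern) per query."""
    return {
        "splits": [
            {"name": name, "pattern": query}
            for name, query in split_queries.items()
        ]
    }


def output_config_dict(hash_buckets: Dict[str, int]) -> dict:
    """Dict mirroring example_gen_pb2.Output with hash-bucket based splits."""
    return {
        "split_config": {
            "splits": [
                {"name": name, "hash_buckets": buckets}
                for name, buckets in hash_buckets.items()
            ]
        }
    }


def split_fractions(hash_buckets: Dict[str, int]) -> Dict[str, float]:
    """Approximate share of examples per split implied by the bucket counts."""
    total = sum(hash_buckets.values())
    return {name: buckets / total for name, buckets in hash_buckets.items()}


# Hypothetical table name, used for illustration only.
per_split_input = input_config_dict({
    "train": "SELECT * FROM my_dataset.my_table WHERE is_eval = FALSE",
    "eval": "SELECT * FROM my_dataset.my_table WHERE is_eval = TRUE",
})

# Bucket counts of 2 and 1 reproduce the documented default 2:1 ratio.
default_like_output = output_config_dict({"train": 2, "eval": 1})
fractions = split_fractions({"train": 2, "eval": 1})
```

Because splitting is hash based, the resulting proportions are approximate rather than exact. The input dict (one query per split) and the output dict (one query divided by hash buckets) are alternative ways to define splits.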

Raises
RuntimeError Only one of query and input_config should be set.
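
The argument rule behind this RuntimeError can be illustrated with a small stand-alone check. This is a sketch of the rule as documented, not the component's actual code: check_query_args is a hypothetical helper, and rejecting the case where neither argument is given is an assumption.

```python
from typing import Any, Optional


def check_query_args(query: Optional[str] = None,
                     input_config: Optional[Any] = None) -> None:
    """Raises RuntimeError unless exactly one of the two arguments is set.

    Assumption: supplying neither argument is rejected as well as supplying both.
    """
    if bool(query) == bool(input_config):
        raise RuntimeError("Only one of query and input_config should be set.")


# Accepted: a single query (hypothetical table name).
check_query_args(query="SELECT * FROM my_dataset.my_table")

# Rejected: both arguments supplied.
try:
    check_query_args(
        query="SELECT * FROM my_dataset.my_table",
        input_config={"splits": [{"name": "train", "pattern": "SELECT 1"}]},
    )
    both_rejected = False
except RuntimeError:
    both_rejected = True
```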

Attributes
outputs Component's output channel dict.

Methods

with_beam_pipeline_args

Adds per-component Beam pipeline args.

Args
beam_pipeline_args List of Beam pipeline args to be added to the Beam executor spec.

Returns
The same component instance.
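
A minimal usage sketch for with_beam_pipeline_args follows, assuming TFX is installed and importable as tfx.v1. The helper names bigquery_beam_args and make_example_gen, the project ID, and the bucket path are hypothetical. The --project and --temp_location flags are standard Beam options that BigQuery reads typically need. Whether they are sufficient for a given runner is not covered here.

```python
from typing import List


def bigquery_beam_args(project: str, temp_location: str) -> List[str]:
    """Builds Beam flags commonly needed when reading from BigQuery."""
    return [f"--project={project}", f"--temp_location={temp_location}"]


def make_example_gen(query: str, beam_args: List[str]):
    """Creates a BigQueryExampleGen with component-level Beam args."""
    # Imported lazily so the helper above can be used without TFX installed.
    from tfx import v1 as tfx

    example_gen = tfx.extensions.google_cloud_big_query.BigQueryExampleGen(
        query=query)
    # with_beam_pipeline_args returns the same component, so the call chains.
    return example_gen.with_beam_pipeline_args(beam_args)


# Hypothetical project ID and bucket, used for illustration only.
beam_args = bigquery_beam_args("my-gcp-project", "gs://my-bucket/tmp")
```

Calling make_example_gen("SELECT * FROM my_dataset.my_table", beam_args) would then yield a component whose Beam executor receives these flags in addition to any pipeline-level Beam args.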