Cloud BigQueryExampleGen component.
tfx.v1.extensions.google_cloud_big_query.BigQueryExampleGen(
    query: Optional[str] = None,
    input_config: Optional[Union[tfx.v1.proto.Input, tfx.v1.dsl.experimental.RuntimeParameter]] = None,
    output_config: Optional[Union[tfx.v1.proto.Output, tfx.v1.dsl.experimental.RuntimeParameter]] = None,
    range_config: Optional[Union[tfx.v1.proto.RangeConfig, tfx.v1.dsl.experimental.RuntimeParameter]] = None
)
The BigQueryExampleGen component takes a BigQuery SQL query and generates
train and eval examples for downstream components.
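As a minimal sketch of the simplest case, the snippet below builds a query string and passes it as the `query` argument. The table name and the `build_query` and `make_example_gen` helpers are hypothetical and exist only for illustration; the TFX import is deferred into the function so the query helper can be used without TFX installed.

```python
# Hypothetical fully qualified table name, used only for illustration.
TABLE = "my-project.my_dataset.my_table"


def build_query(table: str, limit: int) -> str:
    """Return a BigQuery Standard SQL query selecting every column of `table`."""
    return f"SELECT * FROM `{table}` LIMIT {limit}"


def make_example_gen(query: str):
    """Construct a BigQueryExampleGen that treats the query result as one split."""
    # Imported here so that build_query works even when TFX is not installed.
    from tfx import v1 as tfx

    return tfx.extensions.google_cloud_big_query.BigQueryExampleGen(query=query)


# Typical use inside a pipeline definition (requires TFX):
# example_gen = make_example_gen(build_query(TABLE, 1000))
```

With only `query` set and no `output_config`, the result is split into 'train' and 'eval' using the default 2:1 ratio described under `output_config`.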
Component outputs contains:
- examples: Channel of type standard_artifacts.Examples for output train and eval examples.
| Args | |
| --- | --- |
| query | BigQuery SQL string. The query result is treated as a single split. Can be overridden by input_config. |
| input_config | An example_gen_pb2.Input instance with Split.pattern set to a BigQuery SQL string. If set, it overrides the 'query' arg and allows a different query per split. If any field is provided as a RuntimeParameter, input_config should be constructed as a dict with the same field names as the Input proto message. |
| output_config | An example_gen_pb2.Output instance providing the output configuration. If unset, the default splits are 'train' and 'eval' with a size ratio of 2:1. If any field is provided as a RuntimeParameter, output_config should be constructed as a dict with the same field names as the Output proto message. |
| range_config | An optional range_config_pb2.RangeConfig instance specifying the range of span values to consider. |
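The two configuration arguments can be sketched as follows: `input_config` carries one query per split, while `output_config` changes the split ratio applied to a single query. The table name, the `is_eval` column, and both helper functions are hypothetical; the proto classes are the `tfx.v1.proto` types named in the signature.

```python
# Hypothetical per-split queries; the table and the is_eval column are placeholders.
SPLIT_QUERIES = {
    "train": "SELECT * FROM `my-project.my_dataset.my_table` WHERE is_eval = FALSE",
    "eval": "SELECT * FROM `my-project.my_dataset.my_table` WHERE is_eval = TRUE",
}


def make_example_gen_with_splits(split_queries: dict):
    """Use input_config so that every split is produced by its own query."""
    from tfx import v1 as tfx  # deferred: needs TFX installed

    input_config = tfx.proto.Input(
        splits=[
            tfx.proto.Input.Split(name=name, pattern=query)
            for name, query in split_queries.items()
        ]
    )
    return tfx.extensions.google_cloud_big_query.BigQueryExampleGen(
        input_config=input_config
    )


def make_example_gen_with_ratio(query: str, train_buckets: int = 4, eval_buckets: int = 1):
    """Use output_config to split one query result by hash buckets (4:1 here)."""
    from tfx import v1 as tfx  # deferred: needs TFX installed

    output_config = tfx.proto.Output(
        split_config=tfx.proto.SplitConfig(
            splits=[
                tfx.proto.SplitConfig.Split(name="train", hash_buckets=train_buckets),
                tfx.proto.SplitConfig.Split(name="eval", hash_buckets=eval_buckets),
            ]
        )
    )
    return tfx.extensions.google_cloud_big_query.BigQueryExampleGen(
        query=query, output_config=output_config
    )
```

The first helper suits data that already carries a split marker; the second suits a single table that should be divided by ratio.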
| Raises | |
| --- | --- |
| RuntimeError | Raised when both query and input_config are set; only one of them should be set. |
| Attributes | |
| --- | --- |
| outputs | Component's output channel dict. |
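The `outputs` dict is how downstream components consume the generated examples. The sketch below assumes the output channel is keyed `'examples'`, as it is for TFX ExampleGen components, and feeds it to `StatisticsGen`; the `build_components` helper is hypothetical.

```python
# Assumed key of the ExampleGen output channel.
EXAMPLES_KEY = "examples"


def build_components(query: str) -> list:
    """Wire BigQueryExampleGen into a downstream StatisticsGen."""
    from tfx import v1 as tfx  # deferred: needs TFX installed

    example_gen = tfx.extensions.google_cloud_big_query.BigQueryExampleGen(query=query)
    # The channel is passed by reference; no data is read until the pipeline runs.
    statistics_gen = tfx.components.StatisticsGen(
        examples=example_gen.outputs[EXAMPLES_KEY]
    )
    return [example_gen, statistics_gen]
```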
Methods
with_beam_pipeline_args
with_beam_pipeline_args(
beam_pipeline_args: Iterable[Union[str, placeholder.Placeholder]]
) -> 'BaseBeamComponent'
Adds per-component Beam pipeline args.
| Args | |
| --- | --- |
| beam_pipeline_args | List of Beam pipeline args to be added to the Beam executor spec. |

| Returns | |
| --- | --- |
| The component itself. | |
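Because the method returns the component itself, it can be chained directly onto the constructor. The sketch below passes the Google Cloud project and a temporary location, two Beam options that BigQuery reads commonly need; the project ID, bucket path, and both helper functions are hypothetical.

```python
from typing import List


def bigquery_beam_args(project: str, temp_location: str) -> List[str]:
    """Build Beam flags for a BigQuery read: billing project and a GCS temp path."""
    return [f"--project={project}", f"--temp_location={temp_location}"]


def make_example_gen(query: str, beam_args: List[str]):
    """Construct the component and attach Beam args that apply only to it."""
    from tfx import v1 as tfx  # deferred: needs TFX installed

    return tfx.extensions.google_cloud_big_query.BigQueryExampleGen(
        query=query
    ).with_beam_pipeline_args(beam_args)


# Example call (requires TFX and valid Google Cloud resources):
# make_example_gen("SELECT ...", bigquery_beam_args("my-project", "gs://my-bucket/tmp"))
```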