tfx.components.experimental.data_view.provider_component.TfGraphDataViewProvider

A component providing a tfx_bsl.coders.TfGraphRecordDecoder as a DataView.

Inherits From: BaseComponent

The user needs to define a function that creates such a TfGraphRecordDecoder. At run time, this component calls that function and writes the resulting decoder (in the form of a TF SavedModel) as its output artifact.

Example:

  # Import a decoder that can be created by a function 'create_decoder()' in
  # module_file:
  data_view_provider = TfGraphDataViewProvider(
      module_file=module_file,
      create_decoder_func='create_decoder')
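
For reference, the module_file passed above might look like the following. This is a minimal sketch, not taken from the original docs: the class name '_SplitToTokens' is invented, and it assumes the tfx_bsl interface in which a decoder subclasses TfGraphRecordDecoder and implements decode_record(); check your tfx_bsl version for the exact base-class signature.

  # Possible contents of module_file (a sketch):
  import tensorflow as tf
  from tfx_bsl.coders import tf_graph_record_decoder

  class _SplitToTokens(tf_graph_record_decoder.TfGraphRecordDecoder):
    """Decodes each serialized record into a ragged tensor of tokens."""

    def decode_record(self, records):
      # `records` is a 1-D string tensor of serialized records; the result
      # must be a dict of (composite) tensors sharing a batch dimension.
      return {'tokens': tf.strings.split(records, sep=' ')}

  # The factory named by create_decoder_func.
  def create_decoder():
    return _SplitToTokens()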

Args

create_decoder_func If module_file is not None, this should be the name of the function in module_file that this component needs to use to create the TfGraphRecordDecoder. Otherwise it should be the dot-delimited path (e.g. "some_package.some_module.some_func") to such a function. The function must have the following signature:

def create_decoder_func() -> tfx_bsl.coders.TfGraphRecordDecoder: ...

module_file The file path to a Python module file, from which the function named by create_decoder_func will be loaded. If not provided, create_decoder_func is expected to be a dot-delimited path to such a function.

data_view Output 'DataView' channel, in which the decoder will be saved.

instance_name Optional unique instance name. Necessary iff multiple TfGraphDataViewProvider components are declared in the same pipeline.
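
When the factory function lives in an importable package instead of a standalone module file, the component can be constructed without module_file. A sketch; 'my_pkg.decoders.create_decoder' is a hypothetical path:

  # Alternative: refer to the factory by its dot-delimited path.
  data_view_provider = TfGraphDataViewProvider(
      create_decoder_func='my_pkg.decoders.create_decoder')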

Attributes

component_id DEPRECATED FUNCTION

component_type DEPRECATED FUNCTION

downstream_nodes

exec_properties

id Node id, unique across all TFX nodes in a pipeline. If instance name is available, node_id will be: <node_class_name>.<instance_name>; otherwise, node_id will be: <node_class_name>. For example, with instance_name='eval' the id is 'TfGraphDataViewProvider.eval'.

inputs

outputs

type

upstream_nodes

Child Classes

class DRIVER_CLASS

class SPEC_CLASS

Methods

add_downstream_node


Experimental: Add another component that must run after this one.

This method enables task-based dependencies by enforcing execution order for synchronous pipelines on supported platforms. Currently, the supported platforms are Airflow, Beam, and Kubeflow Pipelines.

Note that this API call should be considered experimental, and may not work with asynchronous pipelines, sub-pipelines and pipelines with conditional nodes. We also recommend relying on data for capturing dependencies where possible to ensure data lineage is fully captured within MLMD.

It is symmetric with add_upstream_node.

Args
downstream_node a component that must run after this node.
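
A hedged usage sketch ('validator' stands in for any hypothetical component in the same pipeline):

  # Force `validator` to run only after this provider, even though no
  # artifact flows between them (a task-based dependency).
  data_view_provider.add_downstream_node(validator)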

add_upstream_node


Experimental: Add another component that must run before this one.

This method enables task-based dependencies by enforcing execution order for synchronous pipelines on supported platforms. Currently, the supported platforms are Airflow, Beam, and Kubeflow Pipelines.

Note that this API call should be considered experimental, and may not work with asynchronous pipelines, sub-pipelines and pipelines with conditional nodes. We also recommend relying on data for capturing dependencies where possible to ensure data lineage is fully captured within MLMD.

It is symmetric with add_downstream_node.

Args
upstream_node a component that must run before this node.
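
The same ordering can be declared from the other side. Continuing the hypothetical example above:

  # Equivalent to data_view_provider.add_downstream_node(validator):
  validator.add_upstream_node(data_view_provider)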

from_json_dict


Convert from dictionary data to an object.

get_id


Gets the id of a node.

This can be used during pipeline authoring time. For example:

  from tfx.components import Trainer

  resolver = ResolverNode(
      ...,
      model=Channel(
          type=Model,
          producer_component_id=Trainer.get_id('my_trainer')))

Args
instance_name (Optional) instance name of a node. If given, the instance name will be taken into consideration when generating the id.

Returns
an id for the node.

to_json_dict


Convert from an object to a JSON serializable dictionary.
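
Together with from_json_dict, this allows a round trip through the dictionary representation. A sketch, assuming the two methods are inverses as their descriptions suggest:

  # Serialize the component to a JSON-compatible dict, then rebuild it.
  as_dict = data_view_provider.to_json_dict()
  restored = TfGraphDataViewProvider.from_json_dict(as_dict)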

Class Variables

EXECUTOR_SPEC tfx.components.base.executor_spec.ExecutorClassSpec