[null,null,["Last updated 2025-07-25 UTC"],[],[],null,["# Building TFX pipelines\n\n| **Note:** For a conceptual view of TFX Pipelines, see [Understanding TFX Pipelines](/tfx/guide/understanding_tfx_pipelines).\n| **Note:** Want to build your first pipeline before you dive into the details? Get started [building a pipeline using a template](https://www.tensorflow.org/tfx/guide/build_local_pipeline#build_a_pipeline_using_a_template).\n\nUsing the `Pipeline` class\n--------------------------\n\nTFX pipelines are defined using the\n[`Pipeline` class](https://github.com/tensorflow/tfx/blob/master/tfx/orchestration/pipeline.py).\nThe following example demonstrates how to use the `Pipeline` class. \n\n```scdoc\npipeline.Pipeline(\n pipeline_name=pipeline-name,\n pipeline_root=pipeline-root,\n components=components,\n enable_cache=enable-cache,\n metadata_connection_config=metadata-connection-config,\n)\n```\n\nReplace the following:\n\n- \u003cvar translate=\"no\"\u003epipeline-name\u003c/var\u003e: The name of this pipeline. The pipeline name must\n be unique.\n\n TFX uses the pipeline name when querying ML Metadata for component input\n artifacts. Reusing a pipeline name may result in unexpected behavior.\n- \u003cvar translate=\"no\"\u003epipeline-root\u003c/var\u003e: The root path of this pipeline's outputs. The root\n path must be the full path to a directory that your orchestrator has read\n and write access to. At runtime, TFX uses the pipeline root to generate\n output paths for component artifacts. This directory can be local, or on a\n supported distributed file system, such as Google Cloud Storage or HDFS.\n\n- \u003cvar translate=\"no\"\u003ecomponents\u003c/var\u003e: A list of component instances that make up this\n pipeline's workflow.\n\n- \u003cvar translate=\"no\"\u003eenable-cache\u003c/var\u003e: (Optional.) 
A boolean value that indicates whether this\n pipeline uses caching to speed up pipeline execution.\n\n- \u003cvar translate=\"no\"\u003emetadata-connection-config\u003c/var\u003e: (Optional.) A connection\n configuration for ML Metadata.\n\nDefining the component execution graph\n--------------------------------------\n\nComponent instances produce artifacts as outputs and typically depend on\nartifacts produced by upstream component instances as inputs. The execution\nsequence for component instances is determined by creating a directed acyclic\ngraph (DAG) of the artifact dependencies.\n\nFor instance, the `ExampleGen` standard component can ingest data from a CSV\nfile and output serialized example records. The `StatisticsGen` standard\ncomponent accepts these example records as input and produces dataset\nstatistics. In this example, the instance of `StatisticsGen` must follow\n`ExampleGen` because `StatisticsGen` depends on the output of `ExampleGen`.\n\n### Task-based dependencies\n\n| **Note:** Using task-based dependencies is typically not recommended. Defining the execution graph with artifact dependencies lets you take advantage of the automatic artifact lineage tracking and caching features of TFX.\n\nYou can also define task-based dependencies using your component's\n[`add_upstream_node` and `add_downstream_node`](https://github.com/tensorflow/tfx/blob/master/tfx/components/base/base_node.py)\nmethods. `add_upstream_node` lets you specify that the current component must be\nexecuted after the specified component. `add_downstream_node` lets you specify\nthat the current component must be executed before the specified component.\n\nPipeline templates\n------------------\n\nThe easiest way to get a pipeline set up quickly, and to see how all the pieces\nfit together, is to use a template. 
Using templates is covered in [Building a\nTFX Pipeline Locally](/tfx/guide/build_local_pipeline).\n\nCaching\n-------\n\nTFX pipeline caching lets your pipeline skip over components that have been\nexecuted with the same set of inputs in a previous pipeline run. If caching is\nenabled, the pipeline attempts to match the signature of each component (the\ncomponent and its set of inputs) to one of this pipeline's previous component\nexecutions. If there is a match, the pipeline uses the component outputs from\nthe previous run. If there is no match, the component is executed.\n\nDo not use caching if your pipeline uses non-deterministic components. For\nexample, if you create a component that generates a random number for your pipeline,\nenabling the cache causes this component to execute only once. In this example,\nsubsequent runs reuse the first run's random number instead of generating a new\none."]]