Creates a pipeline step that launches an AIP training job.
```python
tfx.orchestration.kubeflow.v2.components.experimental.ai_platform_training_component.create_ai_platform_training(
    name: Text,
    project_id: Text,
    region: Optional[Text] = None,
    job_id: Optional[Text] = None,
    image_uri: Optional[Text] = None,
    args: Optional[List[placeholders.CommandlineArgumentType]] = None,
    scale_tier: Optional[Text] = None,
    training_input: Optional[Dict[Text, Any]] = None,
    labels: Optional[Dict[Text, Text]] = None,
    inputs: Dict[Text, Any] = None,
    outputs: Dict[Text, Any] = None,
    parameters: Dict[Text, Any] = None
) -> tfx.dsl.components.base.base_component.BaseComponent
```
The generated TFX component will have its component spec specified dynamically through the inputs, outputs, and parameters arguments, in the following format:
- inputs: A mapping from input name to the upstream channel connected. The artifact type of the channel will be automatically inferred.
- outputs: A mapping from output name to the associated artifact type.
- parameters: A mapping from execution property names to their associated values. Only primitive-typed values are supported. Note that RuntimeParameter is not supported yet.
For example:
```python
create_ai_platform_training(
    ...
    inputs={
        # Assuming there is an upstream node example_gen, with an output
        # 'examples' of the type Examples.
        'examples': example_gen.outputs['examples'],
    },
    outputs={
        'model': standard_artifacts.Model,
    },
    parameters={
        'n_steps': 100,
        'optimizer': 'sgd',
    },
    ...
)
```
will generate a component instance with a component spec equivalent to:
```python
class MyComponentSpec(ComponentSpec):
  INPUTS = {
      'examples': ChannelParameter(type=standard_artifacts.Examples)
  }
  OUTPUTS = {
      'model': ChannelParameter(type=standard_artifacts.Model)
  }
  PARAMETERS = {
      'n_steps': ExecutionParameter(type=int),
      'optimizer': ExecutionParameter(type=str)
  }
```
with its input 'examples' connected to the example_gen output, and the execution properties 'n_steps' and 'optimizer' set to 100 and 'sgd', respectively.
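As an illustration of this dynamic-spec mechanism (a minimal sketch, not the actual TFX implementation; `make_spec_class` is a hypothetical helper), a spec-like class can be built from the three dicts at runtime with Python's `type()`. Note how each parameter's Python type is recorded, which is why only primitive-typed values are supported:

```python
def make_spec_class(name, inputs, outputs, parameters):
    """Build a spec-like class whose attributes mirror the three dicts.

    Illustrative only: the real component derives artifact types from the
    connected channels; here we just store the dicts as given.
    """
    return type(
        name + 'Spec',
        (object,),
        {
            'INPUTS': dict(inputs),
            'OUTPUTS': dict(outputs),
            # Record the Python type of each execution property value.
            'PARAMETERS': {k: type(v) for k, v in parameters.items()},
        },
    )

spec_cls = make_spec_class(
    'MyComponent',
    inputs={'examples': 'channel-from-example_gen'},
    outputs={'model': 'Model-artifact-type'},
    parameters={'n_steps': 100, 'optimizer': 'sgd'},
)
print(spec_cls.PARAMETERS)  # {'n_steps': <class 'int'>, 'optimizer': <class 'str'>}
```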
Example usage of the component:
```python
# A single node training job.
my_train = create_ai_platform_training(
    name='my_training_step',
    project_id='my-project',
    region='us-central1',
    image_uri='gcr.io/my-project/caip-training-test:latest',
    args=[
        '--examples',
        placeholders.InputUriPlaceholder('examples'),
        '--n-steps',
        placeholders.InputValuePlaceholder('n_step'),
        '--output-location',
        placeholders.OutputUriPlaceholder('model')
    ],
    scale_tier='BASIC_GPU',
    inputs={'examples': example_gen.outputs['examples']},
    outputs={
        'model': standard_artifacts.Model
    },
    parameters={'n_step': 100}
)
```
```python
# More complex settings can be expressed by providing training_input
# directly.
my_distributed_train = create_ai_platform_training(
    name='my_training_step',
    project_id='my-project',
    training_input={
        'scaleTier': 'CUSTOM',
        'region': 'us-central1',
        'masterType': 'n1-standard-8',
        'masterConfig': {
            'imageUri': 'gcr.io/my-project/my-dist-training:latest'
        },
        'workerType': 'n1-standard-8',
        'workerCount': 8,
        'workerConfig': {
            'imageUri': 'gcr.io/my-project/my-dist-training:latest'
        },
        'args': [
            '--examples',
            placeholders.InputUriPlaceholder('examples'),
            '--n-steps',
            placeholders.InputValuePlaceholder('n_step'),
            '--output-location',
            placeholders.OutputUriPlaceholder('model')
        ]
    },
    inputs={'examples': example_gen.outputs['examples']},
    outputs={'model': standard_artifacts.Model},
    parameters={'n_step': 100}
)
```
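To make the placeholder semantics in the examples above concrete, here is a minimal sketch of how such args might resolve to a concrete command line at runtime. The classes and `resolve_args` function are illustrative stand-ins, not the actual TFX resolution code, and the URIs are made up:

```python
# Illustrative placeholder stand-ins: each one names an input, output, or
# execution property to be substituted at runtime.
class InputUriPlaceholder:
    def __init__(self, name): self.name = name

class InputValuePlaceholder:
    def __init__(self, name): self.name = name

class OutputUriPlaceholder:
    def __init__(self, name): self.name = name

def resolve_args(args, input_uris, parameters, output_uris):
    """Replace each placeholder with the URI or value it refers to."""
    resolved = []
    for arg in args:
        if isinstance(arg, InputUriPlaceholder):
            resolved.append(input_uris[arg.name])
        elif isinstance(arg, InputValuePlaceholder):
            resolved.append(str(parameters[arg.name]))
        elif isinstance(arg, OutputUriPlaceholder):
            resolved.append(output_uris[arg.name])
        else:
            resolved.append(arg)  # Plain strings pass through unchanged.
    return resolved

argv = resolve_args(
    ['--examples', InputUriPlaceholder('examples'),
     '--n-steps', InputValuePlaceholder('n_step'),
     '--output-location', OutputUriPlaceholder('model')],
    input_uris={'examples': 'gs://bucket/examples'},
    parameters={'n_step': 100},
    output_uris={'model': 'gs://bucket/model'},
)
# argv == ['--examples', 'gs://bucket/examples', '--n-steps', '100',
#          '--output-location', 'gs://bucket/model']
```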
Args | |
---|---|
`name` | Name of the component. This is also needed to construct the component spec and component class dynamically. |
`project_id` | The GCP project under which the AIP training job will be running. |
`region` | GCE region where the AIP training job will be running. |
`job_id` | The unique ID of the job. Defaults to 'tfx_%Y%m%d%H%M%S'. |
`image_uri` | The GCR location of the container image used to execute the training program. If the same field is specified in `training_input`, the latter overrides `image_uri`. |
`args` | Command line arguments passed into the training program. Users can use placeholder semantics as in `tfx.dsl.component.experimental.container_component` to wire the args with component inputs/outputs/parameters. |
`scale_tier` | Cloud ML resource requested by the job. See https://cloud.google.com/ai-platform/training/docs/reference/rest/v1/projects.jobs#ScaleTier |
`training_input` | Full training job spec. This field overrides other specifications if applicable. It follows the TrainingInput schema. |
`labels` | User-specified labels attached to the job. |
`inputs` | The dict of component inputs. |
`outputs` | The dict of component outputs. |
`parameters` | The dict of component parameters, i.e., execution properties. |

Returns |
---|
A component instance that represents the AIP job in the DSL. |

Raises | |
---|---|
`ValueError` | When `image_uri` is missing and `masterConfig` is not specified in `training_input`, or when `region` is missing and `training_input` does not provide a region either. |
`TypeError` | When non-primitive parameters are specified. |
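The default `job_id` pattern 'tfx_%Y%m%d%H%M%S' documented above can be reproduced with the standard `datetime` module. This is a sketch of the pattern only; the exact generation code inside the component may differ, and `default_job_id` is a hypothetical helper:

```python
import datetime

def default_job_id(now=None):
    """Format a job ID following the documented 'tfx_%Y%m%d%H%M%S' pattern."""
    now = now or datetime.datetime.now()
    return now.strftime('tfx_%Y%m%d%H%M%S')

print(default_job_id(datetime.datetime(2021, 3, 14, 15, 9, 26)))
# tfx_20210314150926
```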