mlmd.MetadataStore

A store for the metadata.

config proto.ConnectionConfig or proto.MetadataStoreClientConfig. Configuration to connect to the database or the metadata store server.
enable_upgrade_migration If set to True, the library upgrades the database schema and migrates all data when it connects to a backend with an older schema version. It is ignored when using the gRPC proto.MetadataStoreClientConfig.

Methods

get_artifact_by_type_and_name

Get the artifact of the given type and name.

The API fails if more than one artifact is found.

Args
type_name The artifact type name to look for.
artifact_name The artifact name to look for.
type_version An optional artifact type version. If not given, only type_name and artifact_name are used to look for the artifact with the default version.

Returns
The Artifact matching the type and name. None if no matched Artifact was found.

get_artifact_type

Gets an artifact type by name and version.

Args
type_name the type with that name.
type_version An optional version of the type. If not given, only type_name is used to look for types with no version.

Returns
The type with name type_name and version type_version.

Raises
errors.NotFoundError if no type exists.
errors.InternalError if query execution fails.

get_artifact_types

Gets all artifact types.

Returns
A list of all known ArtifactTypes.

Raises
errors.InternalError if query execution fails.

get_artifact_types_by_external_ids

Gets all artifact types with matching external ids.

Args
external_ids A list of external_ids for retrieving the ArtifactTypes.

Returns
ArtifactTypes with matching external_ids.

get_artifact_types_by_id

Gets artifact types by ID.

Args
type_ids a sequence of artifact type IDs.

Returns
A list of artifact types.

Raises
errors.InternalError if query execution fails.

get_artifacts

Gets artifacts.

Args
list_options A set of options to specify the conditions, limit the size and adjust order of the returned artifacts.

Returns
A list of artifacts.

Raises
errors.InternalError if query execution fails.
errors.InvalidArgumentError if list_options is invalid.

get_artifacts_and_types_by_artifact_ids

Gets all artifacts with matching ids and populates types.

The result is not index-aligned: if an id is not found, it is not returned.

Args
artifact_ids A list of artifact ids to retrieve.

Returns
Artifacts with matching ids and ArtifactTypes which can be matched by type_ids from Artifacts. Each ArtifactType contains id, name, properties and custom_properties fields.

get_artifacts_by_context

Gets all direct artifacts that are attributed to the given context.

Args
context_id The id of the querying context.
list_options A set of options to specify the conditions, limit the size and adjust order of the returned artifacts.

Returns
Artifacts attributed to the context.

get_artifacts_by_external_ids

Gets all artifacts with matching external ids.

Args
external_ids A list of external_ids for retrieving the Artifacts.

Returns
Artifacts with matching external_ids.

get_artifacts_by_id

Gets all artifacts with matching ids.

The result is not index-aligned: if an id is not found, it is not returned.

Args
artifact_ids A list of artifact ids to retrieve.

Returns
Artifacts with matching ids.
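
Because the result is not index-aligned, a caller that needs one slot per requested id has to re-align the result itself. The following is a minimal sketch of that step, assuming only that each returned object exposes an id attribute. FakeArtifact and align_by_id are hypothetical names used for illustration and are not part of the library.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Sequence


@dataclass
class FakeArtifact:
    """Hypothetical stand-in for an Artifact; only the id field matters here."""
    id: int
    uri: str = ""


def align_by_id(requested_ids: Sequence[int],
                found: Sequence[FakeArtifact]) -> List[Optional[FakeArtifact]]:
    """Return one slot per requested id, with None where nothing was found."""
    by_id: Dict[int, FakeArtifact] = {a.id: a for a in found}
    return [by_id.get(i) for i in requested_ids]


# Pretend the store returned only ids 3 and 1 for a request of [1, 2, 3].
requested = [1, 2, 3]
found = [FakeArtifact(id=3, uri="/tmp/c"), FakeArtifact(id=1, uri="/tmp/a")]
aligned = align_by_id(requested, found)
missing = [i for i, a in zip(requested, aligned) if a is None]
```

The same re-alignment applies to the other by-id getters whose results are described as not index-aligned.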

get_artifacts_by_type

Gets all the artifacts of a given type.

Args
type_name The artifact type name to look for.
type_version An optional artifact type version. If not given, only type_name is used to look for the artifacts with the default version.

Returns
Artifacts that match the type.

get_artifacts_by_uri

Gets all the artifacts of a given uri.

Args
uri The artifact uri to look for.

Returns
The Artifacts matching the uri.

get_children_contexts_by_context

Gets all children contexts of a context.

Args
context_id The id of the querying context.

Returns
Children contexts of the querying context.

Raises
errors.InternalError if query execution fails.

get_context_by_type_and_name

Get the context of the given type and context name.

The API fails if more than one context is found.

Args
type_name The context type name to look for.
context_name The context name to look for.
type_version An optional context type version. If not given, only type_name and context_name are used to look for the context with the default version.

Returns
The Context matching the type and context name. None if no matched Context was found.

get_context_type

Gets a context type by name and version.

Args
type_name the type with that name.
type_version An optional version of the type. If not given, only type_name is used to look for types with no version.

Returns
The type with name type_name and version type_version.

Raises
errors.NotFoundError if no type exists.
errors.InternalError if query execution fails.

get_context_types

Gets all context types.

Returns
A list of all known ContextTypes.

Raises
errors.InternalError if query execution fails.

get_context_types_by_external_ids

Gets all context types with matching external ids.

Args
external_ids A list of external_ids for retrieving the ContextTypes.

Returns
ContextTypes with matching external_ids.

get_context_types_by_id

Gets context types by ID.

Args
type_ids A sequence of context type IDs to look for.

Returns
A list of context types.

Raises
errors.InternalError if query execution fails.

get_contexts

Gets contexts.

Args
list_options A set of options to specify the conditions, limit the size and adjust order of the returned contexts.
extra_options ExtraOptions instance.

Returns
A list of contexts.

Raises
errors.InternalError if query execution fails.
errors.InvalidArgumentError if list_options is invalid.

get_contexts_by_artifact

Gets all contexts that an artifact is attributed to.

Args
artifact_id The id of the querying artifact.

Returns
Contexts that the artifact is attributed to.

get_contexts_by_execution

Gets all contexts that an execution is associated with.

Args
execution_id The id of the querying execution.

Returns
Contexts that the execution is associated with.

get_contexts_by_external_ids

Gets all contexts with matching external ids.

Args
external_ids A list of external_ids for retrieving the Contexts.

Returns
Contexts with matching external_ids.

get_contexts_by_id

Gets all contexts with matching ids.

The result is not index-aligned: if an id is not found, it is not returned.

Args
context_ids A list of context ids to retrieve.

Returns
Contexts with matching ids.

get_contexts_by_type

Gets all the contexts of a given type.

Args
type_name The context type name to look for.
type_version An optional context type version. If not given, only type_name is used to look for the contexts with the default version.

Returns
Contexts that match the type.

get_events_by_artifact_ids

Gets all events with matching artifact ids.

Args
artifact_ids a list of artifact ids.

Returns
Events with the artifact IDs given.

Raises
errors.InternalError if query execution fails.

get_events_by_execution_ids

Gets all events with matching execution ids.

Args
execution_ids a list of execution ids.

Returns
Events with the execution IDs given.

Raises
errors.InternalError if query execution fails.

get_execution_by_type_and_name

Get the execution of the given type and name.

The API fails if more than one execution is found.

Args
type_name The execution type name to look for.
execution_name The execution name to look for.
type_version An optional execution type version. If not given, only type_name and execution_name are used to look for the execution with the default version.

Returns
The Execution matching the type and name. None if no matched Execution was found.

get_execution_type

Gets an execution type by name and version.

Args
type_name the type with that name.
type_version An optional version of the type. If not given, only type_name is used to look for types with no version.

Returns
The type with name type_name and version type_version.

Raises
errors.NotFoundError if no type exists.
errors.InternalError if query execution fails.

get_execution_types

Gets all execution types.

Returns
A list of all known ExecutionTypes.

Raises
errors.InternalError if query execution fails.

get_execution_types_by_external_ids

Gets all execution types with matching external ids.

Args
external_ids A list of external_ids for retrieving the ExecutionTypes.

Returns
ExecutionTypes with matching external_ids.

get_execution_types_by_id

Gets execution types by ID.

Args
type_ids A sequence of execution type IDs to look for.

Returns
A list of execution types.

Raises
errors.InternalError if query execution fails.

get_executions

Gets executions.

Args
list_options A set of options to specify the conditions, limit the size and adjust order of the returned executions.

Returns
A list of executions.

Raises
errors.InternalError if query execution fails.
errors.InvalidArgumentError if list_options is invalid.

get_executions_by_context

Gets all direct executions that a context associates with.

Args
context_id The id of the querying context.
list_options A set of options to specify the conditions, limit the size and adjust order of the returned executions.

Returns
Executions associated with the context.

get_executions_by_external_ids

Gets all executions with matching external ids.

Args
external_ids A list of external_ids for retrieving the Executions.

Returns
Executions with matching external_ids.

get_executions_by_id

Gets all executions with matching ids.

The result is not index-aligned: if an id is not found, it is not returned.

Args
execution_ids A list of execution ids to retrieve.

Returns
Executions with matching ids.

get_executions_by_type

Gets all the executions of a given type.

Args
type_name The execution type name to look for.
type_version An optional execution type version. If not given, only type_name is used to look for the executions with the default version.

Returns
Executions that match the type.

get_lineage_subgraph

Gets lineage graph including fields specified in a field mask.

Args
query_options metadata_store_pb2.LineageSubgraphQueryOptions object. It allows users to specify query options for lineage graph tracing from a list of nodes of interest (limited to 100). Please refer to LineageSubgraphQueryOptions for more details.
field_mask_paths A list of user-specified paths of fields that should be included in the returned lineage graph. If field_mask_paths is specified and non-empty:

  1. If 'artifacts', 'executions', or 'contexts' is specified in read_mask, the nodes with details will be included.
  2. If 'artifact_types', 'execution_types', or 'context_types' is specified in read_mask, all the node types with matched type_id in nodes in the returned graph will be included.
  3. If 'events' is specified in read_mask, the events will be included in the returned graph.

If field_mask_paths is unspecified or is empty, it will return all the fields in the returned graph.

Returns
metadata_store_pb2.LineageGraph object that contains the lineage graph.
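
The field selection rule for field_mask_paths can be pictured with a small model. This sketch treats a lineage graph as a plain dict of top-level fields and is only an illustration of the rule as described here, not how the store computes its result. apply_field_mask and the sample graph are hypothetical.

```python
from typing import Any, Dict, List, Optional, Sequence


def apply_field_mask(graph: Dict[str, List[Any]],
                     field_mask_paths: Optional[Sequence[str]] = None
                     ) -> Dict[str, List[Any]]:
    """Keep only the top-level fields named in field_mask_paths.

    An unspecified or empty mask keeps every field.
    """
    if not field_mask_paths:
        return dict(graph)
    wanted = set(field_mask_paths)
    return {name: values for name, values in graph.items() if name in wanted}


# Toy graph: field names follow the paths listed above, values are placeholders.
graph = {
    "artifacts": ["a1", "a2"],
    "executions": ["e1"],
    "contexts": [],
    "artifact_types": ["DataSet"],
    "events": ["a1->e1"],
}
nodes_and_events = apply_field_mask(graph, ["artifacts", "events"])
everything = apply_field_mask(graph, [])
```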

get_parent_contexts_by_context

Gets all parent contexts of a context.

Args
context_id The id of the querying context.

Returns
Parent contexts of the querying context.

Raises
errors.InternalError if query execution fails.

put_artifact_type

Inserts or updates an artifact type.

A type has a set of strongly typed properties describing the schema of any stored instance associated with that type. A type is identified by a name and an optional version.

Type Creation:

If no type exists in the database with the given identifier (name, version), it creates a new type and returns the type_id.

Type Evolution:

If a type with the same (name, version) as the request type already exists (call it stored_type), the method allows stored_type to be updated only when the request type is backward compatible with the already stored instances.

Backward compatibility is violated iff:

  1. there is a property where the request type and stored_type have different value types (e.g., int vs. string).
  2. can_add_fields = false and the request type has a new property that is not stored.
  3. can_omit_fields = false and stored_type has an existing property that is not provided in the request type.

If a non-backward-compatible type change is required in the application (e.g., deprecating properties, repurposing a property name, or changing value types), a new type can be created with a different (name, version) identifier. Note that the type version is optional, and an empty-string version is treated as unset.

Args
artifact_type the request type to be inserted or updated.
can_add_fields when true, new properties can be added; when false, returns ALREADY_EXISTS if the request type has properties that are not in stored_type.
can_omit_fields when true, stored properties can be omitted in the request type; when false, returns ALREADY_EXISTS if the stored_type has properties not in the request type.

Returns
the type_id of the response.

Raises
errors.AlreadyExistsError If the type is not backward compatible.
errors.InvalidArgumentError If the request type has no name, or any property value type is unknown.
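
The three backward-compatibility rules can be written down as a small check. This is a sketch of the rules as stated, assuming each type is reduced to a mapping from property name to a value type name. compatibility_violations is a hypothetical helper, not a library function, and the store performs its own check.

```python
from typing import Dict, List


def compatibility_violations(stored: Dict[str, str],
                             request: Dict[str, str],
                             *,
                             can_add_fields: bool,
                             can_omit_fields: bool) -> List[str]:
    """List the reasons the request type is not backward compatible."""
    problems: List[str] = []
    for name, value_type in request.items():
        if name in stored and stored[name] != value_type:
            # Rule 1: same property, different value type.
            problems.append(f"{name}: {stored[name]} changed to {value_type}")
        elif name not in stored and not can_add_fields:
            # Rule 2: new property while can_add_fields is false.
            problems.append(f"{name}: new property not allowed")
    if not can_omit_fields:
        for name in stored:
            if name not in request:
                # Rule 3: stored property missing while can_omit_fields is false.
                problems.append(f"{name}: stored property omitted")
    return problems


stored = {"day": "INT", "split": "STRING"}
changed = compatibility_violations(
    stored, {"day": "STRING", "split": "STRING"},
    can_add_fields=True, can_omit_fields=True)
added = compatibility_violations(
    stored, {"day": "INT", "split": "STRING", "version": "INT"},
    can_add_fields=False, can_omit_fields=False)
```

An empty result corresponds to a request that would be accepted under these rules. Note that a changed value type is reported regardless of the two flags. The same rules are described for put_context_type and put_execution_type.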

put_artifacts

Inserts or updates artifacts in the database.

If an artifact id is specified for an artifact, it is an update. If an artifact id is unspecified, it will insert a new artifact. For new artifacts, type must be specified. For old artifacts, type must be unchanged or unspecified. When the name of an artifact is given, it should be unique among artifacts of the same ArtifactType.

It is not guaranteed that the created or updated artifacts will share the same create_time_since_epoch or last_update_time_since_epoch timestamps.

If field_mask_paths is specified and non-empty:

  1. while updating an existing artifact, it only updates fields specified in field_mask_paths.
  2. while inserting a new artifact, field_mask_paths will be ignored.
  3. otherwise, field_mask_paths will be applied to all artifacts.

If field_mask_paths is unspecified or is empty, it updates the artifact as a whole.

Args
artifacts A list of artifacts to insert or update.
field_mask_paths A list of field mask paths for masked update.

Returns
A list of artifact ids index-aligned with the input.

Raises
errors.AlreadyExistsError If artifact's name is specified and it is already used by stored artifacts of that ArtifactType.
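
The insert, masked-update and whole-update cases can be modelled on plain dicts. This is a simplified sketch under stated assumptions: records are flat dicts, only top-level mask paths are handled, and the treatment of masked fields that are absent from the incoming record is not modelled. merge_put is a hypothetical helper, not a library function.

```python
from typing import Any, Dict, Optional, Sequence


def merge_put(stored: Optional[Dict[str, Any]],
              incoming: Dict[str, Any],
              field_mask_paths: Optional[Sequence[str]] = None
              ) -> Dict[str, Any]:
    """Return the record that would be stored after a put, in this toy model."""
    if stored is None:
        # Insert: the mask is ignored and the record is stored as given.
        return dict(incoming)
    if not field_mask_paths:
        # Update without a mask: the record is updated as a whole.
        return dict(incoming)
    # Masked update: only the listed fields are copied from the incoming record.
    updated = dict(stored)
    for path in field_mask_paths:
        if path in incoming:
            updated[path] = incoming[path]
    return updated


stored = {"id": 7, "uri": "/old", "state": "PENDING"}
incoming = {"id": 7, "uri": "/new", "state": "LIVE"}
masked = merge_put(stored, incoming, ["state"])
whole = merge_put(stored, incoming)
inserted = merge_put(None, {"uri": "/x"}, ["state"])
```

In the masked case the uri of the stored record is left untouched even though the incoming record carries a different value.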

put_attributions_and_associations

Inserts attribution and association relationships in the database.

The context_id, artifact_id, and execution_id must already exist. If the relationship exists, this call does nothing. Once added, the relationships cannot be modified.

Args
attributions A list of attributions to insert.
associations A list of associations to insert.

put_context_type

Inserts or updates a context type.

A type has a set of strongly typed properties describing the schema of any stored instance associated with that type. A type is identified by a name and an optional version.

Type Creation:

If no type exists in the database with the given identifier (name, version), it creates a new type and returns the type_id.

Type Evolution:

If a type with the same (name, version) as the request type already exists (call it stored_type), the method allows stored_type to be updated only when the request type is backward compatible with the already stored instances.

Backward compatibility is violated iff:

  1. there is a property where the request type and stored_type have different value types (e.g., int vs. string).
  2. can_add_fields = false and the request type has a new property that is not stored.
  3. can_omit_fields = false and stored_type has an existing property that is not provided in the request type.

If a non-backward-compatible type change is required in the application (e.g., deprecating properties, repurposing a property name, or changing value types), a new type can be created with a different (name, version) identifier. Note that the type version is optional, and an empty-string version is treated as unset.

Args
context_type the request type to be inserted or updated.
can_add_fields when true, new properties can be added; when false, returns ALREADY_EXISTS if the request type has properties that are not in stored_type.
can_omit_fields when true, stored properties can be omitted in the request type; when false, returns ALREADY_EXISTS if the stored_type has properties not in the request type.

Returns
the type_id of the response.

Raises
errors.AlreadyExistsError If the type is not backward compatible.
errors.InvalidArgumentError If the request type has no name, or any property value type is unknown.

put_contexts

Inserts or updates contexts in the database.

If a context id is specified for a context, it is an update. If a context id is unspecified, it will insert a new context. For new contexts, type must be specified. For old contexts, type must be unchanged or unspecified. The name of a context cannot be empty, and it should be unique among contexts of the same ContextType.

It is not guaranteed that the created or updated contexts will share the same create_time_since_epoch or last_update_time_since_epoch timestamps.

If field_mask_paths is specified and non-empty:

  1. while updating an existing context, it only updates fields specified in field_mask_paths.
  2. while inserting a new context, field_mask_paths will be ignored.
  3. otherwise, field_mask_paths will be applied to all contexts.

If field_mask_paths is unspecified or is empty, it updates the context as a whole.

Args
contexts A list of contexts to insert or update.
field_mask_paths A list of field mask paths for masked update.

Returns
A list of context ids index-aligned with the input.

Raises
errors.InvalidArgumentError If the name of a new context is empty.
errors.AlreadyExistsError If the name of a new context is already used by stored contexts of that ContextType.

put_events

Inserts events in the database.

The execution_id and artifact_id must already exist. Once created, events cannot be modified.

It is not guaranteed that the created or updated events will share the same milliseconds_since_epoch timestamps.

Args
events A list of events to insert.

put_execution

Inserts or updates an Execution with artifacts, events and contexts.

In contrast with other put methods, this method updates an execution atomically with its input/output artifacts and events, and adds attributions and associations to related contexts.

If an execution_id, artifact_id or context_id is specified, it is an update, otherwise it does an insertion.

It is not guaranteed that the created or updated executions, artifacts, contexts and events will share the same create_time_since_epoch, last_update_time_since_epoch, or milliseconds_since_epoch timestamps.

Args
execution The execution to be created or updated.
artifact_and_events A list of (Artifact, Event) pairs that the execution uses or generates. The event's execution id or artifact id can be empty, as the artifact or execution may not be stored beforehand. If given, the ids must match the paired Artifact and the input execution.
contexts The Contexts that the execution should be associated with and the artifacts should be attributed to.
reuse_context_if_already_exist When there is a race to publish executions with a new context (no id) with the same context.name, by default one writer succeeds and the rest of the writers fail with AlreadyExists errors. If set to True, failed writers will reuse the stored context.
reuse_artifact_if_already_exist_by_external_id When there is a race to publish executions with a new artifact with the same artifact.external_id, by default one writer succeeds and the rest of the writers return AlreadyExists errors. If set to True and an Artifact has a non-empty external_id, the API will reuse the stored artifact in the transaction and perform an update. Otherwise, it falls back to the id field to decide whether it is an update (if id exists) or an insert (if id is empty).

Returns
The execution id, the list of artifact ids, and the list of context ids.

Raises
errors.InvalidArgumentError If the ids of the input nodes do not align with the store. Please refer to InvalidArgument errors in other put methods.
errors.AlreadyExistsError If the new nodes to be created already exist. Please refer to AlreadyExists errors in other put methods.
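
The effect of reuse_context_if_already_exist on writers that lose a race can be pictured with a toy in-memory table. This is only a model of the described behavior: ToyContextTable, its put_new_context method and the local AlreadyExists class are hypothetical and do not reflect how the store is implemented.

```python
from typing import Dict, Tuple


class AlreadyExists(Exception):
    """Hypothetical stand-in for the store's AlreadyExists error."""


class ToyContextTable:
    """Toy table that keeps one context id per (type name, context name)."""

    def __init__(self) -> None:
        self._ids: Dict[Tuple[str, str], int] = {}

    def put_new_context(self, type_name: str, name: str, *,
                        reuse_if_exists: bool = False) -> int:
        key = (type_name, name)
        if key in self._ids:
            if reuse_if_exists:
                # A losing writer reuses the stored context.
                return self._ids[key]
            raise AlreadyExists(f"context {name!r} of type {type_name!r} exists")
        self._ids[key] = len(self._ids) + 1
        return self._ids[key]


table = ToyContextTable()
winner_id = table.put_new_context("Run", "run-1")
reused_id = table.put_new_context("Run", "run-1", reuse_if_exists=True)
try:
    table.put_new_context("Run", "run-1")
    failed = False
except AlreadyExists:
    failed = True
```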

put_execution_type

Inserts or updates an execution type.

A type has a set of strongly typed properties describing the schema of any stored instance associated with that type. A type is identified by a name and an optional version.

Type Creation:

If no type exists in the database with the given identifier (name, version), it creates a new type and returns the type_id.

Type Evolution:

If a type with the same (name, version) as the request type already exists (call it stored_type), the method allows stored_type to be updated only when the request type is backward compatible with the already stored instances.

Backward compatibility is violated iff:

  1. there is a property where the request type and stored_type have different value types (e.g., int vs. string).
  2. can_add_fields = false and the request type has a new property that is not stored.
  3. can_omit_fields = false and stored_type has an existing property that is not provided in the request type.

If a non-backward-compatible type change is required in the application (e.g., deprecating properties, repurposing a property name, or changing value types), a new type can be created with a different (name, version) identifier. Note that the type version is optional, and an empty-string version is treated as unset.

Args
execution_type the request type to be inserted or updated.
can_add_fields when true, new properties can be added; when false, returns ALREADY_EXISTS if the request type has properties that are not in stored_type.
can_omit_fields when true, stored properties can be omitted in the request type; when false, returns ALREADY_EXISTS if the stored_type has properties not in the request type.

Returns
the type_id of the response.

Raises
errors.AlreadyExistsError If the type is not backward compatible.
errors.InvalidArgumentError If the request type has no name, or any property value type is unknown.

put_executions

Inserts or updates executions in the database.

If an execution id is specified for an execution, it is an update. If an execution id is unspecified, it will insert a new execution. For new executions, type must be specified. For old executions, type must be unchanged or unspecified. When the name of an execution is given, it should be unique among executions of the same ExecutionType.

It is not guaranteed that the created or updated executions will share the same create_time_since_epoch or last_update_time_since_epoch timestamps.

If field_mask_paths is specified and non-empty:

  1. while updating an existing execution, it only updates fields specified in field_mask_paths.
  2. while inserting a new execution, field_mask_paths will be ignored.
  3. otherwise, field_mask_paths will be applied to all executions.

If field_mask_paths is unspecified or is empty, it updates the execution as a whole.

Args
executions A list of executions to insert or update.
field_mask_paths A list of field mask paths for masked update.

Returns
A list of execution ids index-aligned with the input.

Raises
errors.AlreadyExistsError If execution's name is specified and it is already used by stored executions of that ExecutionType.

put_lineage_subgraph

Inserts a collection of executions, artifacts, contexts, and events.

This method atomically inserts or updates all specified executions, artifacts, and events and adds attributions and associations to related contexts.

It is not guaranteed that the created or updated executions, artifacts, contexts and events will share the same create_time_since_epoch, last_update_time_since_epoch, or milliseconds_since_epoch timestamps.

Args
executions List of executions to be created or updated.
artifacts List of artifacts to be created or updated.
contexts List of contexts to be created or reused. Contexts will be associated with the inserted executions and attributed to the inserted artifacts.
event_edges List of event edges in the subgraph to be inserted. Event edges are defined as an optional execution_index, an optional artifact_index, and a required event. Event edges must have an execution_index and/or an event.execution_id. Execution_index corresponds to an execution in the executions list at the specified index. If both execution_index and event.execution_id are provided, the execution ids of the execution and the event must match. The same rules apply to artifact_index and event.artifact_id.
reuse_context_if_already_exist When there's a race to publish executions with a new context (no id) with the same context.name, by default there will be one writer that succeeds and the rest of the writers will fail with AlreadyExists errors. If set to True, failed writers will reuse the stored context.
reuse_artifact_if_already_exist_by_external_id When there is a race to publish executions with a new artifact with the same artifact.external_id, by default one writer succeeds and the rest of the writers return AlreadyExists errors. If set to True and an Artifact has a non-empty external_id, the API will reuse the stored artifact in the transaction and perform an update. Otherwise, it falls back to the id field to decide whether it is an update (if id exists) or an insert (if id is empty).

Returns
The lists of execution ids, artifact ids, and context ids, index-aligned with the input executions, artifacts, and contexts.

Raises
errors.InvalidArgumentError If the ids of the input nodes do not align with the store. Please refer to InvalidArgument errors in other put methods.
errors.AlreadyExistsError If the new nodes to be created already exist. Please refer to AlreadyExists errors in other put methods.
errors.OutOfRangeError If event_edge indices do not correspond to existing indices in the input lists of executions and artifacts.
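
The event edge rules for the execution side can be sketched as a small resolver. This is a simplified model that assumes every execution in the input list already has a known id, which need not hold for newly inserted executions. resolve_execution_id is a hypothetical helper, and the built-in ValueError and IndexError stand in for the InvalidArgument and OutOfRange errors.

```python
from typing import Optional, Sequence


def resolve_execution_id(execution_ids: Sequence[int],
                         execution_index: Optional[int],
                         event_execution_id: Optional[int]) -> int:
    """Pick the execution id an event edge refers to, or raise if invalid."""
    if execution_index is None and event_execution_id is None:
        raise ValueError(
            "an event edge needs execution_index and/or event.execution_id")
    if execution_index is None:
        # Only the event carries an id, so use it directly.
        return event_execution_id
    if not 0 <= execution_index < len(execution_ids):
        raise IndexError(f"execution_index {execution_index} is out of range")
    indexed_id = execution_ids[execution_index]
    if event_execution_id is not None and event_execution_id != indexed_id:
        # Both were given, so they must point at the same execution.
        raise ValueError("execution_index and event.execution_id do not match")
    return indexed_id


ids = [10, 20]
by_index = resolve_execution_id(ids, 1, None)
by_event = resolve_execution_id(ids, None, 10)
by_both = resolve_execution_id(ids, 0, 10)
```

The same shape of check applies to artifact_index and event.artifact_id.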

put_parent_contexts

Inserts parent contexts in the database.

The child_id and parent_id in every parent context must already exist.

Args
parent_contexts A list of parent contexts to insert.

Raises
errors.InvalidArgumentError if no context matches the child_id or no context matches the parent_id in any parent context.
errors.AlreadyExistsError if the same parent context already exists.