tfx.v1.types.standard_artifacts.Examples

Artifact that contains the training data.

Inherits From: Artifact

Training data should be brought into the TFX pipeline using components such as ExampleGen. Data in an Examples artifact is divided into splits, and each split is stored separately. When the default formats are not used, the file format and payload format must be specified as custom properties. See https://www.tensorflow.org/tfx/guide/examplegen#span_version_and_split for details on span, version, and splits.

  • Properties:

    • span: Integer that distinguishes groups of Examples.
    • version: Integer that identifies an updated revision of the data.
    • split_names: JSON string of the list of split names. For example, '["train", "test"]'. An empty string means the artifact has no splits.
  • File structure:

    • {uri}/
      • Split-{split_name1}/: Files for split
        • All files that are direct children of the split directory are recognized as data.
        • File format and payload format are determined by custom properties.
      • Split-{split_name2}/: Another split...
  • Commonly used custom properties of the Examples artifact:

    • file_format: a string that represents the file format. See tfx/components/util/tfxio_utils.py:make_tfxio for available values.
    • payload_format: int (enum) value of the data payload format. See tfx/proto/example_gen.proto:PayloadFormat for available formats.
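
The split_names property and the file structure described above can be illustrated with a minimal sketch. It uses only the Python standard library and does not call any TFX API; the helper names decode_split_names and split_dir, and the example URI, are assumptions made for illustration.

```python
import json
import posixpath
from typing import List


def decode_split_names(split_names: str) -> List[str]:
    """Turn the split_names property (a JSON string) into a list of names."""
    # An empty string means the artifact has no splits.
    if not split_names:
        return []
    return json.loads(split_names)


def split_dir(uri: str, split_name: str) -> str:
    """Return the directory for one split: {uri}/Split-{split_name}."""
    # Assumption: forward slashes are used, so posixpath is applied
    # regardless of the local operating system.
    return posixpath.join(uri, f"Split-{split_name}")


# Hypothetical artifact URI, used only for illustration.
uri = "/tmp/pipeline/examples/1"
for name in decode_split_names('["train", "test"]'):
    print(split_dir(uri, name))
```

Each printed path is the directory whose direct child files hold the data for that split. How those files are read still depends on the file_format and payload_format custom properties.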

Child Classes

class TYPE_ANNOTATION

PROPERTIES

{
 'span': PropertyType.INT,
 'split_names': PropertyType.STRING,
 'version': PropertyType.INT
}

TYPE_NAME 'Examples'