Creates a TFXIO instance that reads file_pattern.
tfx.components.util.tfxio_utils.make_tfxio(
file_pattern: tfx.components.util.tfxio_utils.OneOrMorePatterns,
telemetry_descriptors: List[Text],
payload_format: Union[Text, int],
data_view_uri: Optional[Text] = None,
schema: Optional[schema_pb2.Schema] = None,
read_as_raw_records: bool = False,
raw_record_column_name: Optional[Text] = None
) -> tfxio.TFXIO
Args |
file_pattern
|
The file pattern (one or more) for the TFXIO to read.
|
telemetry_descriptors
|
A set of descriptors that identify the component
that is instantiating the TFXIO. These will be used to construct the
namespace to contain metrics for profiling and are therefore expected to
be identifiers of the component itself and not individual instances of
source use.
|
payload_format
|
One of the enum values from example_gen_pb2.PayloadFormat (may
be given in string or int form). If None, defaults to FORMAT_TF_EXAMPLE.
|
data_view_uri
|
URI of a DataView artifact. A DataView is needed in order
to create a TFXIO for certain payload formats.
|
schema
|
TFMD schema. Note: although optional, some payload formats need a
schema in order for all TFXIO interfaces (e.g. TensorAdapter()) to work.
Unless you know what you are doing, always supply a schema.
|
read_as_raw_records
|
If True, ignore the payload type of the examples and always
use the RawTfRecord TFXIO.
|
raw_record_column_name
|
If provided, the Arrow RecordBatch produced by
the TFXIO will contain a string column of the given name, and the contents
of that column will be the raw records. Note that not all TFXIO
implementations support this option; an error is raised for those that do
not. Required if read_as_raw_records == True.
|
Returns |
A TFXIO instance.
|
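The argument rules above can be illustrated with a minimal sketch. The helper names `build_make_tfxio_kwargs` and `create_tfxio`, the example file pattern, and the descriptor `"MyComponent"` are hypothetical and not part of TFX. The only real call is `tfxio_utils.make_tfxio`, which is imported lazily so that the argument check can be exercised without TFX installed. The sketch assumes that `file_pattern` accepts either a single string or a list of strings, and it leaves out `data_view_uri` for brevity.

```python
from typing import Any, Dict, List, Optional, Union


def build_make_tfxio_kwargs(
    file_pattern: Union[str, List[str]],
    telemetry_descriptors: List[str],
    payload_format: Union[str, int] = "FORMAT_TF_EXAMPLE",
    schema: Optional[Any] = None,
    read_as_raw_records: bool = False,
    raw_record_column_name: Optional[str] = None,
) -> Dict[str, Any]:
    """Hypothetical helper: checks one documented constraint, then collects kwargs."""
    # The Args table states that raw_record_column_name is required
    # when read_as_raw_records is True, so fail early in that case.
    if read_as_raw_records and raw_record_column_name is None:
        raise ValueError(
            "raw_record_column_name is required when read_as_raw_records is True"
        )
    return {
        "file_pattern": file_pattern,
        "telemetry_descriptors": telemetry_descriptors,
        "payload_format": payload_format,
        "schema": schema,
        "read_as_raw_records": read_as_raw_records,
        "raw_record_column_name": raw_record_column_name,
    }


def create_tfxio(**kwargs: Any) -> Any:
    """Hypothetical wrapper that forwards the collected kwargs to make_tfxio."""
    # Imported here so the helper above stays usable without TFX installed.
    from tfx.components.util import tfxio_utils

    return tfxio_utils.make_tfxio(**kwargs)


# Collect arguments for reading tf.Example records; the path is illustrative only.
kwargs = build_make_tfxio_kwargs(
    file_pattern="/tmp/examples/*.tfrecord",
    telemetry_descriptors=["MyComponent"],
)
# With TFX installed and a schema supplied, the instance could then be created:
# tfxio_instance = create_tfxio(**kwargs)
```

As the schema entry advises, supplying a schema is the safer default; the sketch leaves it as None only to stay self-contained.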