Validates a batch of examples against the schema provided in options.
tfdv.validate_instance(
    instance: pa.RecordBatch,
    options: tfdv.StatsOptions,
    environment: Optional[str] = None
) -> anomalies_pb2.Anomalies
If environment is specified, the schema is first filtered using that environment, and instance is then validated against the filtered schema.
| Args | |
| --- | --- |
| instance | A batch of examples in the form of an Arrow RecordBatch. |
| options | tfdv.StatsOptions for generating data statistics. This must contain a schema. |
| environment | An optional string denoting the validation environment. Must be one of the default environments specified in the schema. In some cases, slight schema variations are necessary; for instance, features used as labels are required during training (and should be validated) but are missing during serving. Environments can be used to express such requirements. For example, assume a feature named 'LABEL' is required for training but is expected to be missing from serving. This can be expressed by defining two distinct environments in the schema, ["SERVING", "TRAINING"], and associating 'LABEL' only with environment "TRAINING". |
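
The 'LABEL' example above can be illustrated with a small, library-free sketch of environment filtering. This is a simplified model and not TFDV's implementation: the dictionary layout, the helper names `filter_features` and `missing_features`, the feature name 'age', and the choice to raise ValueError for an unknown environment are all assumptions made for illustration.

```python
from typing import Dict, List, Optional, Sequence, Set

# Hypothetical, simplified stand-in for a schema (not a TFDV type).
# None means "expected in every default environment".
DEFAULT_ENVIRONMENTS = ("SERVING", "TRAINING")
FEATURE_ENVIRONMENTS: Dict[str, Optional[Set[str]]] = {
    "age": None,              # illustrative feature, present everywhere
    "LABEL": {"TRAINING"},    # associated only with "TRAINING"
}


def filter_features(
    feature_envs: Dict[str, Optional[Set[str]]],
    default_envs: Sequence[str],
    environment: Optional[str] = None,
) -> List[str]:
    """Return the feature names expected in the given environment."""
    if environment is None:
        # No environment given: nothing is filtered out.
        return list(feature_envs)
    if environment not in default_envs:
        # Assumption of this sketch: reject unknown environments outright.
        raise ValueError(f"Unknown environment: {environment!r}")
    return [
        name
        for name, envs in feature_envs.items()
        if envs is None or environment in envs
    ]


def missing_features(
    instance_columns: Sequence[str],
    environment: Optional[str] = None,
) -> List[str]:
    """Return expected features that the instance does not contain."""
    expected = filter_features(
        FEATURE_ENVIRONMENTS, DEFAULT_ENVIRONMENTS, environment
    )
    return [name for name in expected if name not in instance_columns]
```

In this model, a serving batch that contains only 'age' has no missing features under "SERVING", while the same batch is missing 'LABEL' under "TRAINING".
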
| Returns |
| --- |
| An Anomalies protocol buffer. |
| Raises | |
| --- | --- |
| ValueError | If options is not a StatsOptions object. |
| ValueError | If options does not contain a schema. |
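
Because both failure modes listed above surface as ValueError, a caller may want to separate configuration mistakes from actual validation results. The following is a minimal sketch of one way to do that; the wrapper name `safe_validate` and its return convention are hypothetical, and the validation function is passed in as a parameter so the sketch does not depend on any installed library.

```python
from typing import Any, Callable, Optional, Tuple


def safe_validate(
    validate_fn: Callable[..., Any],
    instance: Any,
    options: Any,
    environment: Optional[str] = None,
) -> Tuple[Optional[Any], Optional[str]]:
    """Call validate_fn and convert a ValueError into an error message.

    Returns (anomalies, None) on success, or (None, message) when
    validate_fn raises ValueError, e.g. because options has the wrong
    type or carries no schema.
    """
    try:
        anomalies = validate_fn(instance, options, environment=environment)
    except ValueError as err:
        return None, str(err)
    return anomalies, None
```

Assuming TFDV is installed, this could be called as `safe_validate(tfdv.validate_instance, batch, options, environment="SERVING")`, where `batch` and `options` are the caller's own RecordBatch and StatsOptions objects.
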