Compute data statistics for the input pandas DataFrame.
tfdv.generate_statistics_from_dataframe(
    dataframe: DataFrame,
    stats_options: tfdv.StatsOptions = options.StatsOptions(),
    n_jobs: int = 1
) -> statistics_pb2.DatasetFeatureStatisticsList
This is a utility function for users with in-memory data represented as a pandas DataFrame.
This function supports only DataFrames with columns of primitive string or numeric types. DataFrames with multivalent features or holding non-string object types are not supported.
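As a rough illustration of the constraint above, the following sketch flags columns that hold something other than strings or numbers before the DataFrame is passed in. The helper `find_unsupported_columns` is hypothetical and not part of TFDV. It only approximates the documented restriction and may be stricter or looser than the library's own validation.

```python
import numbers

import pandas as pd


def find_unsupported_columns(df: pd.DataFrame) -> list:
    """Return names of columns whose non-null values are not all str or numeric.

    Hypothetical pre-check, not a TFDV API: it approximates the documented
    "primitive string or numeric types" restriction.
    """
    unsupported = []
    for name, column in df.items():
        # Numeric dtypes (int, float, bool) contain only numeric scalars.
        if pd.api.types.is_numeric_dtype(column):
            continue
        # For other dtypes, inspect the non-null values one by one.
        values = column.dropna()
        if not all(isinstance(v, (str, numbers.Number)) for v in values):
            unsupported.append(name)
    return unsupported


df = pd.DataFrame({
    "age": [34, 51],
    "city": ["Lyon", "Oslo"],
    "tags": [["a"], ["b", "c"]],  # multivalent: each cell holds a list
})
print(find_unsupported_columns(df))  # ['tags']
```

Columns reported by such a check would need to be dropped or converted to a single string or numeric value per row before computing statistics.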
| Args | |
|---|---|
| dataframe | Input pandas DataFrame. |
| stats_options | tfdv.StatsOptions for generating data statistics. |
| n_jobs | Number of processes to run (defaults to 1). If -1 is provided, uses the same number of processes as the number of CPU cores. |
| Returns | |
|---|---|
| A DatasetFeatureStatisticsList proto. | |
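A minimal usage sketch follows, assuming `tensorflow_data_validation` is installed and imported as `tfdv`. The helper names `example_frame` and `dataframe_statistics` and the sample data are made up for illustration. Only `tfdv.generate_statistics_from_dataframe` and `tfdv.StatsOptions` come from the signature documented above.

```python
import pandas as pd


def example_frame() -> pd.DataFrame:
    """Build a small DataFrame with one numeric and one string column."""
    return pd.DataFrame({
        "age": [34, 51, 29, 42],
        "city": ["Lyon", "Oslo", "Lima", "Oslo"],
    })


def dataframe_statistics(df: pd.DataFrame, n_jobs: int = 1):
    """Compute statistics for df with default StatsOptions.

    TFDV is imported lazily so this sketch can be loaded without it;
    calling the function still requires the package to be installed.
    """
    import tensorflow_data_validation as tfdv

    return tfdv.generate_statistics_from_dataframe(
        dataframe=df,
        stats_options=tfdv.StatsOptions(),
        n_jobs=n_jobs,  # -1 requests one process per CPU core
    )
```

With TFDV available, `dataframe_statistics(example_frame(), n_jobs=-1)` should return the `DatasetFeatureStatisticsList` proto described above, computed with as many processes as there are CPU cores.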