Returns a bucketized column, with a bucket index assigned to each input.
tft.bucketize(
x: common_types.ConsistentTensorType,
num_buckets: int,
epsilon: Optional[float] = None,
weights: Optional[tf.Tensor] = None,
elementwise: bool = False,
name: Optional[str] = None
) -> common_types.ConsistentTensorType
Args |
x
|
A numeric input Tensor, SparseTensor, or RaggedTensor whose values
should be mapped to buckets. For a CompositeTensor, only non-missing
values will be included in the quantiles computation, and the result of
bucketize will be a CompositeTensor with non-missing values mapped to
buckets. If elementwise=True, then x must be dense.
|
num_buckets
|
The requested number of buckets; must be an int greater than 1. Values
in the input x are divided into approximately equal-sized buckets.
|
epsilon
|
(Optional) Error tolerance, typically a small fraction close to
zero. If the caller does not specify a value, a suitable value is
computed based on experimental results. For num_buckets less than 100,
a value of 0.01 is chosen, which handles a dataset of up to ~1 trillion
input data values. If num_buckets is larger, epsilon is set to
(1/num_buckets) to enforce a stricter error tolerance: more buckets
mean a smaller range for each bucket, so the boundaries should be less
fuzzy. See analyzers.quantiles() for details.
|
weights
|
(Optional) Weights tensor for the quantiles. The tensor must have the
same shape as x.
|
elementwise
|
(Optional) If True, bucketize each element of the tensor
independently.
|
name
|
(Optional) A name for this operation.
|
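The default `epsilon` rule described above can be sketched in plain Python. This is only an illustration of the documented behavior, not the library's code; the helper name `default_epsilon` is hypothetical.

```python
def default_epsilon(num_buckets: int) -> float:
    """Hypothetical helper mirroring the documented default for epsilon.

    Below 100 buckets the tolerance is 0.01; above that it tightens to
    1 / num_buckets. Taking the minimum of the two expresses both cases.
    """
    return min(1.0 / num_buckets, 0.01)


small = default_epsilon(10)    # fewer than 100 buckets: 0.01
large = default_epsilon(1000)  # more buckets: 1 / 1000
```

With 1000 buckets the tolerance is ten times stricter than the 0.01 used for small bucket counts, which matches the stated goal of less fuzzy boundaries when each bucket covers a smaller range.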
Returns |
A Tensor of the same shape as x, with each element in the
returned tensor representing the bucketized value. Each bucketized value
is in the range [0, actual_num_buckets). The actual number of buckets
can differ from the num_buckets hint, for example when the number of
distinct values is smaller than num_buckets, or when the
input values are not uniformly distributed.
NaN values are mapped to the last bucket. Values with NaN weights are
ignored when calculating bucket boundaries.
|
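To make the returned indices concrete, here is a minimal pure-Python sketch of mapping values to bucket indices once boundaries are known. The boundaries are made up for illustration, the helper `assign_bucket` is hypothetical, and treating a value equal to a boundary as belonging to the higher bucket is an assumption borrowed from the usual TensorFlow bucketization convention rather than something stated on this page.

```python
import bisect
import math


def assign_bucket(value: float, boundaries: list) -> int:
    """Hypothetical helper: bucket index in [0, len(boundaries)].

    boundaries must be sorted ascending. NaN goes to the last bucket,
    as documented for bucketize.
    """
    if math.isnan(value):
        return len(boundaries)
    # Count the boundaries that are <= value (assumed convention).
    return bisect.bisect_right(boundaries, value)


# Two illustrative boundaries define three buckets: indices 0, 1, 2.
boundaries = [10.0, 20.0]
values = [3.0, 10.0, 15.0, 25.0, float("nan")]
indices = [assign_bucket(v, boundaries) for v in values]
```

In `tft.bucketize` itself the boundaries come from an approximate quantiles computation over the whole dataset, so the number of boundaries, and therefore `actual_num_buckets`, is not guaranteed to match the requested `num_buckets`.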
Raises |
TypeError
|
If num_buckets is not an int.
|
ValueError
|
If num_buckets is not greater than 1.
|
ValueError
|
If elementwise=True and x is a CompositeTensor.
|
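The error conditions listed above can be restated as a small validation sketch. It is not the library's implementation: the function `check_bucketize_args` and the `x_is_composite` flag are hypothetical stand-ins for the checks that the documentation describes.

```python
def check_bucketize_args(num_buckets, elementwise=False, x_is_composite=False):
    """Hypothetical pre-check mirroring the documented Raises conditions."""
    if not isinstance(num_buckets, int):
        raise TypeError("num_buckets must be an int")
    if num_buckets <= 1:
        raise ValueError("num_buckets must be greater than 1")
    if elementwise and x_is_composite:
        raise ValueError("elementwise=True requires a dense x")


# Try one call per documented error, plus one valid combination.
outcomes = []
for args in [("8", False, False), (1, False, False), (4, True, True), (4, False, True)]:
    try:
        check_bucketize_args(*args)
        outcomes.append("ok")
    except (TypeError, ValueError) as err:
        outcomes.append(type(err).__name__)
```

The last case passes because a CompositeTensor input is only rejected when combined with `elementwise=True`.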