tft.apply_buckets_with_interpolation
Interpolates within the provided buckets and then normalizes to 0 to 1.
    tft.apply_buckets_with_interpolation(
        x: common_types.ConsistentTensorType,
        bucket_boundaries: common_types.BucketBoundariesType,
        name: Optional[str] = None
    ) -> common_types.ConsistentTensorType
A method for normalizing continuous numeric data to the range [0, 1].
Numeric values are first bucketized according to the provided boundaries, then
linearly interpolated within their respective bucket ranges. Finally, the
interpolated values are normalized to the range [0, 1]. Values that are
less than or equal to the lowest boundary, or greater than or equal to the
highest boundary, will be mapped to 0 and 1 respectively. NaN values will be
mapped to the middle of the range (0.5).
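The mapping described above can be sketched in plain Python. This is an illustrative approximation of the documented behavior, not the library's implementation. The helper name `interpolate_buckets` and the flat list of boundaries are assumptions made for this example (the real function operates on tensors and expects rank-2 boundaries), and the sketch assumes each bucket receives an equal share of the output range.

```python
import bisect
import math


def interpolate_buckets(x, boundaries):
    """Hypothetical helper: map a scalar x to [0, 1] using sorted boundaries."""
    if math.isnan(x):
        return 0.5  # NaN goes to the middle of the range
    if x <= boundaries[0]:
        return 0.0  # at or below the lowest boundary
    if x >= boundaries[-1]:
        return 1.0  # at or above the highest boundary
    # Index k of the bucket [boundaries[k], boundaries[k + 1]) containing x.
    k = bisect.bisect_right(boundaries, x) - 1
    lo, hi = boundaries[k], boundaries[k + 1]
    # Linear position of x inside its bucket, in [0, 1).
    frac = (x - lo) / (hi - lo)
    # Assumption: every bucket covers an equal slice of the output range.
    return (k + frac) / (len(boundaries) - 1)


boundaries = [0.0, 10.0, 20.0, 100.0]
print(interpolate_buckets(15.0, boundaries))          # 0.5
print(interpolate_buckets(-5.0, boundaries))          # 0.0
print(interpolate_buckets(500.0, boundaries))         # 1.0
print(interpolate_buckets(float("nan"), boundaries))  # 0.5
```

With three buckets, the value 15.0 sits halfway through the second bucket, so it lands at (1 + 0.5) / 3 = 0.5 in the output range.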
This is a non-linear approach to normalization that is less sensitive to
outliers than min-max or z-score scaling. When outliers are present, standard
forms of normalization can leave the majority of the data compressed into a
very small segment of the output range, whereas this approach tends to spread
out the more frequent values (if quantile buckets are used). Note that
distance relationships in the raw data are not necessarily preserved (data
points that are close to each other in the raw feature space may not be equally
close in the transformed feature space). This means that, unlike linear
normalization methods, correlations between features may be distorted by the
transformation. This scaling method may help with stability and reduce
exploding gradients in neural networks.
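The effect of an outlier can be illustrated with a small, self-contained comparison against min-max scaling. This is a sketch under stated assumptions, not the library's code: `bucket_scale` is a hypothetical stand-in for the transform (without NaN handling), the data set is made up, and the boundaries are derived from Python's `statistics.quantiles` rather than from a TensorFlow Transform analyzer.

```python
import bisect
import statistics


def bucket_scale(x, boundaries):
    """Hypothetical stand-in: piecewise-linear scaling to [0, 1] (no NaN handling)."""
    if x <= boundaries[0]:
        return 0.0
    if x >= boundaries[-1]:
        return 1.0
    k = bisect.bisect_right(boundaries, x) - 1
    lo, hi = boundaries[k], boundaries[k + 1]
    return (k + (x - lo) / (hi - lo)) / (len(boundaries) - 1)


# Made-up data: nine ordinary values plus one large outlier.
data = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 1000.0]

# Min-max scaling: the outlier stretches the denominator.
lo, hi = min(data), max(data)
min_max = [(v - lo) / (hi - lo) for v in data]

# Quantile-style boundaries: min, three quartile cut points, max.
boundaries = [lo] + statistics.quantiles(data, n=4) + [hi]
bucketed = [bucket_scale(v, boundaries) for v in data]

for v, m, b in zip(data, min_max, bucketed):
    print(f"{v:7.1f}  min-max={m:.4f}  bucketed={b:.4f}")
```

In this sketch, min-max scaling squeezes the nine ordinary values into less than 1% of the output range, while the bucketed scaling spreads them over roughly three quarters of it. The ordering of values is preserved, but the relative distances between them are not.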
| Args | |
|---|---|
| x | A numeric input Tensor, SparseTensor, or RaggedTensor (tf.float32, tf.float64, tf.int32, or tf.int64). |
| bucket_boundaries | Sorted bucket boundaries as a rank-2 Tensor or list. |
| name | (Optional) A name for this operation. |
| Returns |
|---|
| A Tensor, SparseTensor, or RaggedTensor of the same shape as x, normalized to the range [0, 1]. If the input x is tf.float64, the returned values will be tf.float64. Otherwise, returned values are tf.float32. |
Except as otherwise noted, the content of this page is licensed under the Creative Commons Attribution 4.0 License, and code samples are licensed under the Apache 2.0 License. For details, see the Google Developers Site Policies. Java is a registered trademark of Oracle and/or its affiliates.
Last updated 2024-11-01 UTC.
View source on GitHub: https://github.com/tensorflow/transform/blob/v1.16.0/tensorflow_transform/mappers.py#L1984-L2098