tf.feature_column.bucketized_column

TensorFlow 1 version View source on GitHub

Represents discretized dense input bucketed by boundaries.

Buckets include the left boundary, and exclude the right boundary. Namely, boundaries=[0., 1., 2.] generates buckets (-inf, 0.), [0., 1.), [1., 2.), and [2., +inf).

For example, if the inputs are

boundaries = [0, 10, 100]
input tensor = [[-5, 10000]
                [150,   10]
                [5,    100]]

then the output will be

output = [[0, 3]
          [3, 2]
          [1, 3]]

Example:

price = tf.feature_column.numeric_column('price')
bucketized_price = tf.feature_column.bucketized_column(
    price, boundaries=[...])
columns = [bucketized_price, ...]
features = tf.io.parse_example(
    ..., features=tf.feature_column.make_parse_example_spec(columns))
dense_tensor = tf.keras.layers.DenseFeatures(columns)(features)

bucketized_column can also be crossed with another categorical column using crossed_column:

price = tf.feature_column.numeric_column('price')
# bucketized_column converts numerical feature to a categorical one.
bucketized_price = tf.feature_column.bucketized_column(
    price, boundaries=[...])
# 'keywords' is a string feature.
price_x_keywords = tf.feature_column.crossed_column(
    [bucketized_price, 'keywords'], 50K)
columns = [price_x_keywords, ...]
features = tf.io.parse_example(
    ..., features=tf.feature_column.make_parse_example_spec(columns))
dense_tensor = tf.keras.layers.DenseFeatures(columns)(features)
linear_model = tf.keras.experimental.LinearModel(units=...)(dense_tensor)

source_column A one-dimensional dense column which is generated with numeric_column.
boundaries A sorted list or tuple of floats specifying the boundaries.

A BucketizedColumn.

ValueError If source_column is not a numeric column, or if it is not one-dimensional.
ValueError If boundaries is not a sorted list or tuple.