ML Community Day is November 9! Join us for updates from TensorFlow, JAX, and more Learn more

tf.keras.layers.CategoryEncoding

Category encoding layer.

Inherits From: Layer, Module

Used in the notebooks

Used in the guide Used in the tutorials

This layer provides options for condensing data into a categorical encoding when the total number of tokens are known in advance. It accepts integer values as inputs, and it outputs a dense representation of those inputs. For integer inputs where the total number of tokens is not known, use instead tf.keras.layers.IntegerLookup.

Examples:

One-hot encoding data

layer = tf.keras.layers.CategoryEncoding(
          num_tokens=4, output_mode="one_hot")
layer([3, 2, 0, 1])
<tf.Tensor: shape=(4, 4), dtype=float32, numpy=
  array([[0., 0., 0., 1.],
         [0., 0., 1., 0.],
         [1., 0., 0., 0.],
         [0., 1., 0., 0.]], dtype=float32)>

Multi-hot encoding data

layer = tf.keras.layers.CategoryEncoding(
          num_tokens=4, output_mode="multi_hot")
layer([[0, 1], [0, 0], [1, 2], [3, 1]])
<tf.Tensor: shape=(4, 4), dtype=float32, numpy=
  array([[1., 1., 0., 0.],
         [1., 0., 0., 0.],
         [0., 1., 1., 0.],
         [0., 1., 0., 1.]], dtype=float32)>

Using weighted inputs in "count" mode

layer = tf.keras.layers.CategoryEncoding(
          num_tokens=4, output_mode="count")
count_weights = np.array([[.1, .2], [.1, .1], [.2, .3], [.4, .2]])
layer([[0, 1], [0, 0], [1, 2], [3, 1]], count_weights=count_weights)
<tf.Tensor: shape=(4, 4), dtype=float64, numpy=
  array([[0.1, 0.2, 0. , 0. ],
         [0.2, 0. , 0. , 0. ],
         [0. , 0.2, 0.3, 0. ],
         [0. , 0.2, 0. , 0.4]])>

num_tokens The total number of tokens the layer should support. All inputs to the layer must integers in the range 0 <= value < num_tokens, or an error will be thrown.
output_mode Specification for the output of the layer. Defaults to "multi_hot". Values can be "one_hot", "multi_hot" or