Category encoding layer.
tf.keras.layers.CategoryEncoding(
    num_tokens=None, output_mode='multi_hot', sparse=False, **kwargs
)
This layer provides options for condensing data into a categorical encoding
when the total number of tokens is known in advance. It accepts integer
values as inputs and, by default, outputs a dense representation of those
inputs. For integer inputs where the total number of tokens is not known,
use tf.keras.layers.IntegerLookup instead.
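To illustrate why the token count matters, here is a minimal pure-Python sketch of mapping arbitrary integer IDs onto the contiguous range 0 <= value < num_tokens that this layer expects. The helper `build_index` and the sample IDs are hypothetical and are not part of TensorFlow; tf.keras.layers.IntegerLookup is the supported way to do this and offers more than this sketch covers.

```python
def build_index(raw_values):
    """Map each distinct raw integer to a contiguous index (hypothetical helper).

    Indices are assigned in order of first appearance, starting at 0.
    """
    index = {}
    for value in raw_values:
        if value not in index:
            index[value] = len(index)
    return index


# Raw IDs with no known upper bound (made-up illustration data).
raw_ids = [1200, 7, 1200, 98765]

index = build_index(raw_ids)
encoded_ids = [index[value] for value in raw_ids]

# Once the mapping exists, the total number of tokens is known.
num_tokens = len(index)
```

After this step every value in `encoded_ids` lies in the range 0 <= value < num_tokens, which is the precondition the layer places on its inputs.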
Examples:
One-hot encoding data
layer = tf.keras.layers.CategoryEncoding(
    num_tokens=4, output_mode="one_hot")
layer([3, 2, 0, 1])
<tf.Tensor: shape=(4, 4), dtype=float32, numpy=
array([[0., 0., 0., 1.],
       [0., 0., 1., 0.],
       [1., 0., 0., 0.],
       [0., 1., 0., 0.]], dtype=float32)>
Multi-hot encoding data
layer = tf.keras.layers.CategoryEncoding(
    num_tokens=4, output_mode="multi_hot")
layer([[0, 1], [0, 0], [1, 2], [3, 1]])
<tf.Tensor: shape=(4, 4), dtype=float32, numpy=
array([[1., 1., 0., 0.],
       [1., 0., 0., 0.],
       [0., 1., 1., 0.],
       [0., 1., 0., 1.]], dtype=float32)>
Using weighted inputs in "count" mode
layer = tf.keras.layers.CategoryEncoding(
    num_tokens=4, output_mode="count")
count_weights = np.array([[.1, .2], [.1, .1], [.2, .3], [.4, .2]])
layer([[0, 1], [0, 0], [1, 2], [3, 1]], count_weights=count_weights)
<tf.Tensor: shape=(4, 4), dtype=float64, numpy=
array([[0.1, 0.2, 0. , 0. ],
       [0.2, 0. , 0. , 0. ],
       [0. , 0.2, 0.3, 0. ],
       [0. , 0.2, 0. , 0.4]])>
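The arithmetic behind the three examples above can be sketched in plain Python. This is an illustrative reference only, not the layer's actual implementation; the function names `one_hot` and `encode_sample` are hypothetical, and the sketch works on nested lists rather than tensors.

```python
def one_hot(values, num_tokens):
    """Encode each integer in `values` as its own row containing a single 1."""
    return [[1.0 if i == v else 0.0 for i in range(num_tokens)] for v in values]


def encode_sample(sample, num_tokens, output_mode="multi_hot", weights=None):
    """Encode one sample (a list of integer tokens) as a row of length num_tokens."""
    if output_mode not in ("multi_hot", "count"):
        raise ValueError(f"unsupported output_mode: {output_mode!r}")
    if weights is None:
        weights = [1.0] * len(sample)  # unweighted: each occurrence counts as 1
    row = [0.0] * num_tokens
    for token, weight in zip(sample, weights):
        if output_mode == "count":
            row[token] += weight  # accumulate (weighted) occurrences
        else:
            row[token] = 1.0  # mark presence only; weights are ignored
    return row


samples = [[0, 1], [0, 0], [1, 2], [3, 1]]
weights = [[0.1, 0.2], [0.1, 0.1], [0.2, 0.3], [0.4, 0.2]]

one_hot_rows = one_hot([3, 2, 0, 1], num_tokens=4)
multi_hot_rows = [encode_sample(s, 4) for s in samples]
count_rows = [encode_sample(s, 4, "count", w) for s, w in zip(samples, weights)]
```

Note how the second sample, [0, 0] with weights [0.1, 0.1], sums to 0.2 at index 0 in "count" mode, while "multi_hot" records only that token 0 is present.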
Args | |
---|---|
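num_tokens | The total number of tokens the layer should support. All inputs to the layer must be integers in the range 0 <= value < num_tokens, or an error will be thrown. |
output_mode | Specification for the output of the layer. Defaults to "multi_hot". Values can be "one_hot", "multi_hot" or "count", configuring the layer as follows: "one_hot" encodes each individual input element as an array of size num_tokens with a 1 at the element's index; "multi_hot" encodes each sample as a single array of size num_tokens with a 1 for each token present in the sample; "count" is like "multi_hot", but each entry holds the number of times that token appears in the sample. |
sparse | Boolean. If true, returns a SparseTensor instead of a dense Tensor. Defaults to False. |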
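Two of the arguments above can be made concrete with a short pure-Python sketch: the range constraint tied to num_tokens, and what a sparse result stores compared with a dense one. Both helpers (`check_tokens`, `to_sparse`) are hypothetical illustrations. The choice of ValueError is an assumption of this sketch, not the error type TensorFlow raises, and the returned tuple only mirrors the indices, values and dense shape that a SparseTensor holds.

```python
def check_tokens(values, num_tokens):
    """Raise if any value falls outside 0 <= value < num_tokens (illustrative)."""
    for value in values:
        if not 0 <= value < num_tokens:
            raise ValueError(f"token {value} is outside [0, {num_tokens})")


def to_sparse(dense_rows):
    """Return (indices, values, dense_shape) for the nonzero entries of a 2D list."""
    indices, values = [], []
    for r, row in enumerate(dense_rows):
        for c, entry in enumerate(row):
            if entry != 0:
                indices.append((r, c))
                values.append(entry)
    num_cols = len(dense_rows[0]) if dense_rows else 0
    return indices, values, (len(dense_rows), num_cols)


# A dense multi-hot result with num_tokens=4 and its sparse counterpart.
dense = [[1.0, 1.0, 0.0, 0.0],
         [1.0, 0.0, 0.0, 0.0]]
indices, values, dense_shape = to_sparse(dense)

# Token 4 is out of range when num_tokens=4, so validation fails.
try:
    check_tokens([0, 4], num_tokens=4)
    out_of_range_rejected = False
except ValueError:
    out_of_range_rejected = True
```

A sparse result tends to pay off when num_tokens is large and each sample contains only a few tokens, since only the nonzero positions are stored.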
Call arguments:
inputs: A 1D or 2D tensor of integer inputs.
count_weights: A tensor in the same shape as inputs indicating the weight for each sample value when summing up in "count" mode. Not used in "multi_hot" or "one_hot" modes.