Buckets data into discrete ranges.
Inherits From: PreprocessingLayer, Layer, Module
tf.keras.layers.Discretization(
    bin_boundaries=None, num_bins=None, epsilon=0.01, **kwargs
)
This layer will place each element of its input data into one of several contiguous ranges and output an integer index indicating which range each element was placed in.
Input shape:
Any tf.Tensor or tf.RaggedTensor of dimension 2 or higher.
Output shape:
Same as input shape.
Examples:
Bucketize float values based on provided buckets.
>>> input = np.array([[-1.5, 1.0, 3.4, .5], [0.0, 3.0, 1.3, 0.0]])
>>> layer = tf.keras.layers.Discretization(bin_boundaries=[0., 1., 2.])
>>> layer(input)
<tf.Tensor: shape=(2, 4), dtype=int64, numpy=
array([[0, 2, 3, 1],
       [1, 3, 2, 1]], dtype=int64)>
Bucketize float values based on a number of buckets to compute.
>>> input = np.array([[-1.5, 1.0, 3.4, .5], [0.0, 3.0, 1.3, 0.0]])
>>> layer = tf.keras.layers.Discretization(num_bins=4, epsilon=0.01)
>>> layer.adapt(input)
>>> layer(input)
<tf.Tensor: shape=(2, 4), dtype=int64, numpy=
array([[0, 2, 3, 2],
       [1, 3, 3, 1]], dtype=int64)>
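The layer also accepts `tf.RaggedTensor` inputs (see Input shape above). A minimal sketch of bucketizing a ragged batch, using the same boundaries as the first example (the exact repr can vary across TF versions):
>>> ragged = tf.ragged.constant([[-0.5, 2.5], [1.5]])
>>> layer = tf.keras.layers.Discretization(bin_boundaries=[0., 1., 2.])
>>> layer(ragged)
<tf.RaggedTensor [[0, 3], [2]]>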
| Attributes | |
|---|---|
| bin_boundaries | A list of bin boundaries. The leftmost and rightmost bins will always extend to `-inf` and `inf`, so `bin_boundaries=[0., 1., 2.]` generates bins `(-inf, 0.)`, `[0., 1.)`, `[1., 2.)`, and `[2., +inf)`. If this option is set, `adapt` should not be called. |
| num_bins | The integer number of bins to compute. If this option is set, `adapt` should be called to learn the bin boundaries. |
| epsilon | Error tolerance, typically a small fraction close to zero (e.g. 0.01). Higher values of epsilon increase the error of the quantile approximation, and hence result in more unequal buckets, but could improve performance and resource consumption. |
| is_adapted | Whether the layer has been fit to data already. |
Methods
adapt
adapt(
    data, batch_size=None, steps=None
)
Fits the state of the preprocessing layer to the data being passed.
After calling adapt on a layer, a preprocessing layer's state will not
update during training. In order to make preprocessing layers efficient in
any distribution context, they are kept constant with respect to any
compiled tf.Graphs that call the layer. This does not affect layer usage
when adapting each layer only once, but if you adapt a layer multiple times
you will need to take care to re-compile any compiled functions as follows:
- If you are adding a preprocessing layer to a `keras.Model`, you need to call `model.compile` after each subsequent call to `adapt`.
- If you are calling a preprocessing layer inside `tf.data.Dataset.map`, you should call `map` again on the input `tf.data.Dataset` after each `adapt`.
- If you are using a `tf.function` directly which calls a preprocessing layer, you need to call `tf.function` again on your callable after each subsequent call to `adapt` (see the sketch after the examples below).
`tf.keras.Model` example with multiple adapts:
>>> layer = tf.keras.layers.experimental.preprocessing.Normalization(
...     axis=None)
>>> layer.adapt([0, 2])
>>> model = tf.keras.Sequential(layer)
>>> model.predict([0, 1, 2])
array([-1.,  0.,  1.], dtype=float32)
>>> layer.adapt([-1, 1])
>>> model.compile()  # This is needed to re-compile model.predict!
>>> model.predict([0, 1, 2])
array([0., 1., 2.], dtype=float32)
`tf.data.Dataset` example with multiple adapts:
>>> layer = tf.keras.layers.experimental.preprocessing.Normalization(
...     axis=None)
>>> layer.adapt([0, 2])
>>> input_ds = tf.data.Dataset.range(3)
>>> normalized_ds = input_ds.map(layer)
>>> list(normalized_ds.as_numpy_iterator())
[array([-1.], dtype=float32),
 array([0.], dtype=float32),
 array([1.], dtype=float32)]
>>> layer.adapt([-1, 1])
>>> normalized_ds = input_ds.map(layer)  # Re-map over the input dataset.
>>> list(normalized_ds.as_numpy_iterator())
[array([0.], dtype=float32),
 array([1.], dtype=float32),
 array([2.], dtype=float32)]
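The third case above, a `tf.function` wrapping the layer directly, follows the same pattern. This is a minimal sketch rather than an official example; the printed values assume the same adapted statistics as in the examples above:
>>> layer = tf.keras.layers.experimental.preprocessing.Normalization(
...     axis=None)
>>> layer.adapt([0, 2])
>>> fn = tf.function(layer)
>>> fn(tf.constant([0., 1., 2.]))  # normalized with mean=1, variance=1
<tf.Tensor: shape=(3,), dtype=float32, numpy=array([-1.,  0.,  1.], dtype=float32)>
>>> layer.adapt([-1, 1])
>>> fn = tf.function(layer)  # Re-wrap the callable after each adapt.
>>> fn(tf.constant([0., 1., 2.]))  # now normalized with mean=0, variance=1
<tf.Tensor: shape=(3,), dtype=float32, numpy=array([0., 1., 2.], dtype=float32)>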
| Arguments | |
|---|---|
| data | The data to train on. It can be passed either as a `tf.data.Dataset` or as a numpy array. |
| batch_size | Integer or `None`. Number of samples per state update. If unspecified, `batch_size` will default to 32. Do not specify the `batch_size` if your data is in the form of datasets, generators, or `keras.utils.Sequence` instances (since they generate batches). |
| steps | Integer or `None`. Total number of steps (batches of samples). When training with input tensors such as TensorFlow data tensors, the default `None` is equal to the number of samples in your dataset divided by the batch size, or 1 if that cannot be determined. If `data` is a `tf.data.Dataset` and `steps` is `None`, adapt will run until the input dataset is exhausted. When passing an infinitely repeating dataset, you must specify the `steps` argument. This argument is not supported with array inputs. |
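A minimal sketch of these arguments, assuming a synthetic batched `tf.data.Dataset` named `ds` (not part of the API above). `batch_size` is left unset because the dataset already generates batches, and `steps` caps how many batches are consumed:
>>> ds = tf.data.Dataset.from_tensor_slices(
...     np.random.uniform(size=(256, 4)).astype("float32")).batch(32)
>>> layer = tf.keras.layers.Discretization(num_bins=4)
>>> layer.adapt(ds, steps=4)  # fit bin boundaries on the first 4 batches only
>>> layer.is_adapted
True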
compile
compile(
    run_eagerly=None, steps_per_execution=None
)
Configures the layer for adapt.
| Arguments | |
|---|---|
| run_eagerly | Bool. Defaults to `False`. If `True`, this `Model`'s logic will not be wrapped in a `tf.function`. Recommended to leave this as `None` unless your `Model` cannot be run inside a `tf.function`. |
| steps_per_execution | Int. Defaults to 1. The number of batches to run during each `tf.function` call. Running multiple batches inside a single `tf.function` call can greatly improve performance on TPUs or small models with a large Python overhead. |
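As a hedged sketch, `steps_per_execution` can be raised before calling `adapt` to batch more work into each `tf.function` call (reusing the synthetic `ds` from the previous example):
>>> layer = tf.keras.layers.Discretization(num_bins=4)
>>> layer.compile(steps_per_execution=8)  # run 8 batches per tf.function call
>>> layer.adapt(ds)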
reset_state
reset_state()
Resets the statistics of the preprocessing layer.
update_state
update_state(
    data
)
Accumulates statistics for the preprocessing layer.
| Arguments | |
|---|---|
| data | A mini-batch of inputs to the layer. |
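As a sketch of the state lifecycle (`first_batch` and `second_batch` are hypothetical numpy arrays): `adapt` accumulates statistics by calling `update_state` on each mini-batch, while `reset_state` discards them so the layer can be refit from scratch:
>>> layer = tf.keras.layers.Discretization(num_bins=4)
>>> layer.adapt(first_batch)   # accumulates statistics via update_state
>>> layer.reset_state()        # discard the accumulated statistics
>>> layer.adapt(second_batch)  # refit from scratch on new data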