tf.keras.layers.Layer

This is the class from which all layers inherit.

Inherits From: Module

Used in the notebooks

Used in the guide Used in the tutorials

A layer is a callable object that takes as input one or more tensors and that outputs one or more tensors. It involves computation, defined in the call() method, and a state (weight variables), defined either in the constructor __init__() or in the build() method.

Users will just instantiate a layer and then treat it as a callable.

trainable Boolean, whether the layer's variables should be trainable.
name String name of the layer.
dtype The dtype of the layer's computations and weights. Can also be a tf.keras.mixed_precision.Policy, which allows the computation and weight dtype to differ. Default of None means to use tf.keras.mixed_precision.global_policy(), which is a float32 policy unless set to different value.
dynamic Set this to True if your layer should only be run eagerly, and should not be used to generate a static computation graph. This would be the case for a Tree-RNN or a recursive network, for example, or generally for any layer that manipulates tensors using Python control flow. If False, we assume that the layer can safely be used to generate a static computation graph.

We recommend that descendants of Layer implement the following methods:

  • __init__(): Defines custom layer attributes, and creates layer state variables that do not depend on input shapes, using add_weight().
  • build(self, input_shape): This method can be used to create weights that depend on the shape(s) of the input(s), using add_weight(). __call__() will automatically build the layer (if it has not been built yet) by calling build().
  • call(self, *args, **kwargs): Called in __call__ after making sure build() has been called. call() performs the logic of applying the layer to the input tensors (which should be passed in as argument). Two reserved keyword arguments you can optionally use in call() are:
    • training (boolean, whether the call is in inference mode or training mode)
    • mask (boolean tensor encoding masked timesteps in the input, used in RNN layers)
  • get_config(self): Returns a dictionary containing the configuration used to initialize this layer. If the keys differ from the arguments in __init__, then override from_config(self) as well. This method is used when saving the layer or a model that contains this layer.

Examples:

Here's a basic example: a layer with two variables, w and b, that returns y = w . x + b. It shows how to implement build() and call(). Variables set as attributes of a layer are tracked as weights of the layers (in layer.weights).

class SimpleDense(Layer):

  def __init__(self, units=32):
      super(SimpleDense, self).__init__()
      self.units = units

  def build(self, input_shape):  # Create the state of the layer (weights)
    w_init = tf.random_normal_initializer()
    self.w = tf.Variable(
        initial_value=w_init(shape=(input_shape[-1], self.units),
                             dtype='float32'),
        trainable=True)
    b_init = tf.zeros_initializer()
    self.b = tf.Variable(
        initial_value=b_init(shape=(self.units,), dtype='float32'),
        trainable=True)

  def call(self, inputs):  # Defines the computation from inputs to outputs
      return tf.matmul(inputs, self.w) + self.b

# Instantiates the layer.
linear_layer = SimpleDense(4)

# This will also call `build(input_shape)` and create the weights.
y = linear_layer(tf.ones((2, 2)))
assert len(linear_layer.weights) == 2

# These weights are trainable, so they're listed in `trainable_weights`:
assert len(linear_layer.trainable_weights) == 2

Note that the method add_weight() offers a shortcut to create weights:

class SimpleDense(Layer):

  def __init__(self, units=32):
      super(SimpleDense, self).__init__()
      self.units = units

  def build(self, input_shape):
      self.w = self.add_weight(shape=(input_shape[-1], self.units),
                               initializer='random_normal',
                               trainable=True)
      self.b = self.add_weight(shape=(self.units,),
                               initializer='random_normal',
                               trainable=True)

  def call(self, inputs):
      return tf.matmul(inputs, self.w) + self.b

Besides trainable weights, updated via backpropagation during training, layers can also have non-trainable weights. These weights are meant to be updated manually during call(). Here's a example layer that computes the running sum of its inputs:

class ComputeSum(Layer):

  def __init__(self, input_dim):
      super(ComputeSum, self).__init__()
      # Create a non-trainable weight.
      self.total = tf.Variable(initial_value=tf.zeros((input_dim,)),
                               trainable=False)

  def call(self, inputs):
      self.total.assign_add(tf.reduce_sum(inputs, axis=0))
      return self.total

my_sum = ComputeSum(2)
x = tf.ones((2, 2))

y = my_sum(x)
print(y.numpy())  # [2. 2.]

y = my_sum(x)
print(y.numpy())  # [4. 4.]

assert my_sum.weights == [my_sum.total]
assert my_sum.non_trainable_weights == [my_sum.total]
assert my_sum.trainable_weights == []

For more information about creating layers, see the guide Writing custom layers and models with Keras

name The name of the layer (string).
dtype The dtype of the layer's weights.
variable_dtype Alias of dtype.
compute_dtype The dtype of the layer's computations. Layers automatically cast inputs to this dtype which causes the computations and output to also be in this dtype. When mixed precision is used with a tf.keras.mixed_precision.Policy, this will be different than variable_dtype.
dtype_policy The layer's dtype policy. See the tf.keras.mixed_precision.Policy documentation for details.
trainable_weights List of variables to be included in backprop.
non_trainable_weights List of variables that should not be included in backprop.
weights The concatenation of the lists trainable_weights and non_trainable_weights (in this order).
trainable Whether the layer should be trained (boolean), i.e. whether its potentially-trainable weights should be returned as part of layer.trainable_weights.
input_spec Optional (list of) InputSpec object(s) specifying the constraints on inputs that can be accepted by the layer.
activity_regularizer Optional regularizer function for the output of this layer.
dynamic Whether the layer is dynamic (eager-only); set in the constructor.
input Retrieves the input tensor(s) of a layer.

Only applicable if the layer has exactly one input, i.e. if it is connected to one incoming layer.

losses List of losses added using the add_loss() API.

Variable regularization tensors are created when this property is accessed, so it is eager safe: accessing losses under a tf.GradientTape will propagate gradients back to the corresponding variables.

class MyLayer(tf.keras.layers.Layer):
  def call(self, inputs):
    self.add_loss(tf.abs(tf.reduce_mean(inputs)))
    return inputs
l = MyLayer()
l(np.ones((10, 1)))
l.losses
[1.0]
inputs = tf.keras.Input(shape=(10,))
x = tf.keras.layers.Dense(10)(inputs)
outputs = tf.keras.layers.Dense(1)(x)
model = tf.keras.Model(inputs, outputs)
# Activity regularization.
len(model.losses)
0
model.add_loss(tf.abs(tf.reduce_mean(x)))
len(model.losses)
1
inputs = tf.keras.Input(shape=(10,))
d = tf.keras.layers.Dense(10, kernel_initializer='ones')
x = d(inputs)
outputs = tf.keras.layers.Dense(1)(x)
model = tf.keras.Model(inputs, outputs)
# Weight regularization.
model.add_loss(lambda: tf.reduce_mean(d.kernel))
model.losses
[<tf.Tensor: shape=(), dtype=float32, numpy=1.0>]

metrics List of metrics added using the add_metric() API.

input = tf.keras.layers.Input(shape=(3,))
d = tf.keras.layers.Dense(2)
output = d(input)
d.add_metric(tf.reduce_max(output), name='max')
d.add_metric(tf.reduce_min(output), name='min')
[m.name for m in d.metrics]
['max', 'min']

output Retrieves the output tensor(s) of a layer.

Only applicable if the layer has exactly one output, i.e. if it is connected to one incoming layer.

supports_masking Whether this layer supports computing a mask using compute_mask.

Methods

add_loss

View source

Add loss tensor(s), potentially dependent on layer inputs.

Some losses (for instance, activity regularization losses) may be dependent on the inputs passed when calling a layer. Hence, when reusing the same layer on different inputs a and b, some entries in layer.losses may be dependent on a and some on b. This method automatically keeps track of dependencies.

This method can be used inside a subclassed layer or model's call function, in which case losses should be a Tensor or list of Tensors.

Example:

class MyLayer(tf.keras.layers.Layer):
  def call(self, inputs):
    self.add_loss(tf.abs(tf.reduce_mean(inputs)))
    return inputs

This method can also be called directly on a Functional Model during construction. In this case, any loss Tensors passed to this Model must be symbolic and be able to be traced back to the model's Inputs. These losses become part of the model's topology and are tracked in get_config.

Example:

inputs = tf.keras.Input(shape=(10,))
x = tf.keras.layers.Dense(10)(inputs)
outputs = tf.keras.layers.Dense(1)(x)
model = tf.keras.Model(inputs, outputs)
# Activity regularization.
model.add_loss(tf.abs(tf.reduce_mean(x)))

If this is not the case for your loss (if, for example, your loss references a Variable of one of the model's layers), you can wrap your loss in a zero-argument lambda. These losses are not tracked as part of the model's topology since they can't be serialized.

Example:

inputs = tf.keras.Input(shape=(10,))
d = tf.keras.layers.Dense(10)
x = d(inputs)
outputs = tf.keras.layers.Dense(1)(x)
model = tf.keras.Model(inputs, outputs)
# Weight regularization.
model.add_loss(lambda: tf.reduce_mean(d.kernel))

Arguments
losses Loss tensor, or list/tuple of tensors. Rather than tensors, losses may also be zero-argument callables which create a loss tensor.
**kwargs Additional keyword arguments for backward compatibility. Accepted values: inputs - Deprecated, will be automatically inferred.

add_metric

View source

Adds metric tensor to the layer.

This method can be used inside the call() method of a subclassed layer or model.

class MyMetricLayer(tf.keras.layers.Layer):
  def __init__(self):
    super(MyMetricLayer, self).__init__(name='my_metric_layer')
    self.mean = tf.keras.metrics.Mean(name='metric_1')

  def call(self, inputs):
    self.add_metric(self.mean(x))
    self.add_metric(tf.reduce_sum(x), name='metric_2')
    return inputs

This method can also be called directly on a Functional Model during construction. In this case, any tensor passed to this Model must be symbolic and be able to be traced back to the model's Inputs. These metrics become part of the model's topology and are tracked when you save the model via save().

inputs = tf.keras.Input(shape=(10,))
x = tf.keras.layers.Dense(10)(inputs)
outputs = tf.keras.layers.Dense(1)(x)
model = tf.keras.Model(inputs, outputs)
model.add_metric(math_ops.reduce_sum(x), name='metric_1')
inputs = tf.keras.Input(shape=(10,))
x = tf.keras.layers.Dense(10)(inputs)
outputs = tf.keras.layers.Dense(1)(x)
model = tf.keras.Model(inputs, outputs)
model.add_metric(tf.keras.metrics.Mean()(x), name='metric_1')

Args
value Metric tensor.
name String metric name.
**kwargs Additional keyword arguments for backward compatibility. Accepted values: aggregation - When the value tensor provided is not the result of calling a keras.Metric instance, it will be aggregated by default using a keras.Metric.Mean.

add_weight

View source

Adds a new variable to the layer.

Arguments
name Variable name.
shape Variable shape. Defaults to scalar if unspecified.
dtype The type of the variable. Defaults to self.dtype.
initializer Initializer instance (callable).
regularizer Regularizer instance (callable).
trainable Boolean, whether the variable should be part of the layer's "trainable_variables" (e.g. variables, biases) or "non_trainable_variables" (e.g. BatchNorm mean and variance). Note that trainable cannot be True if synchronization is set to ON_READ.
constraint Constraint instance (callable).
use_resource Whether to use ResourceVariable.
synchronization Indicates when a distributed a variable will be aggregated. Accepted values are constants defined in the class tf.VariableSynchronization. By default the synchronization is set to AUTO and the current DistributionStrategy chooses when to synchronize. If synchronization is set to ON_READ, trainable must not be set to True.
aggregation Indicates how a distributed variable will be aggregated. Accepted values are constants defined in the class tf.VariableAggregation.
**kwargs Additional keyword arguments. Accepted values are getter, collections, experimental_autocast and caching_device.

Returns
The variable created.

Raises
ValueError When giving unsupported dtype and no initializer or when trainable has been set to True with synchronization set as ON_READ.

build

View source

Creates the variables of the layer (optional, for subclass implementers).

This is a method that implementers of subclasses of Layer or Model can override if they need a state-creation step in-between layer instantiation and layer call.

This is typically used to create the weights of Layer subclasses.

Arguments
input_shape Instance of TensorShape, or list of instances of TensorShape if the layer expects a list of inputs (one instance per input).

call

View source

This is where the layer's logic lives.

Note here that call() method in tf.keras is little bit different from keras API. In keras API, you can pass support masking for layers as additional arguments. Whereas tf.keras has compute_mask() method to support masking.

Arguments
inputs Input tensor, or list/tuple of input tensors.
**kwargs Additional keyword arguments. Currently unused.

Returns
A tensor or list/tuple of tensors.

compute_mask

View source

Computes an output mask tensor.

Arguments
inputs Tensor or list of tensors.
mask Tensor or list of tensors.

Returns
None or a tensor (or list of tensors, one per output tensor of the layer).

compute_output_shape

View source

Computes the output shape of the layer.

If the layer has not been built, this method will call build on the layer. This assumes that the layer will later be used with inputs that match the input shape provided here.

Arguments
input_shape Shape tuple (tuple of integers) or list of shape tuples (one per output tensor of the layer). Shape tuples can include None for free dimensions, instead of an integer.

Returns
An input shape tuple.

compute_output_signature

View source

Compute the output tensor signature of the layer based on the inputs.

Unlike a TensorShape object, a TensorSpec object contains both shape and dtype information for a tensor. This method allows layers to provide output dtype information if it is different from the input dtype. For any layer that doesn't implement this function, the framework will fall back to use compute_output_shape, and will assume that the output dtype matches the input dtype.

Args