Batch Normalization layer from (Ioffe et al., 2015).

Inherits From: BatchNormalization, Layer, Layer, Module

Migrate to TF2

This API is not compatible with eager execution or tf.function.

Please refer to the tf.layers section of the migration guide to migrate a TensorFlow v1 model to Keras. The corresponding TensorFlow v2 layer is tf.keras.layers.BatchNormalization.

Structural Mapping to Native TF2

None of the supported arguments have changed name.


Before:

 bn = tf.compat.v1.layers.BatchNormalization()

After:

 bn = tf.keras.layers.BatchNormalization()

How to Map Arguments

| TF1 Arg Name | TF2 Arg Name | Note |
| --- | --- | --- |
| name | name | Layer base class |
| trainable | trainable | Layer base class |
| axis | axis | - |
| momentum | momentum | - |
| epsilon | epsilon | - |
| center | center | - |
| scale | scale | - |
| beta_initializer | beta_initializer | - |
| gamma_initializer | gamma_initializer | - |
| moving_mean_initializer | moving_mean_initializer | - |
| beta_regularizer | beta_regularizer | - |
| gamma_regularizer | gamma_regularizer | - |
| beta_constraint | beta_constraint | - |
| gamma_constraint | gamma_constraint | - |
| renorm | Not supported | - |
| renorm_clipping | Not supported | - |
| renorm_momentum | Not supported | - |
| fused | Not supported | - |
| virtual_batch_size | Not supported | - |
| adjustment | Not supported | - |


Keras APIs handle BatchNormalization updates to the moving_mean and moving_variance as part of their fit() and evaluate() loops. However, if a custom training loop is used with an instance of Model, these updates need to be explicitly included. Here's a simple example of how it can be done:

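  # model is an instance of Model that contains a BatchNormalization layer.
  # Collect the unconditional updates plus those that depend on `features`.
  update_ops = model.get_updates_for(None) + model.get_updates_for(features)
  train_op = optimizer.minimize(loss)
  # Group the moving-average updates with the optimizer step so both run together.
  train_op = tf.group([train_op, update_ops])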
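
To make concrete what the update ops above are responsible for, here is a minimal pure-Python sketch of a moving-average step. It assumes the exponential moving average rule documented for Keras BatchNormalization (moving = moving * momentum + batch_statistic * (1 - momentum)); the helper name `update_moving_stat` and the toy batches are hypothetical and not part of the TensorFlow API.

```python
def update_moving_stat(moving, batch_stat, momentum=0.99):
    """One exponential-moving-average step for moving_mean or moving_variance."""
    return moving * momentum + batch_stat * (1.0 - momentum)


# Three toy batches of a single feature (illustrative values only).
batches = [[1.0, 2.0, 3.0], [2.0, 4.0, 6.0], [0.0, 1.0, 2.0]]

moving_mean = 0.0  # start the running statistic at zero
for batch in batches:
    batch_mean = sum(batch) / len(batch)
    # This is the kind of update that is skipped if update_ops are not run.
    moving_mean = update_moving_stat(moving_mean, batch_mean, momentum=0.9)
    print(round(moving_mean, 4))
```

If the updates are left out of a custom training loop, the moving statistics stay at their initial values, and inference, which relies on them, normalizes with the wrong mean and variance.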

axis: An int or list of ints, the axis or axes that should be normalized, typically the features axis/axes. For instance, after a Conv2D layer with data_format="channels_first", set axis=1. If a list of axes is provided, all axes in the list are normalized simultaneously. Default is -1, which uses the last axis. Note: when using multi-axis batch norm, the beta, gamma, moving_mean, and moving_variance variables have the same rank as the input Tensor, with dimension size 1 in all reduced (non-axis) dimensions.
momentum: Momentum for the moving average.
epsilon: Small float added to the variance to avoid dividing by zero.
center: If True, add the offset beta to the normalized tensor. If False, beta is ignored.
scale: If True, multiply by gamma. If False, gamma is not used. When the next layer is linear (this also applies to e.g. nn.relu), this can be disabled since the scaling can be done by the next layer.
beta_initializer: Initializer for the beta weight.
gamma_initializer: Initializer for the gamma weight.
moving_mean_initializer: Initializer for the moving mean.
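
The roles of epsilon, center, and scale can be illustrated with a small pure-Python sketch of the training-time computation for a single feature. This is a simplified illustration under stated assumptions, not the layer's implementation: the function name `batch_norm_1d` is hypothetical, it uses the batch's own mean and biased variance, and it ignores the moving statistics entirely.

```python
import math


def batch_norm_1d(values, gamma=1.0, beta=0.0, epsilon=1e-3,
                  center=True, scale=True):
    """Normalize one feature over a batch, then optionally scale and shift."""
    mean = sum(values) / len(values)
    # Biased (population) variance over the batch.
    var = sum((v - mean) ** 2 for v in values) / len(values)
    # epsilon keeps the denominator non-zero even when the variance is 0.
    out = [(v - mean) / math.sqrt(var + epsilon) for v in values]
    if scale:   # scale=True: multiply by gamma
        out = [gamma * o for o in out]
    if center:  # center=True: add the offset beta
        out = [o + beta for o in out]
    return out


normalized = batch_norm_1d([1.0, 2.0, 3.0, 4.0], gamma=2.0, beta=0.5)
plain = batch_norm_1d([1.0, 2.0, 3.0, 4.0], gamma=2.0, beta=0.5,
                      center=False, scale=False)
constant = batch_norm_1d([5.0, 5.0, 5.0])  # zero variance, no division error
```

With center=False and scale=False, gamma and beta are ignored and the output is only the normalized values, which is why the scaling can be left to a following linear layer.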