Batch Normalization layer from (Ioffe et al., 2015).
Inherits From: BatchNormalization, Layer, Layer, Module
tf.compat.v1.layers.BatchNormalization(
    axis=-1,
    momentum=0.99,
    epsilon=0.001,
    center=True,
    scale=True,
    beta_initializer=tf.compat.v1.zeros_initializer(),
    gamma_initializer=tf.compat.v1.ones_initializer(),
    moving_mean_initializer=tf.compat.v1.zeros_initializer(),
    moving_variance_initializer=tf.compat.v1.ones_initializer(),
    beta_regularizer=None,
    gamma_regularizer=None,
    beta_constraint=None,
    gamma_constraint=None,
    renorm=False,
    renorm_clipping=None,
    renorm_momentum=0.99,
    fused=None,
    trainable=True,
    virtual_batch_size=None,
    adjustment=None,
    name=None,
    **kwargs
)
Migrate to TF2
This API is not compatible with eager execution or tf.function.
Please refer to the tf.layers section of the migration guide
to migrate a TensorFlow v1 model to Keras. The corresponding TensorFlow v2
layer is tf.keras.layers.BatchNormalization.
Structural Mapping to Native TF2
None of the supported arguments have changed name.
Before:
 bn = tf.compat.v1.layers.BatchNormalization()
After:
 bn = tf.keras.layers.BatchNormalization()
How to Map Arguments
| TF1 Arg Name | TF2 Arg Name | Note | 
|---|---|---|
| name | name | Layer base class | 
| trainable | trainable | Layer base class | 
| axis | axis | - | 
| momentum | momentum | - | 
| epsilon | epsilon | - | 
| center | center | - | 
| scale | scale | - | 
| beta_initializer | beta_initializer | - | 
| gamma_initializer | gamma_initializer | - | 
| moving_mean_initializer | moving_mean_initializer | - | 
| beta_regularizer | beta_regularizer | - | 
| gamma_regularizer | gamma_regularizer | - | 
| beta_constraint | beta_constraint | - | 
| gamma_constraint | gamma_constraint | - | 
| renorm | Not supported | - | 
| renorm_clipping | Not supported | - | 
| renorm_momentum | Not supported | - | 
| fused | Not supported | - | 
| virtual_batch_size | Not supported | - | 
| adjustment | Not supported | - | 
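When porting call sites, it can help to separate the constructor arguments that carry over unchanged from those that need manual attention. The helper below is a hypothetical sketch in plain Python (`split_bn_kwargs` and `SAME_NAME_IN_TF2` are illustrative names, not part of TensorFlow). It encodes only the rows of the table above; any argument the table marks as "Not supported", or does not list at all, is flagged for review rather than passed through.

```python
# Hypothetical helper (not a TensorFlow API): split the keyword arguments of a
# tf.compat.v1.layers.BatchNormalization call according to the mapping table.

# Arguments the table lists with an identical TF2 name.
SAME_NAME_IN_TF2 = {
    "name", "trainable", "axis", "momentum", "epsilon", "center", "scale",
    "beta_initializer", "gamma_initializer", "moving_mean_initializer",
    "beta_regularizer", "gamma_regularizer",
    "beta_constraint", "gamma_constraint",
}


def split_bn_kwargs(tf1_kwargs):
    """Return (portable, needs_review) dictionaries for a TF1 argument dict."""
    portable, needs_review = {}, {}
    for key, value in tf1_kwargs.items():
        if key in SAME_NAME_IN_TF2:
            portable[key] = value
        else:
            # Marked "Not supported" in the table, or absent from it.
            needs_review[key] = value
    return portable, needs_review


portable, needs_review = split_bn_kwargs(
    {"axis": 1, "momentum": 0.9, "renorm": True}
)
print("pass through:", portable)
print("review manually:", needs_review)
```

The `portable` dictionary could then be passed to the TF2 layer constructor, while each entry in `needs_review` should be checked against the current Keras documentation.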
Description
Keras APIs handle BatchNormalization updates to the moving_mean and
moving_variance as part of their fit() and evaluate() loops. However, if a
custom training loop is used with an instance of Model, these updates need
to be explicitly included. Here's a simple example of how it can be done:
  # model is an instance of Model that contains a BatchNormalization layer.
  update_ops = model.get_updates_for(None) + model.get_updates_for(features)
  train_op = optimizer.minimize(loss)
  train_op = tf.group([train_op, update_ops])
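To see why these update ops matter, the sketch below imitates what a moving-statistics update does for a single feature, in plain Python so it runs without TensorFlow. It assumes the usual exponential-moving-average rule, `moving = moving * momentum + batch_stat * (1 - momentum)`; the function name `update_moving` is illustrative, and the actual TensorFlow implementation may differ in detail.

```python
def update_moving(moving, batch_stat, momentum=0.99):
    """One exponential-moving-average step (assumed update rule)."""
    return moving * momentum + batch_stat * (1.0 - momentum)


batch = [1.0, 2.0, 3.0, 4.0]
batch_mean = sum(batch) / len(batch)
batch_var = sum((x - batch_mean) ** 2 for x in batch) / len(batch)

# Start from the default initializers: zeros for the mean, ones for the variance.
moving_mean, moving_var = 0.0, 1.0

# A single update moves the statistics only slightly toward the batch values.
first_mean = update_moving(moving_mean, batch_mean)
first_var = update_moving(moving_var, batch_var)
print("after 1 step:", first_mean, first_var)

# Many updates are needed before the moving statistics track the data.
for _ in range(1000):
    moving_mean = update_moving(moving_mean, batch_mean)
    moving_var = update_moving(moving_var, batch_var)
print("after 1000 steps:", moving_mean, moving_var)
```

If the update ops are never run, the moving statistics stay at their initial values, so the layer would normalize with a mean of 0 and a variance of 1 at inference time regardless of the data seen during training.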
| Args | |
|---|---|
| axis | An `int` or list of `int`, the axis or axes that should be normalized, typically the features axis/axes. For instance, after a `Conv2D` layer with `data_format="channels_first"`, set `axis=1`. If a list of axes is provided, each axis in `axis` will be normalized simultaneously. Default is `-1`, which uses the last axis. Note: when using multi-axis batch norm, the `beta`, `gamma`, `moving_mean`, and `moving_variance` variables are the same rank as the input Tensor, with dimension size 1 in all reduced (non-axis) dimensions. | 
| momentum | Momentum for the moving average. | 
| epsilon | Small float added to variance to avoid dividing by zero. | 
| center | If `True`, add offset of `beta` to normalized tensor. If `False`, `beta` is ignored. | 
| scale | If `True`, multiply by `gamma`. If `False`, `gamma` is not used. When the next layer is linear (also e.g. `nn.relu`), this can be disabled since the scaling can be done by the next layer. | 
| beta_initializer | Initializer for the beta weight. | 
| gamma_initializer | Initializer for the gamma weight. | 
| moving_mean_initializer | Initializer for the moving mean. | 
| moving_variance_initializer | Initializer for the moving variance. | 
| beta_regularizer | Optional regularizer for the beta weight. | 
| gamma_regularizer | Optional regularizer for the gamma weight. | 
| beta_constraint | An optional projection function to be applied to the `beta` weight after being updated by an `Optimizer` (e.g. used to implement norm constraints or value constraints for layer weights). The function must take as input the unprojected variable and must return the projected variable (which must have the same shape). Constraints are not safe to use when doing asynchronous distributed training. | 
| gamma_constraint | An optional projection function to be applied to the `gamma` weight after being updated by an `Optimizer`. | 
| renorm | Whether to use Batch Renormalization (Ioffe, 2017). This adds extra variables during training. The inference is the same for either value of this parameter. | 
| renorm_clipping | A dictionary that may map keys 'rmax', 'rmin', 'dmax' to scalar `Tensors` used to clip the renorm correction. The correction `(r, d)` is used as `corrected_value = normalized_value * r + d`, with `r` clipped to [rmin, rmax], and `d` to [-dmax, dmax]. Missing rmax, rmin, dmax are set to inf, 0, inf, respectively. | 
| renorm_momentum | Momentum used to update the moving means and standard deviations with renorm. Unlike `momentum`, this affects training and should be neither too small (which would add noise) nor too large (which would give stale estimates). Note that `momentum` is still applied to get the means and variances for inference. | 
| fused | If `None` or `True`, use a faster, fused implementation if possible. If `False`, use the system recommended implementation. | 
| trainable | Boolean, if `True` also add variables to the graph collection `GraphKeys.TRAINABLE_VARIABLES` (see tf.Variable). | 
| virtual_batch_size | An `int`. By default, `virtual_batch_size` is `None`, which means batch normalization is performed across the whole batch. When `virtual_batch_size` is not `None`, instead perform "Ghost Batch Normalization", which creates virtual sub-batches which are each normalized separately (with shared gamma, beta, and moving statistics). Must divide the actual batch size during execution. | 
| adjustment | A function taking the `Tensor` containing the (dynamic) shape of the input tensor and returning a pair (scale, bias) to apply to the normalized values (before gamma and beta), only during training. For example, if `axis==-1`, `adjustment = lambda shape: (tf.random.uniform(shape[-1:], 0.93, 1.07), tf.random.uniform(shape[-1:], -0.1, 0.1))` will scale the normalized value by up to 7% up or down, then shift the result by up to 0.1 (with independent scaling and bias for each feature but shared across all examples), and finally apply gamma and/or beta. If `None`, no adjustment is applied. Cannot be specified if `virtual_batch_size` is specified. | 
| name | A string, the name of the layer. | 
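The roles of `epsilon`, `center`, and `scale` can be illustrated with a minimal plain-Python sketch of training-mode normalization for one feature. The function `batch_norm_1d` is a hypothetical illustration, not the TensorFlow implementation; it assumes the standard formulation in which `epsilon` is added to the batch variance before taking the square root.

```python
def batch_norm_1d(values, gamma=1.0, beta=0.0, epsilon=0.001,
                  center=True, scale=True):
    """Normalize one feature over a batch (illustrative, training mode only)."""
    mean = sum(values) / len(values)
    var = sum((x - mean) ** 2 for x in values) / len(values)
    out = []
    for x in values:
        # epsilon keeps the denominator non-zero even when var == 0.
        y = (x - mean) / (var + epsilon) ** 0.5
        if scale:
            y *= gamma   # multiply by gamma only when scale=True
        if center:
            y += beta    # add the beta offset only when center=True
        out.append(y)
    return out


# A constant batch has zero variance; epsilon avoids a division by zero.
print(batch_norm_1d([5.0, 5.0, 5.0], beta=2.0))
# With center=False, beta is ignored.
print(batch_norm_1d([5.0, 5.0, 5.0], beta=2.0, center=False))
```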
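The clipping described for `renorm_clipping` can be sketched with scalar floats in place of scalar Tensors. The definitions of `r` and `d` below (ratio of batch to moving standard deviation, and the scaled difference of means) follow the Batch Renormalization paper and are an assumption here, since this page only states how the correction is applied and clipped. `renorm_correction` and `clip` are hypothetical helper names.

```python
import math


def clip(value, low, high):
    """Clamp value to the closed interval [low, high]."""
    return max(low, min(high, value))


def renorm_correction(batch_mean, batch_std, moving_mean, moving_std,
                      renorm_clipping=None):
    """Return the clipped (r, d) pair for one feature (illustrative only)."""
    clipping = renorm_clipping or {}
    # Missing keys default to inf, 0, inf, as stated in the argument table.
    rmax = clipping.get("rmax", math.inf)
    rmin = clipping.get("rmin", 0.0)
    dmax = clipping.get("dmax", math.inf)
    r = clip(batch_std / moving_std, rmin, rmax)
    d = clip((batch_mean - moving_mean) / moving_std, -dmax, dmax)
    return r, d


r, d = renorm_correction(
    batch_mean=2.0, batch_std=4.0, moving_mean=0.0, moving_std=1.0,
    renorm_clipping={"rmax": 3.0, "rmin": 1.0 / 3.0, "dmax": 5.0},
)
normalized_value = 0.5
corrected_value = normalized_value * r + d
print(r, d, corrected_value)  # 3.0 2.0 3.5
```

Here the unclipped ratio would be 4.0, so `rmax` limits `r` to 3.0, while `d` stays inside its bounds and is left unchanged.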
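The effect of `virtual_batch_size` can be sketched for a single feature in plain Python. This is only an illustration of the idea: how TensorFlow assigns examples to virtual sub-batches is not specified on this page, so the contiguous chunking below is an assumption, and `ghost_batch_norm` is a hypothetical name. The shared `gamma` and `beta` are omitted for brevity.

```python
def ghost_batch_norm(values, virtual_batch_size, epsilon=0.001):
    """Normalize each virtual sub-batch with its own mean and variance."""
    if len(values) % virtual_batch_size != 0:
        raise ValueError("virtual_batch_size must divide the batch size")
    out = []
    for start in range(0, len(values), virtual_batch_size):
        # Assumption: sub-batches are contiguous slices of the batch.
        chunk = values[start:start + virtual_batch_size]
        mean = sum(chunk) / len(chunk)
        var = sum((x - mean) ** 2 for x in chunk) / len(chunk)
        out.extend((x - mean) / (var + epsilon) ** 0.5 for x in chunk)
    return out


batch = [0.0, 2.0, 10.0, 14.0]
# Whole-batch statistics versus two sub-batches of size 2.
print(ghost_batch_norm(batch, virtual_batch_size=4))
print(ghost_batch_norm(batch, virtual_batch_size=2))
```

With a sub-batch size of 2, each pair is centered around its own mean, so the large offset between the first and second pair no longer influences the normalized values.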
| References | |
|---|---|
| Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift | Ioffe et al., 2015 | 
| Batch Renormalization: Towards Reducing Minibatch Dependence in Batch-Normalized Models | Ioffe, 2017 | 
| Attributes | |
|---|---|
| graph | |
| scope_name | |