Batch Normalization layer from (Ioffe et al., 2015).
Inherits From: BatchNormalization, Layer, Layer, Module
tf.compat.v1.layers.BatchNormalization(
    axis=-1,
    momentum=0.99,
    epsilon=0.001,
    center=True,
    scale=True,
    beta_initializer=tf.compat.v1.zeros_initializer(),
    gamma_initializer=tf.compat.v1.ones_initializer(),
    moving_mean_initializer=tf.compat.v1.zeros_initializer(),
    moving_variance_initializer=tf.compat.v1.ones_initializer(),
    beta_regularizer=None,
    gamma_regularizer=None,
    beta_constraint=None,
    gamma_constraint=None,
    renorm=False,
    renorm_clipping=None,
    renorm_momentum=0.99,
    fused=None,
    trainable=True,
    virtual_batch_size=None,
    adjustment=None,
    name=None,
    **kwargs
)
Migrate to TF2
This is a legacy API that is only compatible with eager execution and
tf.function if you combine it with
tf.compat.v1.keras.utils.track_tf1_style_variables.
Please refer to the tf.layers model mapping section of the migration guide to learn how to use your TensorFlow v1 model in TF2 with Keras.
The corresponding TensorFlow v2 layer is
tf.keras.layers.BatchNormalization.
Structural Mapping to Native TF2
None of the supported arguments have changed name.
Before:
 bn = tf.compat.v1.layers.BatchNormalization()
After:
 bn = tf.keras.layers.BatchNormalization()
How to Map Arguments
| TF1 Arg Name | TF2 Arg Name | Note | 
|---|---|---|
| name | name | Layer base class | 
| trainable | trainable | Layer base class | 
| axis | axis | - | 
| momentum | momentum | - | 
| epsilon | epsilon | - | 
| center | center | - | 
| scale | scale | - | 
| beta_initializer | beta_initializer | - | 
| gamma_initializer | gamma_initializer | - | 
| moving_mean_initializer | moving_mean_initializer | - | 
| beta_regularizer | beta_regularizer | - | 
| gamma_regularizer | gamma_regularizer | - | 
| beta_constraint | beta_constraint | - | 
| gamma_constraint | gamma_constraint | - | 
| renorm | Not supported | - | 
| renorm_clipping | Not supported | - | 
| renorm_momentum | Not supported | - | 
| fused | Not supported | - | 
| virtual_batch_size | Not supported | - | 
| adjustment | Not supported | - | 
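As a rough illustration of the mapping above, the sketch below builds the TF2 layer using only arguments that keep their names, then calls it in training and inference mode. The input values and the variable names `features`, `train_out`, and `infer_out` are made up for this example, and it assumes TensorFlow 2 is installed.

```python
import tensorflow as tf

# Arguments shared by the TF1 and TF2 layers keep the same names.
bn = tf.keras.layers.BatchNormalization(
    axis=-1, momentum=0.99, epsilon=0.001, center=True, scale=True)

# Hypothetical input: a batch of 4 examples with 3 features each.
features = tf.constant([[1.0, 2.0, 3.0],
                        [2.0, 4.0, 6.0],
                        [3.0, 6.0, 9.0],
                        [4.0, 8.0, 12.0]])

# training=True normalizes with the statistics of the current batch.
train_out = bn(features, training=True)

# training=False normalizes with the stored moving averages instead.
infer_out = bn(features, training=False)
```

The outputs of the two calls generally differ, because the moving averages start from their initializers and only drift towards the batch statistics over many training steps.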
Description
Keras APIs handle BatchNormalization updates to the moving_mean and
moving_variance as part of their fit() and evaluate() loops. However, if
a custom training loop is used with an instance of Model, these updates
need to be explicitly included. Here's a simple example of how it can be
done:
  # model is an instance of Model that contains a BatchNormalization layer.
  update_ops = model.get_updates_for(None) + model.get_updates_for(features)
  train_op = optimizer.minimize(loss)
  train_op = tf.group([train_op, update_ops])
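To make concrete what the moving_mean and moving_variance updates compute, here is a minimal plain-Python sketch of an exponential moving average controlled by momentum, for a single scalar feature. The helper name `update_moving_stats` is hypothetical, and the use of the biased (population) variance is an assumption of this sketch; the actual layer may use a different estimator depending on the implementation path.

```python
# Illustrative stand-in for the moving-statistics update, not the layer's code.
def update_moving_stats(moving_mean, moving_variance, batch, momentum=0.99):
    """Return updated (moving_mean, moving_variance) for one scalar feature."""
    n = len(batch)
    batch_mean = sum(batch) / n
    # Assumption: biased (population) variance of the current batch.
    batch_variance = sum((x - batch_mean) ** 2 for x in batch) / n
    # Exponential moving average: keep `momentum` of the old value.
    new_mean = momentum * moving_mean + (1.0 - momentum) * batch_mean
    new_variance = momentum * moving_variance + (1.0 - momentum) * batch_variance
    return new_mean, new_variance


# Start from the default initializers: zero mean, unit variance.
mean, variance = 0.0, 1.0
for _ in range(3):
    mean, variance = update_moving_stats(
        mean, variance, [2.0, 4.0, 6.0], momentum=0.5)
```

A momentum close to 1 (such as the default 0.99) makes the moving statistics change slowly, which is why skipping these updates in a custom loop leaves inference running on the initial values.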
| Args | |
|---|---|
| axis | An int or list of int, the axis or axes that should be normalized, typically the features axis/axes. For instance, after a Conv2D layer with data_format="channels_first", set axis=1. If a list of axes is provided, each axis in axis will be normalized simultaneously. Default is -1 which uses the last axis. Note: when using multi-axis batch norm, the beta, gamma, moving_mean, and moving_variance variables are the same rank as the input Tensor, with dimension size 1 in all reduced (non-axis) dimensions. | 
| momentum | Momentum for the moving average. | 
| epsilon | Small float added to variance to avoid dividing by zero. | 
| center | If True, add offset of beta to normalized tensor. If False, beta is ignored. | 
| scale | If True, multiply by gamma. If False, gamma is not used. When the next layer is linear (also e.g. nn.relu), this can be disabled since the scaling can be done by the next layer. | 
| beta_initializer | Initializer for the beta weight. | 
| gamma_initializer | Initializer for the gamma weight. | 
| moving_mean_initializer | Initializer for the moving mean. | 
| moving_variance_initializer | Initializer for the moving variance. | 
| beta_regularizer | Optional regularizer for the beta weight. | 
| gamma_regularizer | Optional regularizer for the gamma weight. | 
| beta_constraint | An optional projection function to be applied to the beta weight after being updated by an Optimizer (e.g. used to implement norm constraints or value constraints for layer weights). The function must take as input the unprojected variable and must return the projected variable (which must have the same shape). Constraints are not safe to use when doing asynchronous distributed training. | 
| gamma_constraint | An optional projection function to be applied to the gamma weight after being updated by an Optimizer. | 
| renorm | Whether to use Batch Renormalization (Ioffe, 2017). This adds extra variables during training. The inference is the same for either value of this parameter. | 
| renorm_clipping | A dictionary that may map keys 'rmax', 'rmin', 'dmax' to scalar Tensors used to clip the renorm correction. The correction (r, d) is used as corrected_value = normalized_value * r + d, with r clipped to [rmin, rmax], and d to [-dmax, dmax]. Missing rmax, rmin, dmax are set to inf, 0, inf, respectively. | 
| renorm_momentum | Momentum used to update the moving means and standard deviations with renorm. Unlike momentum, this affects training and should be neither too small (which would add noise) nor too large (which would give stale estimates). Note that momentum is still applied to get the means and variances for inference. | 
| fused | If None or True, use a faster, fused implementation if possible. If False, use the system recommended implementation. | 
| trainable | Boolean, if True also add variables to the graph collection GraphKeys.TRAINABLE_VARIABLES (see tf.Variable). | 
| virtual_batch_size | An int. By default, virtual_batch_size is None, which means batch normalization is performed across the whole batch. When virtual_batch_size is not None, instead perform "Ghost Batch Normalization", which creates virtual sub-batches which are each normalized separately (with shared gamma, beta, and moving statistics). Must divide the actual batch size during execution. | 
| adjustment | A function taking the Tensor containing the (dynamic) shape of the input tensor and returning a pair (scale, bias) to apply to the normalized values (before gamma and beta), only during training. For example, if axis==-1, adjustment = lambda shape: (tf.random.uniform(shape[-1:], 0.93, 1.07), tf.random.uniform(shape[-1:], -0.1, 0.1)) will scale the normalized value by up to 7% up or down, then shift the result by up to 0.1 (with independent scaling and bias for each feature but shared across all examples), and finally apply gamma and/or beta. If None, no adjustment is applied. Cannot be specified if virtual_batch_size is specified. | 
| name | A string, the name of the layer. | 
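The virtual_batch_size behaviour described above can be sketched in plain Python for a single scalar feature: the batch is cut into consecutive sub-batches, each normalized with its own mean and variance, while gamma and beta are shared. The helper names `normalize` and `ghost_batch_norm` are hypothetical, and this sketch leaves out the moving statistics entirely.

```python
import math


def normalize(values, epsilon=0.001):
    """Normalize a list of scalars with its own mean and biased variance."""
    n = len(values)
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values) / n
    return [(v - mean) / math.sqrt(variance + epsilon) for v in values]


def ghost_batch_norm(batch, virtual_batch_size, gamma=1.0, beta=0.0,
                     epsilon=0.001):
    """Normalize each virtual sub-batch separately, sharing gamma and beta."""
    if len(batch) % virtual_batch_size != 0:
        raise ValueError("virtual_batch_size must divide the batch size")
    out = []
    for start in range(0, len(batch), virtual_batch_size):
        sub_batch = batch[start:start + virtual_batch_size]
        out.extend(gamma * v + beta for v in normalize(sub_batch, epsilon))
    return out


batch = [1.0, 3.0, 10.0, 30.0]
whole = ghost_batch_norm(batch, virtual_batch_size=4)  # one batch of 4
ghost = ghost_batch_norm(batch, virtual_batch_size=2)  # two sub-batches of 2
```

With virtual_batch_size=2, the pairs [1.0, 3.0] and [10.0, 30.0] are each centred on their own mean, so the small values are no longer dominated by the large ones as they are when the whole batch is normalized at once.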
| References | |
|---|---|
| Batch Normalization - Accelerating Deep Network Training by Reducing Internal Covariate Shift: Ioffe et al., 2015 | 
| Batch Renormalization - Towards Reducing Minibatch Dependence in Batch-Normalized Models: Ioffe, 2017 | 
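For the renorm and renorm_clipping arguments, the following plain-Python sketch shows how a clipped correction (r, d) could be computed and applied to one scalar value as corrected_value = normalized_value * r + d. The definitions of r and d follow the Batch Renormalization paper cited above; the helper names `clip`, `renorm_correction`, and `renormalize` are hypothetical, and epsilon and gradient handling are omitted for brevity.

```python
import math


def clip(value, low, high):
    """Clamp value into the closed interval [low, high]."""
    return max(low, min(high, value))


def renorm_correction(batch_mean, batch_std, moving_mean, moving_std,
                      rmax=math.inf, rmin=0.0, dmax=math.inf):
    """Return the clipped correction (r, d); missing limits mean no clipping."""
    r = clip(batch_std / moving_std, rmin, rmax)
    d = clip((batch_mean - moving_mean) / moving_std, -dmax, dmax)
    return r, d


def renormalize(x, batch_mean, batch_std, r, d):
    """Apply corrected_value = normalized_value * r + d to one scalar."""
    normalized_value = (x - batch_mean) / batch_std
    return normalized_value * r + d


# Clipped correction: the raw values r = 2.0 and d = 1.5 exceed the limits.
r, d = renorm_correction(batch_mean=5.0, batch_std=4.0,
                         moving_mean=2.0, moving_std=2.0,
                         rmax=1.5, rmin=0.5, dmax=1.0)

# Unclipped correction: the result equals normalizing with the moving stats.
r_free, d_free = renorm_correction(5.0, 4.0, 2.0, 2.0)
corrected = renormalize(9.0, 5.0, 4.0, r_free, d_free)
```

Without clipping, the corrected value matches (x - moving_mean) / moving_std, which is what inference uses; the clipping limits bound how far training may depart from plain batch statistics.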
| Attributes | |
|---|---|
| graph | |
| scope_name | |
Methods
apply
apply(
    *args, **kwargs
)
get_losses_for
get_losses_for(
    inputs
)
Retrieves losses relevant to a specific set of inputs.
| Args | |
|---|---|
| inputs | Input tensor or list/tuple of input tensors. | 
| Returns | |
|---|---|
| List of loss tensors of the layer that depend on inputs. | 
get_updates_for
get_updates_for(
    inputs
)
Retrieves updates relevant to a specific set of inputs.
| Args | |
|---|---|
| inputs | Input tensor or list/tuple of input tensors. | 
| Returns | |
|---|---|
| List of update ops of the layer that depend on inputs. |