Module: tf.compat.v1.nn

Primitive Neural Net (NN) Operations.

Notes on padding

Several neural network operations, such as tf.nn.conv2d and tf.nn.max_pool2d, take a padding parameter, which controls how the input is padded before running the operation. The input is padded by inserting values (typically zeros) before and after the tensor in each spatial dimension. The padding parameter can either be the string 'VALID', which means use no padding, or 'SAME' which adds padding according to a formula which is described below. Certain ops also allow the amount of padding per dimension to be explicitly specified by passing a list to padding.

In the case of convolutions, the input is padded with zeros. In the case of pooling ops, the padded input values are ignored. For example, in a max pool, the sliding window ignores padded values, which is equivalent to the padded values being -infinity.
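The difference between ignoring padded values and padding with zeros only shows up when the inputs are negative. The following is a minimal pure-Python sketch in one dimension, not TensorFlow's implementation; the helper names are hypothetical, and it assumes every window overlaps at least one real input value.

```python
import math

def max_pool_1d_ignore_padding(values, window, stride, pad_before, pad_after):
    """Max pool that skips window positions falling in the padding (hypothetical helper)."""
    n = len(values)
    padded_len = n + pad_before + pad_after
    out = []
    for start in range(0, padded_len - window + 1, stride):
        # Map padded coordinates back to real input indices and drop the rest.
        real = [values[i - pad_before]
                for i in range(start, start + window)
                if 0 <= i - pad_before < n]
        out.append(max(real))
    return out

def max_pool_1d_with_fill(values, window, stride, pad_before, pad_after, fill):
    """Max pool over an input explicitly padded with `fill` (hypothetical helper)."""
    padded = [fill] * pad_before + list(values) + [fill] * pad_after
    return [max(padded[s:s + window])
            for s in range(0, len(padded) - window + 1, stride)]

values = [-3.0, -1.0, -2.0]
ignored = max_pool_1d_ignore_padding(values, window=2, stride=1, pad_before=0, pad_after=1)
neg_inf = max_pool_1d_with_fill(values, 2, 1, 0, 1, fill=-math.inf)
zeros = max_pool_1d_with_fill(values, 2, 1, 0, 1, fill=0.0)

print(ignored)  # [-1.0, -1.0, -2.0]
print(neg_inf)  # [-1.0, -1.0, -2.0]
print(zeros)    # [-1.0, -1.0, 0.0]
```

Ignoring the padding and filling with -infinity agree, while zero padding would change the last output.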

'VALID' padding

Passing padding='VALID' to an op causes no padding to be used. As a result, the output is typically smaller than the input, even when the stride is one. In the 2D case, the output size is computed as:

out_height = ceil((in_height - filter_height + 1) / stride_height)
out_width  = ceil((in_width - filter_width + 1) / stride_width)

The 1D and 3D cases are similar. Note filter_height and filter_width refer to the filter size after dilations (if any) for convolutions, and refer to the window size for pools.
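The 'VALID' formula can be checked with a small pure-Python sketch for a single spatial dimension. The function name is hypothetical, and the effective filter size under dilation, (filter_size - 1) * dilation + 1, is an assumption added here for illustration rather than something stated above.

```python
import math

def valid_output_size(in_size, filter_size, stride, dilation=1):
    """Output length along one spatial dimension under 'VALID' padding."""
    # Assumed effective filter size after dilation: (k - 1) * d + 1.
    effective_filter = (filter_size - 1) * dilation + 1
    return math.ceil((in_size - effective_filter + 1) / stride)

print(valid_output_size(5, 3, 1))              # 3
print(valid_output_size(5, 3, 2))              # 2
print(valid_output_size(7, 3, 1, dilation=2))  # 3
```

Applying the same function to each spatial dimension in turn covers the 1D, 2D, and 3D cases.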

'SAME' padding

With 'SAME' padding, padding is applied to each spatial dimension. When the strides are 1, the input is padded such that the output size is the same as the input size. In the 2D case, the output size is computed as:

out_height = ceil(in_height / stride_height)
out_width  = ceil(in_width / stride_width)

The amount of padding used is the smallest amount that produces this output size. The formula for the total amount of padding per dimension is:

if in_height % stride_height == 0:
  pad_along_height = max(filter_height - stride_height, 0)
else:
  pad_along_height = max(filter_height - (in_height % stride_height), 0)
if in_width % stride_width == 0:
  pad_along_width = max(filter_width - stride_width, 0)
else:
  pad_along_width = max(filter_width - (in_width % stride_width), 0)

Finally, the padding on the top, bottom, left and right are:

pad_top = pad_along_height // 2
pad_bottom = pad_along_height - pad_top
pad_left = pad_along_width // 2
pad_right = pad_along_width - pad_left

Because of the integer division by 2, the padding on the two sides (top vs. bottom, left vs. right) can differ by one. In that case, the bottom and right sides always receive the additional padded pixel. For example, when pad_along_height is 5, 2 pixels are padded at the top and 3 pixels at the bottom. This differs from libraries such as PyTorch and Caffe, which explicitly specify the number of padded pixels and always pad the same number of pixels on both sides.
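The 'SAME' rules can be collected into one pure-Python helper for a single spatial dimension. This is a sketch of the formulas in this section, not TensorFlow's own code; the function name is hypothetical.

```python
import math

def same_padding(in_size, filter_size, stride):
    """Return (pad_before, pad_after) for one dimension under 'SAME' padding."""
    if in_size % stride == 0:
        pad_total = max(filter_size - stride, 0)
    else:
        pad_total = max(filter_size - (in_size % stride), 0)
    pad_before = pad_total // 2           # top or left
    pad_after = pad_total - pad_before    # bottom or right gets the extra pixel
    return pad_before, pad_after

# Height: in=5, filter=3, stride=2. Width: in=2, filter=2, stride=1.
print(same_padding(5, 3, 2))  # (1, 1)
print(same_padding(2, 2, 1))  # (0, 1)

# Sanity check: padding first, then sliding the window, yields ceil(in / stride).
for in_size, filter_size, stride in [(5, 3, 2), (2, 2, 1), (10, 7, 3)]:
    before, after = same_padding(in_size, filter_size, stride)
    out_size = (in_size + before + after - filter_size) // stride + 1
    assert out_size == math.ceil(in_size / stride)
```

The width case shows the asymmetry: the single padded column goes on the right.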

Here is an example of 'SAME' padding:

in_height = 5
filter_height = 3
stride_height = 2

in_width = 2
filter_width = 2
stride_width = 1

inp = tf.ones((2, in_height, in_width, 2))
filter = tf.ones((filter_height, filter_width, 2, 2))
strides = [stride_height, stride_width]
output = tf.nn.conv2d(inp, filter, strides, padding='SAME')
output.shape[1]  # output_height: ceil(5 / 2) = 3
output.shape[2]  # output_width: ceil(2 / 1) = 2

Explicit padding

Certain ops, like tf.nn.conv2d, also allow a list of explicit padding amounts to be passed to the padding parameter. This list is in the same format as what is passed to tf.pad, except the padding must be a nested list, not a tensor. For example, in the 2D case, the list is in the format [[0, 0], [pad_top, pad_bottom], [pad_left, pad_right], [0, 0]] when data_format is its default value of 'NHWC'. The two [0, 0] pairs indicate the batch and channel dimensions have no padding, which is required, as only spatial dimensions can have padding.

For example:

inp = tf.ones((1, 3, 3, 1))
filter = tf.ones((2, 2, 1, 1))
strides = [1, 1]
padding = [[0, 0], [1, 2], [0, 1], [0, 0]]
output = tf.nn.conv2d(inp, filter, strides, padding=padding)
tuple(output.shape)  # (1, 5, 3, 1)
# Equivalently, tf.pad can be used, since convolutions pad with zeros.
inp = tf.pad(inp, padding)
# 'VALID' means to use no padding in conv2d (we already padded inp)
output2 = tf.nn.conv2d(inp, filter, strides, padding='VALID')
tf.debugging.assert_equal(output, output2)
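The output shape in the explicit-padding example can also be derived by hand: pad each spatial dimension, then apply the 'VALID' rule. The sketch below does this in pure Python. The function name is hypothetical, and it assumes the 'NHWC' layout and no dilation.

```python
def conv2d_explicit_output_shape(input_shape, filter_shape, strides, padding):
    """Output shape of a 2-D convolution with explicit NHWC padding (no dilation)."""
    batch, in_h, in_w, _ = input_shape
    filter_h, filter_w, _, out_channels = filter_shape
    stride_h, stride_w = strides
    (pad_top, pad_bottom), (pad_left, pad_right) = padding[1], padding[2]
    # Pad first, then apply the 'VALID' rule: floor((padded - filter) / stride) + 1.
    out_h = (in_h + pad_top + pad_bottom - filter_h) // stride_h + 1
    out_w = (in_w + pad_left + pad_right - filter_w) // stride_w + 1
    return (batch, out_h, out_w, out_channels)

shape = conv2d_explicit_output_shape(
    (1, 3, 3, 1), (2, 2, 1, 1), [1, 1], [[0, 0], [1, 2], [0, 1], [0, 0]])
print(shape)  # (1, 5, 3, 1)
```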

Modules

experimental module: Public API for tf.nn.experimental namespace.

rnn_cell module: Public API for tf.keras.internal.legacy.rnn_cell namespace.

Functions

all_candidate_sampler(...): Generate the set of all classes.

atrous_conv2d(...): Atrous convolution (a.k.a. convolution with holes or dilated convolution).

atrous_conv2d_transpose(...): The transpose of atrous_conv2d.

avg_pool(...): Performs the average pooling on the input.

avg_pool1d(...): Performs the average pooling on the input.

avg_pool2d(...): Performs the average pooling on the input.

avg_pool3d(...): Performs the average pooling on the input.

avg_pool_v2(...): Performs the average pooling on the input.

batch_norm_with_global_normalization(...): Batch normalization.

batch_normalization(...): Batch normalization.

bias_add(...): Adds bias to value.

bidirectional_dynamic_rnn(...): Creates a dynamic version of a bidirectional recurrent neural network. (deprecated)

collapse_repeated(...): Merge repeated labels into single labels.

compute_accidental_hits(...): Compute the position ids in sampled_candidates matching true_classes.

compute_average_loss(...): Scales per-example losses with sample_weights and computes their average.

conv1d(...): Computes a 1-D convolution of input with rank >=3 and a 3-D filter. (deprecated argument values)

conv1d_transpose(...): The transpose of conv1d.

conv2d(...): Computes a 2-D convolution given 4-D input and filter tensors.

conv2d_backprop_filter(...): Computes the gradients of convolution with respect to the filter.

conv2d_backprop_input(...): Computes the gradients of convolution with respect to the input.

conv2d_transpose(...): The transpose of conv2d.

conv3d(...): Computes a 3-D convolution given 5-D input and filter tensors.

conv3d_backprop_filter(...): Computes the gradients of 3-D convolution with respect to the filter.

conv3d_backprop_filter_v2(...): Computes the gradients of 3-D convolution with respect to the filter.

conv3d_transpose(...): The transpose of conv3d.

conv_transpose(...): The transpose of convolution.

convolution(...): Computes sums of N-D convolutions (actually cross-correlation).

crelu(...): Computes Concatenated ReLU.

ctc_beam_search_decoder(...): Performs beam search decoding on the logits given in input.

ctc_beam_search_decoder_v2(...): Performs beam search decoding on the logits given in input.

ctc_greedy_decoder(...): Performs greedy decoding on the logits given in input (best path).

ctc_loss(...): Computes the CTC (Connectionist Temporal Classification) Loss.

ctc_loss_v2(...): Computes CTC (Connectionist Temporal Classification) loss.

ctc_unique_labels(...): Get unique labels and indices for batched labels for tf.nn.ctc_loss.

depth_to_space(...): DepthToSpace for tensors of type T.

depthwise_conv2d(...): Depthwise 2-D convolution.

depthwise_conv2d_backprop_filter(...): Computes the gradients of depthwise convolution with respect to the filter.

depthwise_conv2d_backprop_input(...): Computes the gradients of depthwise convolution with respect to the input.

depthwise_conv2d_native(...): Computes a 2-D depthwise convolution.

depthwise_conv2d_native_backprop_filter(...): Computes the gradients of depthwise convolution with respect to the filter.

depthwise_conv2d_native_backprop_input(...): Computes the gradients of depthwise convolution with respect to the input.

dilation2d(...): Computes the grayscale dilation of 4-D input and 3-D filter tensors.

dropout(...): Computes dropout. (deprecated arguments)

dynamic_rnn(...): Creates a recurrent neural network specified by RNNCell cell. (deprecated)

elu(...): Computes the exponential linear function.

embedding_lookup(...): Looks up embeddings for the given ids from a list of tensors.

embedding_lookup_sparse(...): Looks up embeddings for the given ids and weights from a list of tensors.

erosion2d(...): Computes the grayscale erosion of 4-D value and 3-D kernel tensors.

fixed_unigram_candidate_sampler(...): Samples a set of classes using the provided (fixed) base distribution.

fractional_avg_pool(...): Performs fractional average pooling on the input. (deprecated)

fractional_max_pool(...): Performs fractional max pooling on the input. (deprecated)

fused_batch_norm(...): Batch normalization.

in_top_k(...): Says whether the targets are in the top K predictions.

l2_loss(...): L2 Loss.

l2_normalize(...): Normalizes along dimension axis using an L2 norm. (deprecated arguments)

leaky_relu(...): Compute the Leaky ReLU activation function.

learned_unigram_candidate_sampler(...): Samples a set of classes from a distribution learned during training.

local_response_normalization(...): Local Response Normalization.

log_poisson_loss(...): Computes log Poisson loss given log_input.

log_softmax(...): Computes log softmax activations. (deprecated arguments)

log_uniform_candidate_sampler(...): Samples a set of classes using a log-uniform (Zipfian) base distribution.

lrn(...): Local Response Normalization.

max_pool(...): Performs the max pooling on the input.

max_pool1d(...): Performs the max pooling on the input.

max_pool2d(...): Performs max pooling on 2D spatial data such as images.

max_pool3d(...): Performs the max pooling on the input.

max_pool_v2(...): Performs max pooling on the input.

max_pool_with_argmax(...): Performs max pooling on the input and outputs both max values and indices.

moments(...): Calculate the mean and variance of x.

nce_loss(...): Computes and returns the noise-contrastive estimation training loss.

normalize_moments(...): Calculate the mean and variance based on the sufficient statistics.

pool(...): Performs an N-D pooling operation.

quantized_avg_pool(...): Produces the average pool of the input tensor for quantized types.

quantized_conv2d(...): Computes a 2D convolution given quantized 4D input and filter tensors.

quantized_max_pool(...): Produces the max pool of the input tensor for quantized types.

quantized_relu_x(...): Computes Quantized Rectified Linear X: min(max(features, 0), max_value)

raw_rnn(...): Creates an RNN specified by RNNCell cell and loop function loop_fn.

relu(...): Computes rectified linear: max(features, 0).

relu6(...): Computes Rectified Linear 6: min(max(features, 0), 6).

relu_layer(...): Computes Relu(x * weight + biases).

safe_embedding_lookup_sparse(...): Lookup embedding results, accounting for invalid IDs and empty features.

sampled_softmax_loss(...): Computes and returns the sampled softmax training loss.

scale_regularization_loss(...): Scales the sum of the given regularization losses by number of replicas.

selu(...): Computes scaled exponential linear: scale * alpha * (exp(features) - 1) if features < 0, scale * features otherwise.

separable_conv2d(...): 2-D convolution with separable filters.

sigmoid(...): Computes sigmoid of x element-wise.

sigmoid_cross_entropy_with_logits(...): Computes sigmoid cross entropy given logits.

silu(...): Computes the SiLU or Swish activation function: x * sigmoid(beta * x).

softmax(...): Computes softmax activations.

softmax_cross_entropy_with_logits(...): Computes softmax cross entropy between logits and labels. (deprecated)

softmax_cross_entropy_with_logits_v2(...): Computes softmax cross entropy between logits and labels. (deprecated arguments)

softplus(...): Computes elementwise softplus: softplus(x) = log(exp(x) + 1).

softsign(...): Computes softsign: features / (abs(features) + 1).

space_to_batch(...): SpaceToBatch for 4-D tensors of type T.

space_to_depth(...): SpaceToDepth for tensors of type T.

sparse_softmax_cross_entropy_with_logits(...): Computes sparse softmax cross entropy between logits and labels.

static_bidirectional_rnn(...): Creates a bidirectional recurrent neural network. (deprecated)

static_rnn(...): Creates a recurrent neural network specified by RNNCell cell. (deprecated)

static_state_saving_rnn(...): RNN that accepts a state saver for time-truncated RNN calculation. (deprecated)

sufficient_statistics(...): Calculate the sufficient statistics for the mean and variance of x.

swish(...): Computes the SiLU or Swish activation function: x * sigmoid(beta * x).

tanh(...): Computes hyperbolic tangent of x element-wise.

top_k(...): Finds values and indices of the k largest entries for the last dimension.

uniform_candidate_sampler(...): Samples a set of classes using a uniform base distribution.

weighted_cross_entropy_with_logits(...): Computes a weighted cross entropy. (deprecated arguments)

weighted_moments(...): Returns the frequency-weighted mean and variance of x.

with_space_to_batch(...): Performs op on the space-to-batch representation of input.

xw_plus_b(...): Computes matmul(x, weights) + biases.

zero_fraction(...): Returns the fraction of zeros in value.
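Several of the activation summaries above state their formulas directly. As a rough illustration, the scalar pure-Python versions below mirror those formulas for relu, relu6, softplus, and softsign. They are sketches for checking values by hand, not the TensorFlow ops, which operate element-wise on tensors.

```python
import math

def relu(x):
    """max(features, 0)"""
    return max(x, 0.0)

def relu6(x):
    """min(max(features, 0), 6)"""
    return min(max(x, 0.0), 6.0)

def softplus(x):
    """log(exp(x) + 1)"""
    return math.log1p(math.exp(x))

def softsign(x):
    """features / (abs(features) + 1)"""
    return x / (abs(x) + 1.0)

print(relu(-2.0))      # 0.0
print(relu6(8.0))      # 6.0
print(softsign(-3.0))  # -0.75
```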