
SRU, Simple Recurrent Unit.

Inherits From: LayerRNNCell

Implementation based on Training RNNs as Fast as CNNs (cf.

This variation of RNN cell is characterized by a simplified data dependence between the hidden states of two consecutive time steps. Traditionally, the hidden state from a cell at time step t-1 needs to be multiplied with a matrix W_hh before being fed into the ensuing cell at time step t. This flavor of RNN replaces the matrix multiplication between h_{t-1} and W_hh with a pointwise multiplication, resulting in a performance gain.
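The difference between the two kinds of recurrent term can be illustrated with a small pure-Python sketch. It is not the cell's actual update rule, which involves additional gates and is not reproduced here. The function names and the sample values are assumptions made for illustration only. A pointwise multiplication behaves like a matrix multiplication with a diagonal W_hh, so it needs n multiplications per step instead of n * n.

```python
def matmul_recurrent_term(h_prev, w_hh):
    """Conventional recurrent term: each output unit mixes every unit of h_prev."""
    n = len(h_prev)
    return [sum(w_hh[i][j] * h_prev[j] for j in range(n)) for i in range(n)]


def pointwise_recurrent_term(h_prev, w):
    """Simplified recurrent term: unit i depends only on unit i of h_prev."""
    return [w_i * h_i for w_i, h_i in zip(w, h_prev)]


# Illustrative values (assumed, not taken from the library).
h_prev = [0.5, -1.0, 2.0]
w = [1.0, 2.0, 3.0]

# A diagonal W_hh whose diagonal equals w.
w_hh = [[1.0, 0.0, 0.0],
        [0.0, 2.0, 0.0],
        [0.0, 0.0, 3.0]]

dense = matmul_recurrent_term(h_prev, w_hh)   # n * n multiplications
cheap = pointwise_recurrent_term(h_prev, w)   # n multiplications

print(dense == cheap)
```

Because unit i of the pointwise term does not read the other units of h_{t-1}, the work across units within a time step is independent.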

num_units int, The number of units in the SRU cell.
activation Nonlinearity to use. Default: tanh.
reuse (optional) Python boolean describing whether to reuse variables in an existing scope. If not True, and the existing scope already has the given variables, an error is raised.
name (optional) String, the name of the layer. Layers with the same name will share weights, but to avoid mistakes we require reuse=True in such cases.
**kwargs Additional keyword arguments.


output_size Integer or TensorShape: size of outputs produced by this cell.

state_size size(s) of state(s) used by this cell.

It can be represented by an Integer, a TensorShape or a tuple of Integers or TensorShapes.




Return zero-filled state tensor(s).

batch_size int, float, or unit Tensor representing the batch size.
dtype the data type to use for the state.

If state_size is an int or TensorShape, then the return value is an N-D tensor of shape [batch_size, state_size] filled with zeros.

If state_size is a nested list or tuple, then the return value is a nested list or tuple (of the same structure) of 2-D tensors with the shapes [batch_size, s] for each s in state_size.
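The shape rule above can be mimicked with nested Python lists. This is a rough sketch only: `zero_state_sketch` is a hypothetical helper, not the library's implementation. It handles plain int sizes and nested lists or tuples of ints, and it leaves out TensorShape values and dtype.

```python
def zero_state_sketch(batch_size, state_size):
    """Hypothetical stand-in that mirrors the documented shape rule with lists.

    An int state_size gives a [batch_size, state_size] grid of zeros.
    A nested list or tuple gives the same structure, with one grid per entry.
    """
    if isinstance(state_size, (list, tuple)):
        # Recurse so the result keeps the same nesting and container type.
        return type(state_size)(
            zero_state_sketch(batch_size, s) for s in state_size
        )
    return [[0.0] * state_size for _ in range(batch_size)]


single = zero_state_sketch(2, 3)        # one grid of shape [2, 3]
nested = zero_state_sketch(2, (3, 4))   # tuple of grids: [2, 3] and [2, 4]

print(len(single), len(single[0]))
print(type(nested).__name__, len(nested[0][0]), len(nested[1][0]))
```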