Independently Gated Recurrent Unit cell.

Inherits From: LayerRNNCell

Based on IndRNNs and similar to GRUCell, yet with the \(U_r\), \(U_z\), and \(U\) matrices in equations 5, 6, and 8 of the GRU paper respectively replaced by diagonal matrices, i.e. each recurrent matrix product becomes a Hadamard product with a single vector:

$$r_j = \sigma\left([\mathbf W_r\mathbf x]_j + [\mathbf u_r\circ \mathbf h^{(t-1)}]_j\right)$$
$$z_j = \sigma\left([\mathbf W_z\mathbf x]_j + [\mathbf u_z\circ \mathbf h^{(t-1)}]_j\right)$$
$$\tilde{h}^{(t)}_j = \phi\left([\mathbf W \mathbf x]_j + [\mathbf u \circ \mathbf r \circ \mathbf h^{(t-1)}]_j\right)$$

where \(\circ\) denotes the Hadamard (element-wise) product. Each IndyGRU unit therefore sees only its own previous state, rather than the states of all units in the same layer.
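The three equations above can be sketched in plain Python to make the per-unit independence concrete. This is a minimal illustration, not the library's implementation: `indygru_step`, `matvec`, and `sigmoid` are hypothetical helper names, biases are omitted, and the final interpolation between the previous state and the candidate is assumed to follow the standard GRU convention, which the text above does not state.

```python
import math


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def matvec(W, x):
    """Dense matrix-vector product; W is a list of rows."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def indygru_step(x, h_prev, W_r, W_z, W, u_r, u_z, u, phi=math.tanh):
    """One IndyGRU step without biases (hypothetical helper, not the TF API)."""
    wr_x, wz_x, w_x = matvec(W_r, x), matvec(W_z, x), matvec(W, x)
    h_new = []
    for j, h_j in enumerate(h_prev):
        # Gates: unit j reads only its own previous state h_j.
        r_j = sigmoid(wr_x[j] + u_r[j] * h_j)
        z_j = sigmoid(wz_x[j] + u_z[j] * h_j)
        # Candidate state: u, r and h_prev combine element-wise.
        h_tilde_j = phi(w_x[j] + u[j] * r_j * h_j)
        # ASSUMPTION: standard GRU interpolation, not given in the text above.
        h_new.append(z_j * h_j + (1.0 - z_j) * h_tilde_j)
    return h_new
```

Because every recurrent term is indexed by `j` alone, changing the previous state of one unit leaves the new state of every other unit unchanged for the same input `x`.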

num_units: int, the number of units in the GRU cell.
activation: Nonlinearity to use. Default: tanh.
reuse: (optional) Python boolean describing whether to reuse variables in an existing scope. If not True, and the existing scope already has the given variables, an error is raised.
kernel_initializer: (optional) The initializer to use for the weight matrices applied to the input.
bias_initializer: (optional) The initializer to use for the bias.
name: String, the name of the layer. Layers with the same name will share weights, but to avoid mistakes we require reuse=True in such cases.
dtype: Default dtype of the layer (default of None means use the type of the first input). Required when build is called before call.
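Since each recurrent matrix is replaced by a single vector, num_units affects the recurrent parameter count linearly rather than quadratically. The short sketch below works through that arithmetic; `recurrent_param_count` is a hypothetical helper, and it counts only the three recurrent terms, not the input kernels or biases.

```python
def recurrent_param_count(num_units, diagonal):
    """Count recurrent weights for the three terms U_r, U_z and U."""
    per_term = num_units if diagonal else num_units * num_units
    return 3 * per_term


num_units = 128
print(recurrent_param_count(num_units, diagonal=False))  # full matrices: 49152
print(recurrent_param_count(num_units, diagonal=True))   # single vectors: 384
```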


output_size: Integer or TensorShape; size of outputs produced by this cell.

state_size: Size(s) of state(s) used by this cell. It can be represented by an Integer, a TensorShape, or a tuple of Integers or TensorShapes.



Return zero-filled state tensor(s).

batch_size: int, float, or unit Tensor representing the batch size.
dtype: The data type to use for the state.

If state_size is an int or TensorShape, then the return value is an N-D tensor of shape [batch_size, state_size] filled with zeros.

If state_size is a nested list or tuple, then the return value is a nested list or tuple (of the same structure) of 2-D tensors with the shapes [batch_size, s] for each s in state_size.
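The structure-preserving behavior described above can be sketched with nested Python lists standing in for tensors. This is an illustration under stated assumptions, not the library's code: `zero_state_like` is a hypothetical helper, it handles only int sizes and plain nested lists or tuples, and the TensorShape case is not covered.

```python
def zero_state_like(batch_size, state_size):
    """Build zero-filled nested lists mirroring state_size.

    An int s yields a batch_size x s list of lists; a list or tuple is
    rebuilt with the same container type, one entry per element.
    """
    if isinstance(state_size, (list, tuple)):
        return type(state_size)(zero_state_like(batch_size, s) for s in state_size)
    return [[0.0] * state_size for _ in range(batch_size)]


single = zero_state_like(2, 3)            # one 2 x 3 block of zeros
nested = zero_state_like(2, (3, [1, 4]))  # tuple holding a 2 x 3 block and a list of 2 x 1 and 2 x 4 blocks
```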