It differs from platform-independent GRUs in how the new memory gate is
calculated. Nvidia picks this variant based on the GRU authors'[1] suggestion
and the fact that it has no accuracy impact[2].
[1] https://arxiv.org/abs/1406.1078
[2] http://svail.github.io/diff_graphs/
Cudnn compatible GRU (from the Cudnn library user guide):

# reset gate
$$r_t = \sigma(x_t * W_r + h_{t-1} * R_r + b_{Wr} + b_{Rr})$$

# update gate
$$u_t = \sigma(x_t * W_u + h_{t-1} * R_u + b_{Wu} + b_{Ru})$$

# new memory gate
$$h'_t = \tanh(x_t * W_h + r_t .* (h_{t-1} * R_h + b_{Rh}) + b_{Wh})$$

$$h_t = (1 - u_t) .* h'_t + u_t .* h_{t-1}$$

Other GRU (see tf.compat.v1.nn.rnn_cell.GRUCell and tf.contrib.rnn.GRUBlockCell):

# new memory gate
$$h'_t = \tanh(x_t * W_h + (r_t .* h_{t-1}) * R_h + b_{Wh})$$

which is not equivalent to the Cudnn GRU: in addition to the extra bias term
b_{Rh},

$$r .* (h * R) \ne (r .* h) * R$$
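To make the difference concrete, here is a minimal NumPy sketch (not the library implementation) of the two new-memory-gate formulations. The shapes, the random weights, and the stand-in reset gate `r_t` are illustrative assumptions; the variable names mirror the equations above.

    import numpy as np

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    rng = np.random.default_rng(0)
    n = 4                                   # num_units (illustrative)
    x_t = rng.standard_normal((1, n))       # input at step t
    h_prev = rng.standard_normal((1, n))    # h_{t-1}
    W_h = rng.standard_normal((n, n))
    R_h = rng.standard_normal((n, n))
    b_Wh = rng.standard_normal(n)
    b_Rh = rng.standard_normal(n)
    r_t = sigmoid(rng.standard_normal((1, n)))   # stand-in reset gate output

    # Cudnn variant: the reset gate scales the recurrent projection of
    # h_{t-1}, which carries its own bias b_Rh.
    h_new_cudnn = np.tanh(x_t @ W_h + r_t * (h_prev @ R_h + b_Rh) + b_Wh)

    # Platform-independent variant (GRUCell / GRUBlockCell): the reset gate
    # scales h_{t-1} before the recurrent projection, and there is no b_Rh.
    h_new_other = np.tanh(x_t @ W_h + (r_t * h_prev) @ R_h + b_Wh)

    # Elementwise scaling does not commute with matrix multiplication, so the
    # two formulations differ: r .* (h * R) != (r .* h) * R.
    print(np.allclose(r_t * (h_prev @ R_h), (r_t * h_prev) @ R_h))  # False

For these random draws the final check prints `False`, which is exactly the non-equivalence noted above.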
Attributes:
  graph: DEPRECATED. This property will be removed in a future version;
    stop using it because tf.layers layers no longer track their graph.
  output_size: Integer or TensorShape: size of outputs produced by this cell.
  scope_name
  state_size: size(s) of state(s) used by this cell. It can be represented
    by an Integer, a TensorShape or a tuple of Integers or TensorShapes.

Methods

get_initial_state

    get_initial_state(
        inputs=None, batch_size=None, dtype=None
    )

zero_state

    zero_state(
        batch_size, dtype
    )

Return zero-filled state tensor(s).

Args:
  batch_size: int, float, or unit Tensor representing the batch size.
  dtype: the data type to use for the state.
Returns:
  If `state_size` is an int or TensorShape, then the return value is an
  N-D tensor of shape `[batch_size, state_size]` filled with zeros.
  If `state_size` is a nested list or tuple, then the return value is
  a nested list or tuple (of the same structure) of 2-D tensors with
  the shapes `[batch_size, s]` for each `s` in `state_size`.
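A minimal usage sketch of `zero_state`, assuming a TensorFlow 1.15 graph-mode environment where `tf.contrib` is available; `num_units`, `batch_size`, and the dummy inputs below are illustrative values, not part of this page.

    import tensorflow as tf  # TensorFlow 1.15, graph mode

    num_units, batch_size = 64, 32
    cell = tf.contrib.cudnn_rnn.CudnnCompatibleGRUCell(num_units)

    # state_size of a GRU cell is a single int, so zero_state returns one
    # [batch_size, num_units] tensor of zeros.
    init_state = cell.zero_state(batch_size, dtype=tf.float32)
    print(init_state.shape)  # (32, 64)

    # The zero state can then seed a dynamic_rnn unroll over this cell.
    inputs = tf.zeros([batch_size, 10, num_units])  # [batch, time, features]
    outputs, final_state = tf.nn.dynamic_rnn(cell, inputs,
                                             initial_state=init_state)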
[null,null,["Last updated 2020-10-01 UTC."],[],[],null,["# tf.contrib.cudnn_rnn.CudnnCompatibleGRUCell\n\n|--------------------------------------------------------------------------------------------------------------------------------------------------|\n| [View source on GitHub](https://github.com/tensorflow/tensorflow/blob/v1.15.0/tensorflow/contrib/cudnn_rnn/python/ops/cudnn_rnn_ops.py#L81-L180) |\n\nCudnn Compatible GRUCell.\n\nInherits From: [`GRUCell`](../../../tf/nn/rnn_cell/GRUCell) \n\n tf.contrib.cudnn_rnn.CudnnCompatibleGRUCell(\n num_units, reuse=None, kernel_initializer=None\n )\n\nA GRU impl akin to [`tf.compat.v1.nn.rnn_cell.GRUCell`](../../../tf/nn/rnn_cell/GRUCell) to use along with\n[`tf.contrib.cudnn_rnn.CudnnGRU`](../../../tf/contrib/cudnn_rnn/CudnnGRU). The latter's params can be used by\nit seamlessly.\n\nIt differs from platform-independent GRUs in how the new memory gate is\ncalculated. Nvidia picks this variant based on GRU author's\\[1\\] suggestion and\nthe fact it has no accuracy impact\\[2\\].\n\\[1\\] \u003chttps://arxiv.org/abs/1406.1078\u003e\n\\[2\\] \u003chttp://svail.github.io/diff_graphs/\u003e\n\nCudnn compatible GRU (from Cudnn library user guide): \n\n # reset gate\n\n\n \u003cdiv\u003e $$r_t = \\sigma(x_t * W_r + h_t-1 * R_h + b_{Wr} + b_{Rr})$$ \u003c/div\u003e\n\n\n # update gate\n\n\n \u003cdiv\u003e $$u_t = \\sigma(x_t * W_u + h_t-1 * R_u + b_{Wu} + b_{Ru})$$ \u003c/div\u003e\n\n\n # new memory gate\n\n\n \u003cdiv\u003e $$h'_t = tanh(x_t * W_h + r_t .* (h_t-1 * R_h + b_{Rh}) + b_{Wh})$$ \u003c/div\u003e\n\n\n\n\n \u003cdiv\u003e $$h_t = (1 - u_t) .* h'_t + u_t .* h_t-1$$ \u003c/div\u003e\n\n\nOther GRU (see [`tf.compat.v1.nn.rnn_cell.GRUCell`](../../../tf/nn/rnn_cell/GRUCell) and\n[`tf.contrib.rnn.GRUBlockCell`](../../../tf/contrib/rnn/GRUBlockCell)): \n\n # new memory gate\n \\\\(h'_t = tanh(x_t * W_h + (r_t .* h_t-1) * R_h + b_{Wh})\\\\)\n\nwhich is not equivalent to Cudnn GRU: in addition to the extra bias term b_Rh, \n\n \\\\(r .* (h * R) != (r .* h) * R\\\\)\n\n\u003cbr /\u003e\n\n\u003cbr /\u003e\n\n\u003cbr /\u003e\n\n| Attributes ---------- ||\n|---------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|\n| `graph` | DEPRECATED FUNCTION \u003cbr /\u003e | **Warning:** THIS FUNCTION IS DEPRECATED. It will be removed in a future version. Instructions for updating: Stop using this property because tf.layers layers no longer track their graph. |\n| `output_size` | Integer or TensorShape: size of outputs produced by this cell. |\n| `scope_name` | \u003cbr /\u003e |\n| `state_size` | size(s) of state(s) used by this cell. \u003cbr /\u003e It can be represented by an Integer, a TensorShape or a tuple of Integers or TensorShapes. 
|\n\n\u003cbr /\u003e\n\nMethods\n-------\n\n### `get_initial_state`\n\n[View source](https://github.com/tensorflow/tensorflow/blob/v1.15.0/tensorflow/python/ops/rnn_cell_impl.py#L281-L309) \n\n get_initial_state(\n inputs=None, batch_size=None, dtype=None\n )\n\n### `zero_state`\n\n[View source](https://github.com/tensorflow/tensorflow/blob/v1.15.0/tensorflow/python/ops/rnn_cell_impl.py#L311-L340) \n\n zero_state(\n batch_size, dtype\n )\n\nReturn zero-filled state tensor(s).\n\n\u003cbr /\u003e\n\n\u003cbr /\u003e\n\n\u003cbr /\u003e\n\n| Args ||\n|--------------|---------------------------------------------------------|\n| `batch_size` | int, float, or unit Tensor representing the batch size. |\n| `dtype` | the data type to use for the state. |\n\n\u003cbr /\u003e\n\n\u003cbr /\u003e\n\n\u003cbr /\u003e\n\n\u003cbr /\u003e\n\n| Returns ||\n|---|---|\n| If `state_size` is an int or TensorShape, then the return value is a `N-D` tensor of shape `[batch_size, state_size]` filled with zeros. \u003cbr /\u003e If `state_size` is a nested list or tuple, then the return value is a nested list or tuple (of the same structure) of `2-D` tensors with the shapes `[batch_size, s]` for each s in `state_size`. ||\n\n\u003cbr /\u003e"]]