Long Short-Term Memory layer - Hochreiter & Schmidhuber 1997.
Inherits From: RNN, Layer, Operation
tf.keras.layers.LSTM(
    units,
    activation='tanh',
    recurrent_activation='sigmoid',
    use_bias=True,
    kernel_initializer='glorot_uniform',
    recurrent_initializer='orthogonal',
    bias_initializer='zeros',
    unit_forget_bias=True,
    kernel_regularizer=None,
    recurrent_regularizer=None,
    bias_regularizer=None,
    activity_regularizer=None,
    kernel_constraint=None,
    recurrent_constraint=None,
    bias_constraint=None,
    dropout=0.0,
    recurrent_dropout=0.0,
    seed=None,
    return_sequences=False,
    return_state=False,
    go_backwards=False,
    stateful=False,
    unroll=False,
    use_cudnn='auto',
    **kwargs
)
Based on available runtime hardware and constraints, this layer will choose different implementations (cuDNN-based or backend-native) to maximize performance. If a GPU is available and all the arguments to the layer meet the requirements of the cuDNN kernel (see below for details), the layer will use a fast cuDNN implementation when using the TensorFlow backend. The requirements to use the cuDNN implementation are listed below; a short sketch contrasting an eligible and an ineligible configuration follows the list.
- `activation` == `tanh`
- `recurrent_activation` == `sigmoid`
- `dropout` == 0 and `recurrent_dropout` == 0
- `unroll` is `False`
- `use_bias` is `True`
- Inputs, if masking is used, are strictly right-padded.
- Eager execution is enabled in the outermost context.
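For instance (a minimal sketch; `fast_path` and `fallback` are illustrative names, not part of the API), the first layer below meets every requirement and can take the cuDNN fast path on a GPU with the TensorFlow backend, while the second falls back to the backend-native kernel because `recurrent_dropout` is non-zero:
import keras

# All defaults (tanh/sigmoid activations, zero dropout, use_bias=True,
# unroll=False), so this configuration is cuDNN-eligible.
fast_path = keras.layers.LSTM(64)

# recurrent_dropout != 0 violates the requirements above, so the layer
# silently falls back to the backend-native implementation.
fallback = keras.layers.LSTM(64, recurrent_dropout=0.2)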
For example:
>>> inputs = np.random.random((32, 10, 8))
>>> lstm = keras.layers.LSTM(4)
>>> output = lstm(inputs)
>>> output.shape
(32, 4)
>>> lstm = keras.layers.LSTM(4, return_sequences=True, return_state=True)
>>> whole_seq_output, final_memory_state, final_carry_state = lstm(inputs)
>>> whole_seq_output.shape
(32, 10, 4)
>>> final_memory_state.shape
(32, 4)
>>> final_carry_state.shape
(32, 4)
| Args | |
|---|---|
| `units` | Positive integer, dimensionality of the output space. |
| `activation` | Activation function to use. Default: hyperbolic tangent (`tanh`). If you pass `None`, no activation is applied (i.e. "linear" activation: `a(x) = x`). |
| `recurrent_activation` | Activation function to use for the recurrent step. Default: sigmoid (`sigmoid`). If you pass `None`, no activation is applied (i.e. "linear" activation: `a(x) = x`). |
| `use_bias` | Boolean (default `True`), whether the layer should use a bias vector. |
| `kernel_initializer` | Initializer for the `kernel` weights matrix, used for the linear transformation of the inputs. Default: `"glorot_uniform"`. |
| `recurrent_initializer` | Initializer for the `recurrent_kernel` weights matrix, used for the linear transformation of the recurrent state. Default: `"orthogonal"`. |
| `bias_initializer` | Initializer for the bias vector. Default: `"zeros"`. |
| `unit_forget_bias` | Boolean (default `True`). If `True`, add 1 to the bias of the forget gate at initialization. Setting it to `True` will also force `bias_initializer="zeros"`. This is recommended in Jozefowicz et al. |
| `kernel_regularizer` | Regularizer function applied to the `kernel` weights matrix. Default: `None`. |
| `recurrent_regularizer` | Regularizer function applied to the `recurrent_kernel` weights matrix. Default: `None`. |
| `bias_regularizer` | Regularizer function applied to the bias vector. Default: `None`. |
| `activity_regularizer` | Regularizer function applied to the output of the layer (its "activation"). Default: `None`. |
| `kernel_constraint` | Constraint function applied to the `kernel` weights matrix. Default: `None`. |
| `recurrent_constraint` | Constraint function applied to the `recurrent_kernel` weights matrix. Default: `None`. |
| `bias_constraint` | Constraint function applied to the bias vector. Default: `None`. |
| `dropout` | Float between 0 and 1. Fraction of the units to drop for the linear transformation of the inputs. Default: 0. |
| `recurrent_dropout` | Float between 0 and 1. Fraction of the units to drop for the linear transformation of the recurrent state. Default: 0. |
| `seed` | Random seed for dropout. |
| `return_sequences` | Boolean. Whether to return the last output in the output sequence, or the full sequence. Default: `False`. |
| `return_state` | Boolean. Whether to return the last state in addition to the output. Default: `False`. |
| `go_backwards` | Boolean (default: `False`). If `True`, process the input sequence backwards and return the reversed sequence. |
| `stateful` | Boolean (default: `False`). If `True`, the last state for each sample at index `i` in a batch will be used as initial state for the sample of index `i` in the following batch. |
| `unroll` | Boolean (default `False`). If `True`, the network will be unrolled, else a symbolic loop will be used. Unrolling can speed up an RNN, although it tends to be more memory-intensive. Unrolling is only suitable for short sequences. |
| `use_cudnn` | Whether to use a cuDNN-backed implementation. `"auto"` will attempt to use cuDNN when feasible, and will fall back to the default implementation if not. |
Methods
from_config
@classmethod
from_config(
    config
)
Creates a layer from its config.
This method is the reverse of `get_config`, capable of instantiating the same layer from the config dictionary. It does not handle layer connectivity (handled by `Network`), nor weights (handled by `set_weights`).
| Args | |
|---|---|
| `config` | A Python dictionary, typically the output of `get_config`. |
| Returns | |
|---|---|
| A layer instance. | |
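A minimal round-trip sketch (the variable names are illustrative):
import keras

layer = keras.layers.LSTM(4, return_sequences=True)
config = layer.get_config()  # plain Python dict describing the layer
clone = keras.layers.LSTM.from_config(config)  # same configuration, fresh weights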
get_initial_state
get_initial_state(
    batch_size
)
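No description is provided for this method above. As an assumption-laden sketch: in Keras 3 this returns the zero-filled initial state for a batch, which for an LSTM is a list `[h, c]` of tensors of shape `(batch_size, units)` that can be fed back through the `initial_state` call argument:
import numpy as np
import keras

lstm = keras.layers.LSTM(4)
state = lstm.get_initial_state(batch_size=32)  # assumed: [h, c], each (32, 4)
inputs = np.random.random((32, 10, 8))
output = lstm(inputs, initial_state=state)  # same result as the default zero state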
inner_loop
inner_loop(
    sequences, initial_state, mask, training=False
)
reset_state
reset_state()
reset_states
reset_states()
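These are relevant for `stateful=True` layers, where the carried state must be cleared between independent sequence sets. A minimal sketch (assuming the fixed batch size is established by the first call; `reset_state` presumably behaves as a newer alias of the long-standing `reset_states` spelling):
import numpy as np
import keras

lstm = keras.layers.LSTM(4, stateful=True)
batch = np.random.random((2, 10, 8))
lstm(batch)          # final [h, c] is kept and seeds the next call
lstm(batch)          # processed as a continuation of the previous batch
lstm.reset_states()  # zero the carried state before unrelated sequences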
symbolic_call
symbolic_call(
    *args, **kwargs
)