TensorFlow 1 version | View source on GitHub
Optimizer that implements the FTRL algorithm.
Inherits From: Optimizer
tf.keras.optimizers.Ftrl(
    learning_rate=0.001, learning_rate_power=-0.5, initial_accumulator_value=0.1,
    l1_regularization_strength=0.0, l2_regularization_strength=0.0,
    name='Ftrl', l2_shrinkage_regularization_strength=0.0, beta=0.0,
    **kwargs
)
See Algorithm 1 of this paper. This version has support for both online L2 (the L2 penalty given in the paper above) and shrinkage-type L2 (which is the addition of an L2 penalty to the loss function).
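Initialization:

$$n_{0} = 0, \quad \sigma_{0} = 0, \quad z_{0} = 0$$

Update ($i$ is the variable index, $\alpha$ is the learning rate):

$$n_{t,i} = n_{t-1,i} + g_{t,i}^{2}$$
$$\sigma_{t,i} = \left(\sqrt{n_{t,i}} - \sqrt{n_{t-1,i}}\right) / \alpha$$
$$z_{t,i} = z_{t-1,i} + g_{t,i} - \sigma_{t,i} w_{t,i}$$
$$w_{t+1,i} = \begin{cases} 0 & \text{if } \lvert z_{t,i} \rvert \le \lambda_{1} \\ -\left(\dfrac{\beta + \sqrt{n_{t,i}}}{\alpha} + 2\lambda_{2}\right)^{-1} \left(z_{t,i} - \operatorname{sgn}(z_{t,i})\,\lambda_{1}\right) & \text{otherwise} \end{cases}$$

Here $g_{t,i}$ is the gradient, $\lambda_{1}$ and $\lambda_{2}$ are the L1 and L2 regularization strengths, and $\beta$ is the beta argument.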
When shrinkage is enabled, the gradient in the update is replaced with gradient_with_shrinkage. See the documentation for the l2_shrinkage_regularization_strength parameter for details.
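The per-coordinate update can be sketched in plain Python for a single scalar weight. This is a minimal illustration, not the TensorFlow implementation. It assumes learning_rate_power = -0.5 (the default), and the function name `ftrl_step` and its short argument names are hypothetical.

```python
import math


def ftrl_step(w, z, n, g, lr=0.001, l1=0.0, l2=0.0, beta=0.0):
    """One FTRL-Proximal step for a single scalar weight.

    Hypothetical helper for illustration; assumes learning_rate_power = -0.5.
    w: current weight, z: linear accumulator, n: squared-gradient accumulator,
    g: gradient of the loss with respect to w.
    """
    n_new = n + g * g                                # accumulate squared gradient
    sigma = (math.sqrt(n_new) - math.sqrt(n)) / lr   # change in inverse step size
    z_new = z + g - sigma * w                        # update the linear term
    if abs(z_new) <= l1:
        w_new = 0.0                                  # L1 threshold zeroes the weight
    else:
        sign = 1.0 if z_new > 0 else -1.0
        denom = (beta + math.sqrt(n_new)) / lr + 2.0 * l2
        w_new = -(z_new - sign * l1) / denom
    return w_new, z_new, n_new


# Usage: one step from w = 0, with n started at a small positive value.
w, z, n = ftrl_step(w=0.0, z=0.0, n=0.1, g=1.0, lr=0.1)
```

With a sufficiently large `l1`, the step returns a weight of exactly zero, which is the source of FTRL's sparse solutions.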
Args | |
|---|---|
learning_rate | A Tensor, floating point value, or a schedule that is a tf.keras.optimizers.schedules.LearningRateSchedule. The learning rate. |
learning_rate_power | A float value, must be less than or equal to zero. Controls how the learning rate decreases during training. Use zero for a fixed learning rate. |
initial_accumulator_value | The starting value for accumulators. Only zero or positive values are allowed. |
l1_regularization_strength | A float value, must be greater than or equal to zero. |
l2_regularization_strength | A float value, must be greater than or equal to zero. |
name | Optional name prefix for the operations created when applying gradients. Defaults to "Ftrl". |
l2_shrinkage_regularization_strength | A float value, must be greater than or equal to zero. This differs from L2 above in that the L2 above is a stabilization penalty, whereas this L2 shrinkage is a magnitude penalty. When the input is sparse, shrinkage only happens on the active weights. |
beta | A float value, representing the beta value from the paper. |
**kwargs | Keyword arguments. Allowed to be one of "clipnorm" or "clipvalue". "clipnorm" (float) clips gradients by norm; "clipvalue" (float) clips gradients by value. |
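The shrinkage behaviour described for l2_shrinkage_regularization_strength can be illustrated with a small sketch. It assumes the shrinkage gradient has the form g + 2 * l2_shrinkage * w, as described in the TensorFlow 1 FtrlOptimizer documentation. The function name and the dict-based sparse gradient layout are hypothetical, chosen only for illustration.

```python
def sparse_gradients_with_shrinkage(weights, sparse_grads, l2_shrinkage):
    """Apply shrinkage only to the weights that received a gradient.

    Hypothetical helper. `weights` is a list of floats and `sparse_grads`
    maps the index of each active weight to its gradient.
    Assumes gradient_with_shrinkage = g + 2 * l2_shrinkage * w.
    """
    return {
        i: g + 2.0 * l2_shrinkage * weights[i]
        for i, g in sparse_grads.items()
    }


# Usage: only index 1 is active, so only that weight is shrunk.
weights = [1.0, -2.0, 0.5]
shrunk = sparse_gradients_with_shrinkage(weights, {1: 0.25}, l2_shrinkage=0.5)
```

Weights at indices 0 and 2 receive no gradient in this step, so they are untouched by the magnitude penalty.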
Reference:
Raises | |
|---|---|
ValueError | If any argument is invalid. |
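The constraints listed in the Args table can be collected into a single check. The sketch below is a hypothetical stand-alone validator that mirrors the documented ranges. It is not the library's own validation code, and the error messages are illustrative.

```python
def validate_ftrl_args(learning_rate_power=-0.5,
                       initial_accumulator_value=0.1,
                       l1_regularization_strength=0.0,
                       l2_regularization_strength=0.0,
                       l2_shrinkage_regularization_strength=0.0):
    """Raise ValueError if an argument falls outside its documented range.

    Hypothetical helper; the checks follow the Args table only.
    """
    if learning_rate_power > 0.0:
        raise ValueError("learning_rate_power must be less than or equal to zero")
    non_negative = {
        "initial_accumulator_value": initial_accumulator_value,
        "l1_regularization_strength": l1_regularization_strength,
        "l2_regularization_strength": l2_regularization_strength,
        "l2_shrinkage_regularization_strength": l2_shrinkage_regularization_strength,
    }
    for name, value in non_negative.items():
        if value < 0.0:
            raise ValueError(f"{name} must be greater than or equal to zero")


# Usage: the defaults pass, a positive learning_rate_power does not.
validate_ftrl_args()
try:
    validate_ftrl_args(learning_rate_power=0.5)
except ValueError as err:
    message = str(err)
```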