Optimizer that implements the FTRL algorithm.
Inherits From: Optimizer
tf.keras.optimizers.Ftrl(
    learning_rate=0.001, learning_rate_power=-0.5, initial_accumulator_value=0.1,
    l1_regularization_strength=0.0, l2_regularization_strength=0.0, name='Ftrl',
    l2_shrinkage_regularization_strength=0.0, **kwargs
)
See Algorithm 1 of this paper. This version has support for both online L2 (the L2 penalty given in the paper above) and shrinkage-type L2 (which is the addition of an L2 penalty to the loss function).
Initialization:
$$t = 0$$
$$n_{0} = 0$$
$$\sigma_{0} = 0$$
$$z_{0} = 0$$
Update ($$i$$ is variable index):
$$t = t + 1$$
$$n_{t,i} = n_{t-1,i} + g_{t,i}^{2}$$
$$\sigma_{t,i} = (\sqrt{n_{t,i}} - \sqrt{n_{t-1,i}}) / \alpha$$
$$z_{t,i} = z_{t-1,i} + g_{t,i} - \sigma_{t,i} w_{t,i}$$
$$w_{t,i} = -((\beta + \sqrt{n_{t,i}}) / \alpha + \lambda_{2})^{-1} (z_{t,i} - \mathrm{sgn}(z_{t,i})\,\lambda_{1}) \text{ if } |z_{t,i}| > \lambda_{1} \text{ else } 0$$
where $$\alpha$$ is the learning rate, $$\lambda_{1}$$ and $$\lambda_{2}$$ are the L1 and L2 regularization strengths, $$g$$ is the gradient, and $$\beta$$ is the smoothing term from Algorithm 1 of the paper.
Check the documentation for the l2_regularization_strength and l2_shrinkage_regularization_strength parameters for more details; when shrinkage is enabled, the gradient is replaced with gradient_with_shrinkage.
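A minimal usage sketch (hyperparameter values here are illustrative, not recommendations): construct the optimizer and drive a single update through minimize() with a zero-argument callable loss.

import tensorflow as tf

opt = tf.keras.optimizers.Ftrl(
    learning_rate=0.01,
    learning_rate_power=-0.5,
    l1_regularization_strength=0.001,
    l2_regularization_strength=0.001)

var = tf.Variable([0.5, -0.5])
loss = lambda: tf.reduce_sum(var ** 2)  # minimize() expects a callable taking no arguments
opt.minimize(loss, var_list=[var])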
| Args | |
|---|---|
| learning_rate | A float value or a constant float Tensor. | 
| learning_rate_power | A float value, must be less or equal to zero. Controls how the learning rate decreases during training. Use zero for a fixed learning rate. | 
| initial_accumulator_value | The starting value for accumulators. Only zero or positive values are allowed. | 
| l1_regularization_strength | A float value, must be greater than or equal to zero. | 
| l2_regularization_strength | A float value, must be greater than or equal to zero. | 
| name | Optional name prefix for the operations created when applying gradients. Defaults to "Ftrl". | 
| l2_shrinkage_regularization_strength | A float value, must be greater than
or equal to zero. This differs from L2 above in that the L2 above is a
stabilization penalty, whereas this L2 shrinkage is a magnitude penalty.
The FTRL formulation can be written as:
w_{t+1} = argmin_w(\hat{g}_{1:t} w + L1 * ||w||_1 + L2 * ||w||_2^2), where
\hat{g} = g + (2 * L2_shrinkage * w), and g is the gradient of the loss
function w.r.t. the weights w.
Specifically, in the absence of L1 regularization, it is equivalent to
the following update rule:
w_{t+1} = w_t - lr_t / (1 + 2 * L2 * lr_t) * g_t -
2 * L2_shrinkage * lr_t / (1 + 2 * L2 * lr_t) * w_t
where lr_t is the learning rate at t (a small numeric check of this
equivalence is sketched after this table).
When input is sparse, shrinkage will only happen on the active weights. | 
| **kwargs | Keyword arguments. Allowed to be {clipnorm, clipvalue, lr, decay}. clipnorm is clip gradients by norm; clipvalue is clip gradients by value; decay is included for backward compatibility to allow time inverse decay of learning rate; lr is included for backward compatibility, recommended to use learning_rate instead. | 
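The sketch below only sanity-checks the algebra quoted in the l2_shrinkage_regularization_strength description: substituting gradient_with_shrinkage, \hat{g} = g + 2 * L2_shrinkage * w, into the no-shrinkage form of the update reproduces the stated closed form. The values are illustrative and this is not the optimizer's actual kernel.

import numpy as np

lr, L2, L2_shrinkage = 0.1, 0.01, 0.05   # illustrative hyperparameters
w, g = 0.8, 0.3                          # current weight and raw gradient

# Closed-form update quoted above (no L1):
w_next_stated = (w
                 - lr / (1 + 2 * L2 * lr) * g
                 - 2 * L2_shrinkage * lr / (1 + 2 * L2 * lr) * w)

# Same rule with the gradient replaced by gradient_with_shrinkage:
g_shrink = g + 2 * L2_shrinkage * w
w_next_substituted = w - lr / (1 + 2 * L2 * lr) * g_shrink

assert np.isclose(w_next_stated, w_next_substituted)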
| Raises | |
|---|---|
| ValueError | If one of the arguments is invalid. | 
| Attributes | |
|---|---|
| iterations | Variable. The number of training steps this Optimizer has run. | 
| weights | Returns variables of this Optimizer based on the order created. | 
Methods
add_slot
add_slot(
    var, slot_name, initializer='zeros'
)
Add a new slot variable for var.
add_weight
add_weight(
    name, shape, dtype=None, initializer='zeros', trainable=None,
    synchronization=tf.VariableSynchronization.AUTO,
    aggregation=tf.VariableAggregation.NONE
)
apply_gradients
apply_gradients(
    grads_and_vars, name=None
)
Apply gradients to variables.
This is the second part of minimize(). It returns an Operation that
applies gradients.
| Args | |
|---|---|
| grads_and_vars | List of (gradient, variable) pairs. | 
| name | Optional name for the returned operation. Defaults to the name passed to the Optimizer constructor. | 
| Returns | |
|---|---|
| An Operation that applies the specified gradients. The iterations counter will be automatically increased by 1. | 
| Raises | |
|---|---|
| TypeError | If grads_and_vars is malformed. | 
| ValueError | If none of the variables have gradients. | 
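A minimal sketch of the compute-then-apply pattern: obtain the gradients explicitly with tf.GradientTape, then hand (gradient, variable) pairs to apply_gradients.

import tensorflow as tf

opt = tf.keras.optimizers.Ftrl(learning_rate=0.1)
var = tf.Variable(1.0)
with tf.GradientTape() as tape:
    loss = var ** 2
grads = tape.gradient(loss, [var])
opt.apply_gradients(zip(grads, [var]))  # also increments opt.iterations by 1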
from_config
@classmethod
from_config(
    config, custom_objects=None
)
Creates an optimizer from its config.
This method is the reverse of get_config,
capable of instantiating the same optimizer from the config
dictionary.
| Arguments | |
|---|---|
| config | A Python dictionary, typically the output of get_config. | 
| custom_objects | A Python dictionary mapping names to additional Python objects used to create this optimizer, such as a function used for a hyperparameter. | 
| Returns | |
|---|---|
| An optimizer instance. | 
get_config
get_config()
Returns the config of the optimizer.
An optimizer config is a Python dictionary (serializable) containing the configuration of an optimizer. The same optimizer can be reinstantiated later (without any saved state) from this configuration.
| Returns | |
|---|---|
| Python dictionary. | 
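A small sketch of the get_config / from_config round trip; only the configuration is restored, not any slot state.

import tensorflow as tf

opt = tf.keras.optimizers.Ftrl(
    learning_rate=0.01, l1_regularization_strength=0.001)
config = opt.get_config()                            # plain Python dict
restored = tf.keras.optimizers.Ftrl.from_config(config)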
get_gradients
get_gradients(
    loss, params
)
Returns gradients of loss with respect to params.
| Arguments | |
|---|---|
| loss | Loss tensor. | 
| params | List of variables. | 
| Returns | |
|---|---|
| List of gradient tensors. | 
| Raises | |
|---|---|
| ValueError | In case any gradient cannot be computed (e.g. if gradient function not implemented). | 
get_slot
get_slot(
    var, slot_name
)
get_slot_names
get_slot_names()
A list of names for this optimizer's slots.
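Slot variables are created the first time gradients are applied, so apply a step before querying them. A minimal sketch that reads the slot names rather than assuming them:

import tensorflow as tf

opt = tf.keras.optimizers.Ftrl(learning_rate=0.1)
var = tf.Variable([1.0, 2.0])
with tf.GradientTape() as tape:
    loss = tf.reduce_sum(var ** 2)
opt.apply_gradients(zip(tape.gradient(loss, [var]), [var]))

names = opt.get_slot_names()         # FTRL's per-variable slot names
slot = opt.get_slot(var, names[0])   # the corresponding slot Variable for var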
get_updates
get_updates(
    loss, params
)
get_weights
get_weights()
minimize
minimize(
    loss, var_list, grad_loss=None, name=None
)
Minimize loss by updating var_list.
This method simply computes the gradients using tf.GradientTape and calls
apply_gradients(). If you want to process the gradients before applying
them, call tf.GradientTape and apply_gradients() explicitly instead
of using this function.
| Args | |
|---|---|
| loss | A callable taking no arguments which returns the value to minimize. | 
| var_list | list or tuple of Variable objects to update to minimize loss, or a callable returning the list or tuple of Variable objects. Use callable when the variable list would otherwise be incomplete before minimize since the variables are created at the first time loss is called. | 
| grad_loss | Optional. A Tensor holding the gradient computed for loss. | 
| name | Optional name for the returned operation. | 
| Returns | |
|---|---|
| An Operation that updates the variables in var_list. If global_step was not None, that operation also increments global_step. | 
| Raises | |
|---|---|
| ValueError | If some of the variables are not Variable objects. | 
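A sketch of the callable var_list case described above, using a Keras layer whose variables only exist after the layer is first called inside the loss (the layer and shapes here are illustrative).

import tensorflow as tf

layer = tf.keras.layers.Dense(1)
opt = tf.keras.optimizers.Ftrl(learning_rate=0.1)
x = tf.ones((2, 3))

loss = lambda: tf.reduce_mean(layer(x) ** 2)
# Pass a callable var_list: the layer's variables are created the first
# time `loss` is evaluated, which happens before var_list is resolved.
opt.minimize(loss, var_list=lambda: layer.trainable_variables)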
set_weights
set_weights(
    weights
)
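A sketch of copying optimizer state with get_weights / set_weights. The weights (iteration count plus slot variables) exist only after gradients have been applied, so both optimizers take one step against the same variable first; names here are illustrative.

import tensorflow as tf

var = tf.Variable([1.0, 2.0])

def step(opt):
    with tf.GradientTape() as tape:
        loss = tf.reduce_sum(var ** 2)
    opt.apply_gradients(zip(tape.gradient(loss, [var]), [var]))

opt_a = tf.keras.optimizers.Ftrl(learning_rate=0.1)
opt_b = tf.keras.optimizers.Ftrl(learning_rate=0.1)
step(opt_a)
step(opt_b)  # creates matching weights/slots in opt_b

opt_b.set_weights(opt_a.get_weights())  # copy iteration count and slot values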
variables
variables()
Returns variables of this Optimizer based on the order created.