Optimizer that implements the NAdam algorithm.
Inherits From: Optimizer
tf.keras.optimizers.Nadam(
    learning_rate=0.001, beta_1=0.9, beta_2=0.999, epsilon=1e-07, name='Nadam',
    **kwargs
)
Much like Adam is essentially RMSprop with momentum, Nadam is Adam with Nesterov momentum.
In the update rule, the gradient is evaluated at theta(t) + momentum * v(t), and the variables always store theta + beta_1 * m / sqrt(v) instead of theta.
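As an illustration of how the Nesterov term modifies the Adam update, the sketch below shows a commonly cited simplified form of a single Nadam step in plain NumPy; it omits the momentum-decay schedule of Dozat's formulation, so treat it as a sketch rather than the library's exact computation.
import numpy as np

def nadam_step(theta, g, m, v, t, lr=0.001, beta_1=0.9, beta_2=0.999, eps=1e-7):
    # Adam-style exponential moving averages of the gradient and its square.
    m = beta_1 * m + (1 - beta_1) * g
    v = beta_2 * v + (1 - beta_2) * g ** 2
    m_hat = m / (1 - beta_1 ** t)          # bias-corrected first moment
    v_hat = v / (1 - beta_2 ** t)          # bias-corrected second moment
    # Nesterov look-ahead: blend bias-corrected momentum with the current gradient.
    m_bar = beta_1 * m_hat + (1 - beta_1) * g / (1 - beta_1 ** t)
    theta = theta - lr * m_bar / (np.sqrt(v_hat) + eps)
    return theta, m, v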
Reference: Dozat, T., 2015.
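A minimal usage sketch (the variable and quadratic loss are illustrative, not part of the API):
import tensorflow as tf

opt = tf.keras.optimizers.Nadam(learning_rate=0.1)
var = tf.Variable(10.0)
loss = lambda: (var ** 2) / 2.0          # callable returning the value to minimize
for _ in range(10):
    opt.minimize(loss, var_list=[var])   # each call applies one Nadam update
print(var.numpy())                       # var has moved toward the minimum at 0.0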
| Args | |
|---|---|
| learning_rate | A Tensor or a floating point value. The learning rate. | 
| beta_1 | A float value or a constant float tensor. The exponential decay rate for the 1st moment estimates. | 
| beta_2 | A float value or a constant float tensor. The exponential decay rate for the exponentially weighted infinity norm. | 
| epsilon | A small constant for numerical stability. | 
| name | Optional name for the operations created when applying gradients. Defaults to "Nadam". | 
| **kwargs | Keyword arguments. Allowed to be {clipnorm, clipvalue, lr, decay}. clipnorm is clip gradients by norm; clipvalue is clip gradients by value; decay is included for backward compatibility to allow time inverse decay of learning rate; lr is included for backward compatibility, recommended to use learning_rate instead. | 
| Attributes | |
|---|---|
| iterations | Variable. The number of training steps this Optimizer has run. | 
| weights | Returns variables of this Optimizer based on the order created. | 
Methods
add_slot
add_slot(
    var, slot_name, initializer='zeros'
)
Add a new slot variable for var.
add_weight
add_weight(
    name, shape, dtype=None, initializer='zeros', trainable=None,
    synchronization=tf.VariableSynchronization.AUTO,
    aggregation=tf.compat.v1.VariableAggregation.NONE
)
apply_gradients
apply_gradients(
    grads_and_vars, name=None
)
Apply gradients to variables.
This is the second part of minimize(). It returns an Operation that
applies gradients.
| Args | |
|---|---|
| grads_and_vars | List of (gradient, variable) pairs. | 
| name | Optional name for the returned operation. Defaults to the name passed to the Optimizer constructor. | 
| Returns | |
|---|---|
| An Operation that applies the specified gradients. The iterations will be automatically increased by 1. | 
| Raises | |
|---|---|
| TypeError | If grads_and_vars is malformed. | 
| ValueError | If none of the variables have gradients. | 
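A minimal sketch of the manual training-step pattern this method supports, assuming an illustrative scalar loss (not from this page):
import tensorflow as tf

opt = tf.keras.optimizers.Nadam(learning_rate=0.01)
w = tf.Variable([1.0, -1.0])

with tf.GradientTape() as tape:
    loss = tf.reduce_sum(w ** 2)          # any scalar loss depending on w

grads = tape.gradient(loss, [w])
opt.apply_gradients(zip(grads, [w]))      # pair each gradient with its variable
print(opt.iterations.numpy())             # step counter increased by 1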
from_config
@classmethod
from_config(
    config, custom_objects=None
)
Creates an optimizer from its config.
This method is the reverse of get_config,
capable of instantiating the same optimizer from the config
dictionary.
| Arguments | |
|---|---|
| config | A Python dictionary, typically the output of get_config. | 
| custom_objects | A Python dictionary mapping names to additional Python objects used to create this optimizer, such as a function used for a hyperparameter. | 
| Returns | |
|---|---|
| An optimizer instance. | 
get_config
get_config()
Returns the config of the optimizer.
An optimizer config is a Python dictionary (serializable) containing the configuration of an optimizer. The same optimizer can be reinstantiated later (without any saved state) from this configuration.
| Returns | |
|---|---|
| Python dictionary. | 
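A short sketch of round-tripping a Nadam instance through its config (the hyperparameter values are illustrative):
import tensorflow as tf

opt = tf.keras.optimizers.Nadam(learning_rate=0.002, beta_1=0.85)
config = opt.get_config()                         # serializable dict of hyperparameters

# Recreate an equivalent optimizer; slot variables and iteration count are not restored.
restored = tf.keras.optimizers.Nadam.from_config(config)
print(restored.get_config()['learning_rate'])     # same hyperparameters as opt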
get_gradients
get_gradients(
    loss, params
)
Returns gradients of loss with respect to params.
| Arguments | |
|---|---|
| loss | Loss tensor. | 
| params | List of variables. | 
| Returns | |
|---|---|
| List of gradient tensors. | 
| Raises | |
|---|---|
| ValueError | In case any gradient cannot be computed (e.g. if gradient function not implemented). | 
get_slot
get_slot(
    var, slot_name
)
get_slot_names
get_slot_names()
A list of names for this optimizer's slots.
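A sketch of inspecting slots after one update step; the 'm' and 'v' slot names below are an assumption based on Adam-family optimizers, so check get_slot_names() on your own instance:
import tensorflow as tf

opt = tf.keras.optimizers.Nadam()
var = tf.Variable([1.0, 2.0])

# Slots are created lazily, on the first gradient application.
opt.apply_gradients([(tf.constant([0.1, 0.1]), var)])

print(opt.get_slot_names())     # expected to include 'm' and 'v' (assumption)
m = opt.get_slot(var, 'm')      # per-variable first-moment accumulator
print(m.numpy())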
get_updates
get_updates(
    loss, params
)
get_weights
get_weights()
minimize
minimize(
    loss, var_list, grad_loss=None, name=None
)
Minimize loss by updating var_list.
This method simply computes gradients using tf.GradientTape and calls
apply_gradients(). If you want to process the gradients before applying
them, call tf.GradientTape and apply_gradients() explicitly instead
of using this function.
| Args | |
|---|---|
| loss | A callable taking no arguments which returns the value to minimize. | 
| var_list | List or tuple of Variable objects to update to minimize loss, or a callable returning the list or tuple of Variable objects. Use a callable when the variable list would otherwise be incomplete before minimize, since the variables are created the first time loss is called. | 
| grad_loss | Optional. A Tensor holding the gradient computed for loss. | 
| name | Optional name for the returned operation. | 
| Returns | |
|---|---|
| An Operation that updates the variables in var_list. The iterations will be automatically increased by 1. | 
| Raises | |
|---|---|
| ValueError | If some of the variables are not Variable objects. | 
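A sketch of the callable var_list pattern described above; the small Dense layer is illustrative, and its variables only exist after loss is first called:
import tensorflow as tf

opt = tf.keras.optimizers.Nadam()
layer = tf.keras.layers.Dense(1)
x = tf.ones((4, 3))

loss = lambda: tf.reduce_mean(layer(x) ** 2)   # building the layer creates its variables

# Pass a callable so the not-yet-created variables are picked up lazily.
opt.minimize(loss, var_list=lambda: layer.trainable_variables)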
set_weights
set_weights(
    weights
)
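A hedged sketch of copying optimizer state with get_weights/set_weights, assuming both optimizers have already created matching weights by applying at least one gradient:
import tensorflow as tf

var_a = tf.Variable(1.0)
var_b = tf.Variable(1.0)
opt_a = tf.keras.optimizers.Nadam()
opt_b = tf.keras.optimizers.Nadam()

# First use creates each optimizer's weights (iteration counter plus slots).
opt_a.apply_gradients([(tf.constant(0.5), var_a)])
opt_b.apply_gradients([(tf.constant(0.5), var_b)])

# Copy the full state (iterations, m, v, ...) from opt_a into opt_b.
opt_b.set_weights(opt_a.get_weights())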
variables
variables()
Returns variables of this Optimizer based on the order created.