Update '*var' according to the Adam algorithm.
tf.raw_ops.ApplyAdam(
    var, m, v, beta1_power, beta2_power, lr, beta1, beta2, epsilon, grad,
    use_locking=False, use_nesterov=False, name=None
)
$$\text{lr}_t := \text{learning\_rate} \cdot \sqrt{1 - \beta_2^t} / (1 - \beta_1^t)$$
$$m_t := \beta_1 \cdot m_{t-1} + (1 - \beta_1) \cdot g$$
$$v_t := \beta_2 \cdot v_{t-1} + (1 - \beta_2) \cdot g \cdot g$$
$$\text{variable} := \text{variable} - \text{lr}_t \cdot m_t / (\sqrt{v_t} + \epsilon)$$
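
The four update equations above can be traced with a small NumPy sketch. This is an illustrative reference only, not the op's kernel: the helper name `adam_step`, the explicit step counter `t`, and the default hyperparameter values are assumptions introduced here, and `g` stands for the gradient.

```python
import numpy as np

def adam_step(variable, m, v, g, t,
              learning_rate=0.001, beta1=0.9, beta2=0.999, epsilon=1e-8):
    """One Adam step following the equations above (hypothetical helper)."""
    # Step size with the bias-correction factors folded in.
    lr_t = learning_rate * np.sqrt(1.0 - beta2 ** t) / (1.0 - beta1 ** t)
    # Exponential moving average of the gradient (first moment).
    m_t = beta1 * m + (1.0 - beta1) * g
    # Exponential moving average of the squared gradient (second moment).
    v_t = beta2 * v + (1.0 - beta2) * g * g
    # Parameter update.
    variable_t = variable - lr_t * m_t / (np.sqrt(v_t) + epsilon)
    return variable_t, m_t, v_t

# First step (t = 1) starting from zero-initialized moments.
variable = np.array([1.0, -2.0])
m = np.zeros_like(variable)
v = np.zeros_like(variable)
g = np.array([0.5, -0.5])
variable, m, v = adam_step(variable, m, v, g, t=1)
```

On the first step from zero moments, each element moves by roughly `learning_rate` in the direction opposite to the sign of its gradient, because the bias-correction factors cancel the initial shrinkage of the moment estimates.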
| Args | |
|---|---|
| var | A mutable Tensor. Must be one of the following types: float32, float64, int32, uint8, int16, int8, complex64, int64, qint8, quint8, qint32, bfloat16, uint16, complex128, half, uint32, uint64. Should be from a Variable(). |
| m | A mutable Tensor. Must have the same type as var. Should be from a Variable(). |
| v | A mutable Tensor. Must have the same type as var. Should be from a Variable(). |
| beta1_power | A Tensor. Must have the same type as var. Must be a scalar. |
| beta2_power | A Tensor. Must have the same type as var. Must be a scalar. |
| lr | A Tensor. Must have the same type as var. Scaling factor. Must be a scalar. |
| beta1 | A Tensor. Must have the same type as var. Momentum factor. Must be a scalar. |
| beta2 | A Tensor. Must have the same type as var. Momentum factor. Must be a scalar. |
| epsilon | A Tensor. Must have the same type as var. Ridge term. Must be a scalar. |
| grad | A Tensor. Must have the same type as var. The gradient. |
| use_locking | An optional bool. Defaults to False. If True, updating of the var, m, and v tensors will be protected by a lock; otherwise the behavior is undefined, but may exhibit less contention. |
| use_nesterov | An optional bool. Defaults to False. If True, uses the nesterov update. |
| name | A name for the operation (optional). |
| Returns | |
|---|---|
| A mutable Tensor. Has the same type as var. | |
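
The argument list suggests that the caller, rather than the op, keeps track of the running powers of beta1 and beta2. The NumPy sketch below mirrors the op's argument order under that assumption, taking beta1_power and beta2_power to be beta1 and beta2 raised to the current step count. The name `apply_adam_reference` is hypothetical, NumPy arrays stand in for mutable Tensors, and use_locking and use_nesterov are not modeled.

```python
import numpy as np

def apply_adam_reference(var, m, v, beta1_power, beta2_power,
                         lr, beta1, beta2, epsilon, grad):
    """NumPy stand-in with the op's argument order; mutates var, m, v in place."""
    # Assumption: beta1_power == beta1**t and beta2_power == beta2**t.
    lr_t = lr * np.sqrt(1.0 - beta2_power) / (1.0 - beta1_power)
    # In-place moment updates, analogous to updating mutable tensors.
    m *= beta1
    m += (1.0 - beta1) * grad
    v *= beta2
    v += (1.0 - beta2) * grad * grad
    # In-place parameter update; the updated var is also returned.
    var -= lr_t * m / (np.sqrt(v) + epsilon)
    return var

beta1, beta2 = 0.9, 0.999
var = np.array([1.0, 2.0, 3.0])
m = np.zeros_like(var)
v = np.zeros_like(var)
beta1_power, beta2_power = 1.0, 1.0

for _ in range(3):
    grad = 2.0 * var          # gradient of sum(var**2), a toy objective
    beta1_power *= beta1      # caller advances the powers each step
    beta2_power *= beta2
    out = apply_adam_reference(var, m, v, beta1_power, beta2_power,
                               0.01, beta1, beta2, 1e-8, grad)
```

Passing the powers in as scalars keeps the update itself stateless with respect to the step count; only var, m, and v carry state between calls in this sketch.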