Update '*var' according to the Adam algorithm.
tf.raw_ops.ApplyAdam(
var, m, v, beta1_power, beta2_power, lr, beta1, beta2, epsilon, grad,
use_locking=False, use_nesterov=False, name=None
)
$$lr_t := \text{learning\_rate} * \sqrt{1 - beta_2^t} / (1 - beta_1^t)$$
$$m_t := beta_1 * m_{t-1} + (1 - beta_1) * g$$
$$v_t := beta_2 * v_{t-1} + (1 - beta_2) * g * g$$
$$variable := variable - lr_t * m_t / (\sqrt{v_t} + \epsilon)$$
Args | |
|---|---|
var
|
A mutable Tensor. Must be one of the following types: float32, float64, int32, uint8, int16, int8, complex64, int64, qint8, quint8, qint32, bfloat16, uint16, complex128, half, uint32, uint64.
Should be from a Variable().
|
m
|
A mutable Tensor. Must have the same type as var.
Should be from a Variable().
|
v
|
A mutable Tensor. Must have the same type as var.
Should be from a Variable().
|
beta1_power
|
A Tensor. Must have the same type as var.
Must be a scalar.
|
beta2_power
|
A Tensor. Must have the same type as var.
Must be a scalar.
|
lr
|
A Tensor. Must have the same type as var.
Scaling factor. Must be a scalar.
|
beta1
|
A Tensor. Must have the same type as var.
Momentum factor. Must be a scalar.
|
beta2
|
A Tensor. Must have the same type as var.
Momentum factor. Must be a scalar.
|
epsilon
|
A Tensor. Must have the same type as var.
Ridge term. Must be a scalar.
|
grad
|
A Tensor. Must have the same type as var. The gradient.
|
use_locking
|
An optional bool. Defaults to False.
If True, updating of the var, m, and v tensors will be protected
by a lock; otherwise the behavior is undefined, but may exhibit less
contention.
|
use_nesterov
|
An optional bool. Defaults to False.
If True, uses the nesterov update.
|
name
|
A name for the operation (optional). |
Returns | |
|---|---|
A mutable Tensor. Has the same type as var.
|