tf.raw_ops.ApplyAdaMax
Update '*var' according to the AdaMax algorithm.
tf.raw_ops.ApplyAdaMax(
    var,
    m,
    v,
    beta1_power,
    lr,
    beta1,
    beta2,
    epsilon,
    grad,
    use_locking=False,
    name=None
)
m_t <- beta1 * m_{t-1} + (1 - beta1) * g
v_t <- max(beta2 * v_{t-1}, abs(g))
variable <- variable - learning_rate / (1 - beta1^t) * m_t / (v_t + epsilon)
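The update rule above can be traced by hand with a small pure-Python sketch. This is only an illustration of the three formulas, not the op's implementation: the function name `adamax_step` is hypothetical, tensors are stood in for by plain lists, and the sketch returns new lists, whereas the op itself updates var, m, and v in place.

```python
def adamax_step(var, m, v, beta1_power, lr, beta1, beta2, epsilon, grad):
    """Return updated (var, m, v) lists after one elementwise AdaMax step.

    Hypothetical helper for illustration; argument names mirror the op's Args.
    """
    new_var, new_m, new_v = [], [], []
    for var_i, m_i, v_i, g_i in zip(var, m, v, grad):
        # m_t <- beta1 * m_{t-1} + (1 - beta1) * g
        m_t = beta1 * m_i + (1.0 - beta1) * g_i
        # v_t <- max(beta2 * v_{t-1}, abs(g))
        v_t = max(beta2 * v_i, abs(g_i))
        # variable <- variable - learning_rate / (1 - beta1^t) * m_t / (v_t + epsilon)
        new_var.append(var_i - lr / (1.0 - beta1_power) * m_t / (v_t + epsilon))
        new_m.append(m_t)
        new_v.append(v_t)
    return new_var, new_m, new_v


# First step (t = 1), so beta1_power is beta1 ** 1.
beta1 = 0.9
new_var, new_m, new_v = adamax_step(
    var=[1.0, -2.0],
    m=[0.0, 0.0],
    v=[0.0, 0.0],
    beta1_power=beta1 ** 1,
    lr=0.1,
    beta1=beta1,
    beta2=0.999,
    epsilon=1e-8,
    grad=[0.5, -1.0],
)
print(new_var, new_m, new_v)
```

With m and v starting at zero, the first step moves each element by roughly lr in the direction opposite to its gradient, because the bias-corrected m_t divided by v_t has magnitude close to 1.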
Args |
var
|
A mutable Tensor. Must be one of the following types: float32, float64, int32, uint8, int16, int8, complex64, int64, qint8, quint8, qint32, bfloat16, qint16, quint16, uint16, complex128, half, uint32, uint64.
Should be from a Variable().
|
m
|
A mutable Tensor. Must have the same type as var.
Should be from a Variable().
|
v
|
A mutable Tensor. Must have the same type as var.
Should be from a Variable().
|
beta1_power
|
A Tensor. Must have the same type as var.
beta1 raised to the power t (the beta1^t term in the update rule). Must be a scalar.
|
lr
|
A Tensor. Must have the same type as var.
Learning rate (scaling factor). Must be a scalar.
|
beta1
|
A Tensor. Must have the same type as var.
Momentum factor. Must be a scalar.
|
beta2
|
A Tensor. Must have the same type as var.
Decay factor applied to v in the max() term of the update rule. Must be a scalar.
|
epsilon
|
A Tensor. Must have the same type as var.
Ridge term. Must be a scalar.
|
grad
|
A Tensor. Must have the same type as var. The gradient.
|
use_locking
|
An optional bool. Defaults to False.
If True, updating of the var, m, and v tensors will be protected
by a lock; otherwise the behavior is undefined, but may exhibit less
contention.
|
name
|
A name for the operation (optional).
|
Returns |
A mutable Tensor. Has the same type as var.
|
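Because beta1_power is an input rather than state kept by the op, the caller has to supply beta1^t at each step. The following pure-Python sketch shows one way that bookkeeping could look over several steps. It is an assumption-laden illustration, not TensorFlow code: `adamax_scalar_step` is a hypothetical scalar stand-in for the op, and the objective f(x) = x^2 is chosen only to have a simple gradient.

```python
def adamax_scalar_step(x, m, v, beta1_power, lr, beta1, beta2, epsilon, grad):
    """Hypothetical scalar version of the update rule; returns new (x, m, v)."""
    m = beta1 * m + (1.0 - beta1) * grad
    v = max(beta2 * v, abs(grad))
    x = x - lr / (1.0 - beta1_power) * m / (v + epsilon)
    return x, m, v


beta1, beta2, lr, epsilon = 0.9, 0.999, 0.1, 1e-8
x, m, v = 1.0, 0.0, 0.0   # parameter and its two slot values
beta1_power = 1.0         # beta1 ** 0; multiplied by beta1 before each step
history = []

for t in range(1, 6):
    grad = 2.0 * x            # gradient of f(x) = x**2
    beta1_power *= beta1      # now equals beta1 ** t
    x, m, v = adamax_scalar_step(
        x, m, v, beta1_power, lr, beta1, beta2, epsilon, grad
    )
    history.append(x)

print(history)
```

Note that beta1_power is advanced before the first step: passing a value of exactly 1 would make the 1 - beta1^t denominator zero.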
Except as otherwise noted, the content of this page is licensed under the Creative Commons Attribution 4.0 License, and code samples are licensed under the Apache 2.0 License. For details, see the Google Developers Site Policies. Java is a registered trademark of Oracle and/or its affiliates. Some content is licensed under the numpy license.
Last updated 2024-01-23 UTC.