That is, for the rows for which we have grad, we update var and accum as follows:
accum += grad * grad
prox_v = var
prox_v -= lr * grad * (1 / sqrt(accum))
var = sign(prox_v) / (1 + lr * l2) * max{|prox_v| - lr * l1, 0}
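The update above can be sketched in plain Python to make the row-wise behavior concrete. This is only an illustration of the formulas as written, not the op's actual kernel: the function name sparse_proximal_adagrad is hypothetical, var and accum are assumed to be lists of rows of floats, and accum is assumed to be positive after the gradient is added so that the square root can be divided by.

```python
import math


def sparse_proximal_adagrad(var, accum, lr, l1, l2, grad, indices):
    """Update, in place, only the rows of var and accum named by indices.

    Hypothetical helper for illustration; it follows the documented
    formulas literally and is not the TensorFlow implementation.
    var, accum: lists of rows (each row a list of floats).
    grad: one gradient row per entry of indices.
    """
    for row, g_row in zip(indices, grad):
        for j, g in enumerate(g_row):
            # accum += grad * grad
            accum[row][j] += g * g
            # prox_v = var; prox_v -= lr * grad * (1 / sqrt(accum))
            prox_v = var[row][j] - lr * g * (1.0 / math.sqrt(accum[row][j]))
            # var = sign(prox_v) / (1 + lr * l2) * max{|prox_v| - lr * l1, 0}
            shrunk = max(abs(prox_v) - lr * l1, 0.0)
            var[row][j] = math.copysign(1.0, prox_v) / (1.0 + lr * l2) * shrunk
    return var, accum


# Example: only row 1 receives a gradient, so row 0 is left untouched.
var = [[9.0, 9.0], [3.0, -1.5]]
accum = [[0.0, 0.0], [0.0, 0.0]]
sparse_proximal_adagrad(var, accum, lr=1.0, l1=0.5, l2=0.5,
                        grad=[[2.0, -2.0]], indices=[1])
```

In this example the entry 3.0 is first moved to prox_v = 2.0, shrunk by lr * l1 to 1.5, then scaled by 1 / (1 + lr * l2) to 1.0. The entry -1.5 is moved to -0.5, which the L1 threshold clips to zero.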
Args
var
A mutable Tensor. Must be one of the following types: float32, float64, int32, uint8, int16, int8, complex64, int64, qint8, quint8, qint32, bfloat16, qint16, quint16, uint16, complex128, half, uint32, uint64.
Should be from a Variable().
accum
A mutable Tensor. Must have the same type as var.
Should be from a Variable().
lr
A Tensor. Must have the same type as var.
Learning rate. Must be a scalar.
l1
A Tensor. Must have the same type as var.
L1 regularization. Must be a scalar.
l2
A Tensor. Must have the same type as var.
L2 regularization. Must be a scalar.
grad
A Tensor. Must have the same type as var.
The gradient.
indices
A Tensor. Must be one of the following types: int32, int64.
A vector of indices into the first dimension of var and accum.
use_locking
An optional bool. Defaults to False.
If True, updating of the var and accum tensors will be protected by
a lock; otherwise the behavior is undefined, but may exhibit less contention.