Returns a tff.learning.optimizers.Optimizer for Adam.
tff.learning.optimizers.build_adam(
    learning_rate: float,
    beta_1: float = 0.9,
    beta_2: float = 0.999,
    epsilon: float = 1e-07
) -> tff.learning.optimizers.Optimizer
The Adam optimizer is based on Adam: A Method for Stochastic Optimization.
The update rule, given learning rate lr, epsilon eps, accumulator acc, preconditioner s, iteration t, weights w, and gradients g, is:
acc = beta_1 * acc + (1 - beta_1) * g
s = beta_2 * s + (1 - beta_2) * g**2
normalized_lr = lr * sqrt(1 - beta_2**t) / (1 - beta_1**t)
w = w - normalized_lr * acc / (sqrt(s) + eps)
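For reference, the same update rule as a small, self-contained NumPy sketch. This is an illustration of the pseudocode above, not the library's implementation; the function name and toy loss are made up for the example.

```python
import numpy as np

def adam_step(w, g, acc, s, t, lr=0.01, beta_1=0.9, beta_2=0.999, eps=1e-7):
    """One Adam update, mirroring the pseudocode above (illustration only)."""
    acc = beta_1 * acc + (1 - beta_1) * g             # accumulator (first moment)
    s = beta_2 * s + (1 - beta_2) * g**2              # preconditioner (second moment)
    normalized_lr = lr * np.sqrt(1 - beta_2**t) / (1 - beta_1**t)  # bias correction
    w = w - normalized_lr * acc / (np.sqrt(s) + eps)
    return w, acc, s

# Example: a few steps on the toy loss 0.5 * ||w||**2, whose gradient is w itself.
w = np.array([1.0, -2.0])
acc, s = np.zeros_like(w), np.zeros_like(w)
for t in range(1, 6):   # t starts at 1 so the bias-correction terms are well defined
    w, acc, s = adam_step(w, w, acc, s, t, lr=0.1)
```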
Args | |
---|---|
learning_rate | A positive float for learning rate.
beta_1 | A float between 0.0 and 1.0 for the decay used to track the previous gradients.
beta_2 | A float between 0.0 and 1.0 for the decay used to track the magnitude (second moment) of previous gradients.
epsilon | A small non-negative float, used to maintain numerical stability.
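A minimal usage sketch follows. It assumes the stateless contract of tff.learning.optimizers.Optimizer (initialize builds optimizer state from the weight specs, and next returns the updated state and weights); the tensor values are placeholders for the example.

```python
import tensorflow as tf
import tensorflow_federated as tff

# Build the optimizer; only learning_rate is required, the betas and epsilon
# default to the values documented above.
optimizer = tff.learning.optimizers.build_adam(learning_rate=0.01)

# Model weights and a matching structure of gradients (a single tensor here).
weights = tf.constant([1.0, -2.0, 3.0])
gradients = tf.constant([0.1, 0.1, 0.1])

# The optimizer is stateless: state is created once from the weight specs and
# threaded explicitly through each call to next.
state = optimizer.initialize(tf.TensorSpec.from_tensor(weights))
state, weights = optimizer.next(state, weights, gradients)
```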