

Returns a tff.learning.optimizers.Optimizer for momentum SGD.

This optimizer supports plain gradient descent and its variant with momentum.

If momentum is not used, the update rule given learning rate lr, weights w, and gradients g is:

w = w - lr * g

If momentum m (a float between 0.0 and 1.0) is used, the update rule is:

v = m * v + g
w = w - lr * v

where v is the velocity from previous steps of the optimizer.
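The two update rules above can be sketched in plain Python. This is an illustrative standalone implementation of the formulas, not the library's internal code; the function and variable names mirror the formulas and are assumptions:

```python
import numpy as np

def sgdm_step(w, g, v, lr=0.01, m=0.0):
    """One momentum-SGD step on weights w with gradients g.

    v is the velocity carried over from previous steps.
    With m == 0.0 this reduces to plain gradient descent.
    """
    v = m * v + g   # v = m * v + g
    w = w - lr * v  # w = w - lr * v
    return w, v

# Usage: minimize f(w) = w^2, whose gradient is 2w.
w = np.array([1.0])
v = np.zeros_like(w)
for _ in range(200):
    g = 2.0 * w
    w, v = sgdm_step(w, g, v, lr=0.1, m=0.9)
```

Note that the velocity v must be stored between steps alongside the weights; with momentum, each update blends the current gradient with an exponentially decaying sum of past gradients.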

learning_rate: A positive float for the learning rate; defaults to 0.01.
momentum: A float between 0.0 and 1.0 for momentum.