Returns a tff.learning.optimizers.Optimizer for RMSprop.

The RMSprop optimizer is based on Tieleman and Hinton, 2012.

The update rule given learning rate lr, epsilon eps, decay d, preconditioner s, weights w and gradients g is:

s = d * s + (1 - d) * g**2
w = w - lr * g / (sqrt(s) + eps)
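
As a rough illustration, the update rule above can be written out as a single step in plain Python. This is a minimal sketch, not the library's implementation: the function name `rmsprop_step`, the default values for `d` and `eps`, and the zero initialization of `s` are all assumptions made for the example.

```python
import math


def rmsprop_step(w, g, s, lr=0.01, d=0.9, eps=1e-7):
    """Apply one RMSprop update to lists of floats.

    Hypothetical helper for illustration only. The defaults for d and eps
    are assumed values, not taken from the documentation above.
    Returns the updated (weights, preconditioner) pair.
    """
    # s = d * s + (1 - d) * g**2
    new_s = [d * si + (1 - d) * gi ** 2 for si, gi in zip(s, g)]
    # w = w - lr * g / (sqrt(s) + eps), using the freshly updated s
    new_w = [
        wi - lr * gi / (math.sqrt(si) + eps)
        for wi, gi, si in zip(w, g, new_s)
    ]
    return new_w, new_s


# One step from an assumed zero-initialized preconditioner.
w = [1.0, -2.0]
s = [0.0, 0.0]
g = [0.5, -0.5]
w, s = rmsprop_step(w, g, s)
```

With a zero-initialized `s`, the first step has magnitude close to `lr / sqrt(1 - d)` for every coordinate, regardless of the gradient's scale; only its sign matters.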

learning_rate: A positive float for the learning rate; defaults to 0.01.
decay: A float between 0.0 and 1.0; the decay rate of the running average s that tracks the magnitude of previous gradients.
epsilon: A small non-negative float added to sqrt(s) in the denominator to maintain numerical stability.