Builds an Adafactor optimizer.

An implementation of the Adafactor optimizer described in Shazeer & Stern, "Adafactor: Adaptive Learning Rates with Sublinear Memory Cost" (2018).

Args:
  learning_rate: Initial value of the learning rate.
  beta_2_decay: The decay rate of `beta_2`.
  epsilon_1: A small offset to keep the denominator away from zero.
  epsilon_2: A small offset to avoid the learning rate becoming too small over time.
  clip_threshold: The clipping threshold of the Adafactor algorithm.
  relative_step: If `True`, the learning rate is adjusted based on the number of iterations. This is the default Adafactor learning rate decay.

Returns:
  A `tff.learning.optimizers.Optimizer` that implements the Adafactor optimizer.
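The parameters above map onto the core Adafactor update. As a rough, non-authoritative sketch (plain NumPy, not the TFF implementation; `adafactor_update` and its factored accumulators `r` and `c` are illustrative names), one step for a 2-D parameter looks like:

```python
import numpy as np

def adafactor_update(param, grad, r, c, step,
                     beta_2_decay=-0.8, epsilon_1=1e-30,
                     epsilon_2=1e-3, clip_threshold=1.0,
                     relative_step=True, learning_rate=1.0):
    """One illustrative Adafactor step for a 2-D parameter.

    r and c are factored row/column accumulators of the squared
    gradient, costing O(rows + cols) memory instead of the
    O(rows * cols) a full second-moment matrix would need.
    """
    # beta_2 grows toward 1 with the step count: 1 - t**beta_2_decay.
    beta_2 = 1.0 - step ** beta_2_decay
    # epsilon_1 keeps the squared gradient (hence the denominator) nonzero.
    g2 = grad ** 2 + epsilon_1
    r = beta_2 * r + (1.0 - beta_2) * g2.mean(axis=1)  # row accumulator
    c = beta_2 * c + (1.0 - beta_2) * g2.mean(axis=0)  # column accumulator
    # Reconstruct the factored second-moment estimate: outer(r, c) / mean(r).
    v = np.outer(r, c) / r.mean()
    update = grad / np.sqrt(v)
    # Clip the update by its RMS, controlled by clip_threshold.
    rms = np.sqrt((update ** 2).mean())
    update = update / max(1.0, rms / clip_threshold)
    if relative_step:
        # Iteration-based learning rate decay; epsilon_2 bounds it below.
        lr = min(1.0 / np.sqrt(step), 1e-2) * max(epsilon_2,
                                                  np.sqrt((param ** 2).mean()))
    else:
        lr = learning_rate
    return param - lr * update, r, c
```

This is only a sketch of the technique under the default hyperparameters; the actual optimizer returned by this builder manages its state through the standard `tff.learning.optimizers.Optimizer` `initialize`/`next` interface.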