Optimization parameters for Ftrl with TPU embeddings.
```python
tf.compat.v1.tpu.experimental.FtrlParameters(
    learning_rate, learning_rate_power=-0.5, initial_accumulator_value=0.1,
    l1_regularization_strength=0.0, l2_regularization_strength=0.0,
    use_gradient_accumulation=True, clip_weight_min=None, clip_weight_max=None,
    weight_decay_factor=None, multiply_weight_decay_factor_by_learning_rate=None
)
```
Pass this to `tf.estimator.tpu.experimental.EmbeddingConfigSpec` via the `optimization_parameters` argument to set the optimizer and its parameters. See the documentation for `tf.estimator.tpu.experimental.EmbeddingConfigSpec` for more details.
```python
estimator = tf.estimator.tpu.TPUEstimator(
    ...
    embedding_config_spec=tf.estimator.tpu.experimental.EmbeddingConfigSpec(
        ...
        optimization_parameters=tf.tpu.experimental.FtrlParameters(0.1),
        ...))
```
| Args | |
|---|---|
| `learning_rate` | A floating point value. The learning rate. |
| `learning_rate_power` | A float value; must be less than or equal to zero. Controls how the learning rate decreases during training. Use zero for a fixed learning rate. See section 3.1 in the FTRL paper. |
| `initial_accumulator_value` | The starting value for accumulators. Only zero or positive values are allowed. |
| `l1_regularization_strength` | A float value; must be greater than or equal to zero. |
| `l2_regularization_strength` | A float value; must be greater than or equal to zero. |
| `use_gradient_accumulation` | Setting this to `False` makes embedding gradient calculation less accurate but faster. See `optimization_parameters.proto` for details. |
| `clip_weight_min` | The minimum value to clip by; `None` means -infinity. |
| `clip_weight_max` | The maximum value to clip by; `None` means +infinity. |
| `weight_decay_factor` | Amount of weight decay to apply; `None` means the weights are not decayed. |
| `multiply_weight_decay_factor_by_learning_rate` | If `True`, `weight_decay_factor` is multiplied by the current learning rate. |
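As a sketch of a fuller configuration, the snippet below constructs `FtrlParameters` with regularization and weight clipping enabled, using the signature documented above. The numeric values are illustrative placeholders, not tuned recommendations.

```python
import tensorflow.compat.v1 as tf

# A minimal sketch: all numeric values below are illustrative placeholders.
ftrl_params = tf.tpu.experimental.FtrlParameters(
    learning_rate=0.1,
    learning_rate_power=-0.5,          # must be <= 0; zero fixes the learning rate
    initial_accumulator_value=0.1,     # must be >= 0
    l1_regularization_strength=0.001,  # must be >= 0
    l2_regularization_strength=0.001,  # must be >= 0
    clip_weight_min=-10.0,             # None means -infinity
    clip_weight_max=10.0,              # None means +infinity
)

# Pass it to the embedding config as in the estimator example above:
# optimization_parameters=ftrl_params
```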