Configuration for Adafactor optimizer.

Inherits From: BaseOptimizerConfig, Config, ParamsDict

The attributes of this class match the arguments of the Adafactor implementation.


default_params Dataclass field
restrictions Dataclass field
clipnorm Dataclass field
clipvalue Dataclass field
global_clipnorm Dataclass field
name Dataclass field
factored Dataclass field
multiply_by_parameter_scale Dataclass field
beta1 Dataclass field
decay_rate Dataclass field
step_offset Dataclass field
clipping_threshold Dataclass field
min_dim_size_to_factor Dataclass field
epsilon1 Dataclass field
epsilon2 Dataclass field
weight_decay Dataclass field
include_in_weight_decay Dataclass field




Returns a dict representation of params_dict.ParamsDict.

For the nested params_dict.ParamsDict, a nested dict will be returned.



Builds a config from the given list of arguments.



Wrapper for from_yaml.



Accesses through built-in dictionary get method.



Makes the ParamsDict immutable.



Override the ParamsDict with a set of given params.

override_params a dict or a ParamsDict specifying the parameters to be overridden.
is_strict a boolean specifying whether override is strict or not. If True, keys in override_params must be present in the ParamsDict. If False, keys in override_params can be different from what is currently defined in the ParamsDict. In this case, the ParamsDict will be extended to include the new keys.
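
The strict and non-strict behaviours described above can be illustrated on plain nested dicts. This is a minimal sketch, not the library code: the `override` helper below is hypothetical, and unlike the method documented here it returns a modified copy rather than changing the object in place.

```python
import copy


def override(params, override_params, is_strict=True):
    """Return a copy of `params` with `override_params` merged in.

    Hypothetical helper that mimics the strict / non-strict behaviour
    on plain nested dicts; it is not the ParamsDict implementation.
    """
    result = copy.deepcopy(params)
    for key, value in override_params.items():
        if key not in result:
            if is_strict:
                # Strict mode: unknown keys are rejected.
                raise KeyError(f"Key {key!r} is not defined in params.")
            # Non-strict mode: the params are extended with the new key.
            result[key] = copy.deepcopy(value)
        elif isinstance(result[key], dict) and isinstance(value, dict):
            # Recurse so nested sections are merged instead of replaced.
            result[key] = override(result[key], value, is_strict)
        else:
            result[key] = copy.deepcopy(value)
    return result


defaults = {"decay_rate": 0.8, "factored": True}
tuned = override(defaults, {"decay_rate": 0.9})
extended = override(defaults, {"my_flag": 1}, is_strict=False)
```

Here `my_flag` is a made-up key used only to show that non-strict mode accepts keys that were not previously defined, while the same call with `is_strict=True` raises `KeyError`.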



Applies the overrides and returns an unlocked copy, leaving the current config unchanged.



Validates the consistency of the parameters against the restrictions.

This method validates internal consistency using the pre-defined list of restrictions. A restriction is a string that specifies a binary operation. The supported binary operations are {'==', '!=', '<', '<=', '>', '>='}. The meaning of these operators is consistent with the underlying Python implementation. Users should make sure the restrictions they define make sense for the types involved.

For example, for a ParamsDict like the following

  a:
    a1: 1
    a2: 2
  b:
    bb:
      bb1: 10
      bb2: 20
    ccc:
      a1: 1
      a3: 3

one can define two restrictions like this: ['a.a1 == b.ccc.a1', 'a.a2 <= b.bb.bb2']

These enforce:

  • a.a1 = 1 == b.ccc.a1 = 1
  • a.a2 = 2 <= b.bb.bb2 = 20

KeyError if any of the following happens: (1) a parameter named in a restriction is not defined in the ParamsDict, (2) an inconsistency violating a restriction is found.
ValueError if the operation specified in a restriction string is not supported.
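
The restriction check described above can be sketched on a plain nested dict. This is a simplified illustration, not the library implementation: `validate`, `_lookup`, and `_OPS` are hypothetical names, and the nested layout of `params` (including the `b.bb.bb2` key) is an assumed reconstruction of the example.

```python
import operator

# Two-character operators come first so '<=' is not mistaken for '<'.
_OPS = [
    ("==", operator.eq),
    ("!=", operator.ne),
    ("<=", operator.le),
    (">=", operator.ge),
    ("<", operator.lt),
    (">", operator.gt),
]


def _lookup(params, dotted_key):
    """Resolve a dotted key such as 'b.ccc.a1' in a nested dict."""
    node = params
    for part in dotted_key.split("."):
        node = node[part]  # Raises KeyError if the key is not defined.
    return node


def validate(params, restrictions):
    """Check every restriction string against the nested dict."""
    for restriction in restrictions:
        for symbol, compare in _OPS:
            if symbol in restriction:
                left, right = (s.strip() for s in restriction.split(symbol, 1))
                break
        else:
            raise ValueError(f"Unsupported restriction: {restriction!r}")
        if not compare(_lookup(params, left), _lookup(params, right)):
            raise KeyError(f"Restriction violated: {restriction!r}")


params = {
    "a": {"a1": 1, "a2": 2},
    "b": {"bb": {"bb1": 10, "bb2": 20}, "ccc": {"a1": 1, "a3": 3}},
}
validate(params, ["a.a1 == b.ccc.a1", "a.a2 <= b.bb.bb2"])  # Passes silently.
```

The sketch follows the error contract stated above: an undefined key or a violated restriction raises `KeyError`, and a string without a supported operator raises `ValueError`.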



Implements the membership test operator.


IMMUTABLE_TYPES (<class 'str'>, <class 'int'>, <class 'float'>, <class 'bool'>, <class 'NoneType'>)
RESERVED_ATTR ['_locked', '_restrictions']
SEQUENCE_TYPES (<class 'list'>, <class 'tuple'>)
beta1 None
clipnorm None
clipping_threshold 1.0
clipvalue None
decay_rate 0.8
default_params None
epsilon1 1e-30
epsilon2 0.001
factored True
global_clipnorm None
include_in_weight_decay None
min_dim_size_to_factor 128
multiply_by_parameter_scale True
name 'Adafactor'
restrictions None
step_offset 0
weight_decay None
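
To make the defaults listed above easier to experiment with, they can be mirrored in a plain dataclass. This is a hypothetical stand-in, not the real config class: the name `AdafactorDefaults` and all type annotations are assumptions, and the inherited `default_params` and `restrictions` fields are omitted.

```python
import dataclasses
from typing import Optional


@dataclasses.dataclass
class AdafactorDefaults:
    """Hypothetical stand-in mirroring the documented default values."""

    name: str = "Adafactor"
    factored: bool = True
    multiply_by_parameter_scale: bool = True
    beta1: Optional[float] = None
    decay_rate: float = 0.8
    step_offset: int = 0
    clipping_threshold: float = 1.0
    min_dim_size_to_factor: int = 128
    epsilon1: float = 1e-30
    epsilon2: float = 0.001
    weight_decay: Optional[float] = None
    include_in_weight_decay: Optional[str] = None  # Annotation is assumed.
    clipnorm: Optional[float] = None
    clipvalue: Optional[float] = None
    global_clipnorm: Optional[float] = None


base = AdafactorDefaults()
# Change two fields while leaving `base` untouched.
tuned = dataclasses.replace(base, decay_rate=0.9, beta1=0.9)
as_plain_dict = dataclasses.asdict(tuned)
```

`dataclasses.replace` here plays a role loosely similar to returning an overridden copy of a config, and `dataclasses.asdict` gives a plain dict view of the fields.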