```
public class AdaMax<Model: Differentiable & KeyPathIterable>: Optimizer
where
  Model.TangentVector: VectorProtocol & PointwiseMultiplicative & ElementaryFunctions
    & KeyPathIterable,
  Model.TangentVector.VectorSpaceScalar == Float
```

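AdaMax optimizer: a variant of Adam that scales each update by an exponentially weighted infinity norm of the gradients instead of a second-moment estimate.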

Reference: Section 7 of “Adam: A Method for Stochastic Optimization” (Kingma and Ba)
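
For orientation, the AdaMax update from the referenced paper can be written with this class's property names: $\alpha$ is `learningRate`, $\beta_1$ is `beta1`, $\beta_2$ is `beta2`, $m_t$ is `firstMoments`, $u_t$ is `infinityNorm`, $t$ is `step`, and $g_t$ is the gradient passed to `update(_:along:)`. The $\epsilon$ term does not appear in the paper's algorithm; it is included here on the assumption that `epsilon` is added to $u_t$, as its description suggests. Learning rate decay is left out of this sketch.

```latex
\begin{aligned}
m_t &= \beta_1 m_{t-1} + (1 - \beta_1)\, g_t \\
u_t &= \max\left(\beta_2 u_{t-1},\ |g_t|\right) \\
\theta_t &= \theta_{t-1} - \frac{\alpha}{1 - \beta_1^{t}} \cdot \frac{m_t}{u_t + \epsilon}
\end{aligned}
```

All operations are element-wise, so each weight gets its own $m_t$ and $u_t$ entry.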

• ``` Model ```

#### Declaration

``public typealias Model = Model``
• ``` learningRate ```

The learning rate.

#### Declaration

``public var learningRate: Float``
• ``` beta1 ```

Decay rate used to estimate the first moment (mean) of gradients.

#### Declaration

``public var beta1: Float``
• ``` beta2 ```

Decay rate used to estimate the exponentially weighted infinity norm.

#### Declaration

``public var beta2: Float``
• ``` epsilon ```

A small scalar added to the denominator to improve numerical stability.

#### Declaration

``public var epsilon: Float``
• ``` decay ```

The learning rate decay.

#### Declaration

``public var decay: Float``
• ``` step ```

The number of update steps taken so far.

#### Declaration

``public var step: Int``
• ``` firstMoments ```

The running first-moment (mean) estimate of the gradients, with one entry per weight.

#### Declaration

``public var firstMoments: Model.TangentVector``
• ``` infinityNorm ```

The exponentially weighted infinity norm of the gradients, with one entry per weight.

#### Declaration

``public var infinityNorm: Model.TangentVector``
• ``` init(for:learningRate:beta1:beta2:epsilon:decay:) ```

Note: The default values of `learningRate`, `beta1`, and `beta2` follow those suggested in the paper.

#### Declaration

```
public init(
  for model: __shared Model,
  learningRate: Float = 0.002,
  beta1: Float = 0.9,
  beta2: Float = 0.999,
  epsilon: Float = 1e-8,
  decay: Float = 0
)
```
• ``` update(_:along:) ```

#### Declaration

``public func update(_ model: inout Model, along direction: Model.TangentVector)``
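
To make the update step concrete, here is a minimal standalone sketch of one AdaMax step on a plain `[Float]` array. It is not the library's implementation: `AdaMaxSketch` and `parameterCount` are hypothetical names, the array stands in for `Model.TangentVector`, and the inverse-time form of the learning rate decay is an assumption, since this page does not specify how `decay` is applied.

```swift
import Foundation

/// Hypothetical stand-in for the optimizer state; not the library type.
struct AdaMaxSketch {
    var learningRate: Float = 0.002
    var beta1: Float = 0.9
    var beta2: Float = 0.999
    var epsilon: Float = 1e-8
    var decay: Float = 0
    var step: Int = 0
    var firstMoments: [Float]
    var infinityNorm: [Float]

    init(parameterCount: Int) {
        firstMoments = Array(repeating: 0, count: parameterCount)
        infinityNorm = Array(repeating: 0, count: parameterCount)
    }

    mutating func update(_ parameters: inout [Float], along direction: [Float]) {
        precondition(parameters.count == direction.count)
        precondition(parameters.count == firstMoments.count)
        step += 1
        let t = Float(step)
        // Assumption: inverse-time decay of the learning rate.
        let rate = learningRate / (1 + decay * t)
        // Bias correction for the first-moment estimate.
        let stepSize = rate / (1 - pow(beta1, t))
        for i in parameters.indices {
            // Exponential moving average of the gradient.
            firstMoments[i] = beta1 * firstMoments[i] + (1 - beta1) * direction[i]
            // Exponentially weighted infinity norm of the gradient.
            infinityNorm[i] = max(beta2 * infinityNorm[i], abs(direction[i]))
            // Assumption: epsilon is added to the denominator for stability.
            parameters[i] -= stepSize * firstMoments[i] / (infinityNorm[i] + epsilon)
        }
    }
}

// Usage: one step on two parameters with a fixed gradient.
var parameters: [Float] = [1.0, -2.0]
var optimizer = AdaMaxSketch(parameterCount: parameters.count)
optimizer.update(&parameters, along: [0.5, -0.25])
```

With zero-initialized state, the first step moves each parameter by roughly `learningRate` against the sign of its gradient, regardless of the gradient's magnitude.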
• ``` init(copying:to:) ```

#### Declaration

``public required init(copying other: AdaMax, to device: Device)``