tfp.optimizer.proximal_hessian_sparse_minimize

本页内容
Args
Returns

Minimize using Hessian-informed proximal gradient descent.

tfp.optimizer.proximal_hessian_sparse_minimize(
    grad_and_hessian_loss_fn,
    x_start,
    tolerance,
    l1_regularizer,
    l2_regularizer=None,
    maximum_iterations=1,
    maximum_full_sweeps_per_iteration=1,
    learning_rate=None,
    name=None
)

This function solves the regularized minimization problem

argmin{ Loss(x)
          + l1_regularizer * ||x||_1
          + l2_regularizer * ||x||_2**2
        : x in R^n }

where Loss is a convex C^2 function (typically, Loss is the negative log likelihood of a model and x is a vector of model coefficients). The Loss function does not need to be supplied directly, but this optimizer does need a way to compute the gradient and Hessian of the Loss function at a given value of x. The gradient and Hessian are often computationally expensive, and this optimizer calls them relatively few times compared with other algorithms.

Args
`grad_and_hessian_loss_fn`	callable that takes as input a (batch of) `Tensor` of the same shape and dtype as `x_start` and returns the triple `(gradient_unregularized_loss, hessian_unregularized_loss_outer, hessian_unregularized_loss_middle)` as defined in the argument spec of `minimize_one_step`.
`x_start`	(Batch of) vector-shaped, `float` `Tensor` representing the initial value of the argument to the `Loss` function.
`tolerance`	scalar, `float` `Tensor` representing the tolerance for each optimization step; see the `tolerance` argument of `minimize_one_step`.
`l1_regularizer`	scalar, `float` `Tensor` representing the weight of the L1 regularization term (see equation above).
`l2_regularizer`	scalar, `float` `Tensor` representing the weight of the L2 regularization term (see equation above). Default value: `None` (i.e., no L2 regularization).
`maximum_iterations`	Python integer specifying the maximum number of iterations of the outer loop of the optimizer. After this many iterations of the outer loop, the algorithm will terminate even if the return value `optimal_x` has not converged. Default value: `1`.
`maximum_full_sweeps_per_iteration`	Python integer specifying the maximum number of sweeps allowed in each iteration of the outer loop of the optimizer. Passed as the `maximum_full_sweeps` argument to `minimize_one_step`. Default value: `1`.
`learning_rate`	scalar, `float` `Tensor` representing a multiplicative factor used to dampen the proximal gradient descent steps. Default value: `None` (i.e., factor is conceptually `1`).
`name`	Python string representing the name of the TensorFlow operation. The default name is `"minimize"`.

Returns
`x`	`Tensor` of the same shape and dtype as `x_start`, representing the (batches of) computed values of `x` which minimizes `Loss(x)`.
`is_converged`	scalar, `bool` `Tensor` indicating whether the minimization procedure converged within the specified number of iterations across all batches. Here convergence means that an iteration of the inner loop (`minimize_one_step`) returns `True` for its `is_converged` output value.
`iter`	scalar, `int` `Tensor` indicating the actual number of iterations of the outer loop of the optimizer completed (i.e., number of calls to `minimize_one_step` before achieving convergence).

References

[1]: Jerome Friedman, Trevor Hastie and Rob Tibshirani. Regularization Paths for Generalized Linear Models via Coordinate Descent. Journal of Statistical Software, 33(1), 2010. https://www.jstatsoft.org/article/view/v033i01/v33i01.pdf

[2]: Guo-Xun Yuan, Chia-Hua Ho and Chih-Jen Lin. An Improved GLMNET for L1-regularized Logistic Regression. Journal of Machine Learning Research, 13, 2012. http://www.jmlr.org/papers/volume13/yuan12a/yuan12a.pdf

tfp.optimizer.proximal_hessian_sparse_minimize

Args

Returns

References