Anderson Acceleration in Nonsmooth Problems: Local Convergence via Active Manifold Identification
- URL: http://arxiv.org/abs/2410.09420v2
- Date: Tue, 15 Oct 2024 09:16:28 GMT
- Title: Anderson Acceleration in Nonsmooth Problems: Local Convergence via Active Manifold Identification
- Authors: Kexin Li, Luwei Bai, Xiao Wang, Hao Wang,
- Abstract summary: We investigate a class of nonsmooth optimization algorithms characterized by the active manifold identification property.
Under the assumption that the optimization problem possesses an active manifold at a stationary point, we establish a local R-linear convergence rate for the Anderson-accelerated algorithm.
- Score: 11.495346367359037
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Anderson acceleration is an effective technique for enhancing the efficiency of fixed-point iterations; however, analyzing its convergence in nonsmooth settings presents significant challenges. In this paper, we investigate a class of nonsmooth optimization algorithms characterized by the active manifold identification property. This class includes a diverse array of methods such as the proximal point method, proximal gradient method, proximal linear method, proximal coordinate descent method, Douglas-Rachford splitting (or the alternating direction method of multipliers), and the iteratively reweighted $\ell_1$ method, among others. Under the assumption that the optimization problem possesses an active manifold at a stationary point, we establish a local R-linear convergence rate for the Anderson-accelerated algorithm. Our extensive numerical experiments further highlight the robust performance of the proposed Anderson-accelerated methods.
Related papers
- Dynamic Anisotropic Smoothing for Noisy Derivative-Free Optimization [0.0]
We propose a novel algorithm that extends the methods of ball smoothing and Gaussian smoothing for noisy derivative-free optimization.
The algorithm dynamically adapts the shape of the smoothing kernel to approximate the Hessian of the objective function around a local optimum.
arXiv Detail & Related papers (2024-05-02T21:04:20Z) - Adaptive Federated Learning Over the Air [108.62635460744109]
We propose a federated version of adaptive gradient methods, particularly AdaGrad and Adam, within the framework of over-the-air model training.
Our analysis shows that the AdaGrad-based training algorithm converges to a stationary point at the rate of $mathcalO( ln(T) / T 1 - frac1alpha ).
arXiv Detail & Related papers (2024-03-11T09:10:37Z) - Incremental Quasi-Newton Methods with Faster Superlinear Convergence
Rates [50.36933471975506]
We consider the finite-sum optimization problem, where each component function is strongly convex and has Lipschitz continuous gradient and Hessian.
The recently proposed incremental quasi-Newton method is based on BFGS update and achieves a local superlinear convergence rate.
This paper proposes a more efficient quasi-Newton method by incorporating the symmetric rank-1 update into the incremental framework.
arXiv Detail & Related papers (2024-02-04T05:54:51Z) - ELRA: Exponential learning rate adaption gradient descent optimization
method [83.88591755871734]
We present a novel, fast (exponential rate), ab initio (hyper-free) gradient based adaption.
The main idea of the method is to adapt the $alpha by situational awareness.
It can be applied to problems of any dimensions n and scales only linearly.
arXiv Detail & Related papers (2023-09-12T14:36:13Z) - Linearization Algorithms for Fully Composite Optimization [61.20539085730636]
This paper studies first-order algorithms for solving fully composite optimization problems convex compact sets.
We leverage the structure of the objective by handling differentiable and non-differentiable separately, linearizing only the smooth parts.
arXiv Detail & Related papers (2023-02-24T18:41:48Z) - Nystrom Method for Accurate and Scalable Implicit Differentiation [25.29277451838466]
We show that the Nystrom method consistently achieves comparable or even superior performance to other approaches.
The proposed method avoids numerical instability and can be efficiently computed in matrix operations without iterations.
arXiv Detail & Related papers (2023-02-20T02:37:26Z) - The Curse of Unrolling: Rate of Differentiating Through Optimization [35.327233435055305]
Un differentiation approximates the solution using an iterative solver and differentiates it through the computational path.
We show that we can either 1) choose a large learning rate leading to a fast convergence but accept that the algorithm may have an arbitrarily long burn-in phase or 2) choose a smaller learning rate leading to an immediate but slower convergence.
arXiv Detail & Related papers (2022-09-27T09:27:29Z) - Subgradient methods near active manifolds: saddle point avoidance, local
convergence, and asymptotic normality [4.709588811674973]
We show that aiming and subgradient approximation fully expose the smooth substructure of the problem.
We prove these properties hold for a wide class of problems, including cone reducible/decomposable functions and generic semialgebraic problems.
The normality results appear to be new even in the most classical setting.
arXiv Detail & Related papers (2021-08-26T15:02:16Z) - Converting ADMM to a Proximal Gradient for Convex Optimization Problems [4.56877715768796]
In sparse estimation, such as fused lasso and convex clustering, we apply either the proximal gradient method or the alternating direction method of multipliers (ADMM) to solve the problem.
This paper proposes a general method for converting the ADMM solution to the proximal gradient method, assuming that the constraints and objectives are strongly convex.
We show by numerical experiments that we can obtain a significant improvement in terms of efficiency.
arXiv Detail & Related papers (2021-04-22T07:41:12Z) - IDEAL: Inexact DEcentralized Accelerated Augmented Lagrangian Method [64.15649345392822]
We introduce a framework for designing primal methods under the decentralized optimization setting where local functions are smooth and strongly convex.
Our approach consists of approximately solving a sequence of sub-problems induced by the accelerated augmented Lagrangian method.
When coupled with accelerated gradient descent, our framework yields a novel primal algorithm whose convergence rate is optimal and matched by recently derived lower bounds.
arXiv Detail & Related papers (2020-06-11T18:49:06Z) - Optimization of Graph Total Variation via Active-Set-based Combinatorial
Reconditioning [48.42916680063503]
We propose a novel adaptive preconditioning strategy for proximal algorithms on this problem class.
We show that nested-forest decomposition of the inactive edges yields a guaranteed local linear convergence rate.
Our results suggest that local convergence analysis can serve as a guideline for selecting variable metrics in proximal algorithms.
arXiv Detail & Related papers (2020-02-27T16:33:09Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.