Restarts subject to approximate sharpness: A parameter-free and optimal scheme for first-order methods
- URL: http://arxiv.org/abs/2301.02268v2
- Date: Mon, 22 Jul 2024 19:29:18 GMT
- Title: Restarts subject to approximate sharpness: A parameter-free and optimal scheme for first-order methods
- Authors: Ben Adcock, Matthew J. Colbrook, Maksym Neyra-Nesterenko
- Abstract summary: Sharpness is an assumption in continuous optimization that bounds the distance from minima by objective function suboptimality.
Sharpness involves problem-specific constants that are typically unknown, and restart schemes generally reduce convergence rates.
We consider the assumption of approximate sharpness, a generalization of sharpness that incorporates an unknown constant perturbation to the objective function error.
- Score: 0.6554326244334866
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Sharpness is an almost generic assumption in continuous optimization that bounds the distance from minima by objective function suboptimality. It facilitates the acceleration of first-order methods through restarts. However, sharpness involves problem-specific constants that are typically unknown, and restart schemes typically reduce convergence rates. Moreover, these schemes are challenging to apply in the presence of noise or with approximate model classes (e.g., in compressive imaging or learning problems), and they generally assume that the first-order method used produces feasible iterates. We consider the assumption of approximate sharpness, a generalization of sharpness that incorporates an unknown constant perturbation to the objective function error. This constant offers greater robustness (e.g., with respect to noise or relaxation of model classes) for finding approximate minimizers. By employing a new type of search over the unknown constants, we design a restart scheme that applies to general first-order methods and does not require the first-order method to produce feasible iterates. Our scheme maintains the same convergence rate as when the constants are known. The convergence rates we achieve for various first-order methods match the optimal rates or improve on previously established rates for a wide range of problems. We showcase our restart scheme in several examples and highlight potential future applications and developments of our framework and theory.
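As an illustration of the restart-plus-search idea in the abstract, the following is a minimal Python sketch of a restart loop that searches over candidate values of the unknown sharpness-type constants; the constant names, the budget heuristic, and the inner-solver interface are illustrative assumptions, not the paper's actual scheme.

```python
import math

def restart_with_constant_search(run_inner, x0, f, total_budget,
                                 alpha_grid, beta_grid, eps_target=1e-3):
    """Hypothetical restart loop with a search over unknown sharpness-type
    constants (alpha, beta).  For each candidate pair, allocate an inner
    iteration budget, restart the inner first-order method from the best
    iterate found so far, and keep whichever iterate attains the smallest
    objective value.  All names and the budget heuristic are illustrative.

    run_inner(x, k): runs k iterations of some first-order method from x
                     and returns the final iterate.
    f(x):            evaluates the objective.
    """
    best_x, best_val = x0, f(x0)
    used = 0
    while used < total_budget:
        for alpha in alpha_grid:
            for beta in beta_grid:
                # Heuristic budget: "harder" candidates (small alpha,
                # large beta) are given more inner iterations.
                k = max(1, int(beta / (alpha * eps_target)))
                k = min(k, total_budget - used)
                x_new = run_inner(best_x, k)   # restart from the current best
                used += k
                val = f(x_new)
                if val < best_val:
                    best_x, best_val = x_new, val
                if used >= total_budget:
                    return best_x, best_val
    return best_x, best_val
```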
Related papers
- Stochastic Zeroth-Order Optimization under Strongly Convexity and Lipschitz Hessian: Minimax Sample Complexity [59.75300530380427]
We consider the problem of optimizing second-order smooth and strongly convex functions where the algorithm only has access to noisy evaluations of the objective function it queries.
We provide the first tight characterization for the rate of the minimax simple regret by developing matching upper and lower bounds.
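For orientation, a standard two-point zeroth-order gradient estimator for the noisy-evaluation setting looks as follows (a generic textbook construction, not the estimator analyzed in the cited paper):

```python
import numpy as np

def two_point_gradient_estimate(f_noisy, x, h=1e-3, num_samples=32, rng=None):
    """Estimate the gradient of f at x from noisy zeroth-order queries only.

    f_noisy(x) returns f(x) plus zero-mean noise.  The estimator averages
    random two-point finite differences along Gaussian directions."""
    rng = np.random.default_rng() if rng is None else rng
    d = x.shape[0]
    g = np.zeros(d)
    for _ in range(num_samples):
        u = rng.standard_normal(d)
        g += (f_noisy(x + h * u) - f_noisy(x - h * u)) / (2.0 * h) * u
    return g / num_samples
```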
arXiv Detail & Related papers (2024-06-28T02:56:22Z) - Last Iterate Convergence of Incremental Methods and Applications in Continual Learning [10.811735264028348]
Motivated by applications in continual learning, we obtain convergence guarantees for the last iterate of both incremental gradient and incremental proximal methods.
We study incremental proximal methods as a model of continual learning with generalization and argue that a large amount of regularization is crucial to preventing catastrophic forgetting.
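A minimal sketch of an incremental pass with an explicit l2 regularizer, to make the role of heavy regularization concrete; the forward-backward form and the choice of an l2 penalty are illustrative assumptions, not details from the cited paper:

```python
import numpy as np

def incremental_proximal_gradient_pass(grad_fi, x0, step, lam, n_tasks):
    """One pass over n_tasks component losses f_1, ..., f_n with an explicit
    l2 regularizer (lam/2) * ||x||^2.

    Each update takes a gradient step on the current component f_i and then
    applies the proximal (shrinkage) operator of the l2 penalty, whose closed
    form is x / (1 + step * lam).  A larger lam shrinks more aggressively,
    i.e. heavier regularization."""
    x = x0.copy()
    for i in range(n_tasks):
        x = x - step * grad_fi(i, x)   # forward step on component i
        x = x / (1.0 + step * lam)     # proximal step of the l2 penalty
    return x
```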
arXiv Detail & Related papers (2024-03-11T16:24:26Z) - Discretize Relaxed Solution of Spectral Clustering via a Non-Heuristic
Algorithm [77.53604156112144]
We develop a first-order term to bridge the original problem and the discretization algorithm.
Since the non-heuristic method is aware of the original graph cut problem, the final discrete solution is more reliable.
arXiv Detail & Related papers (2023-10-19T13:57:38Z) - Strictly Low Rank Constraint Optimization -- An Asymptotically
$\mathcal{O}(\frac{1}{t^2})$ Method [5.770309971945476]
We propose a class of non-convex and non-smooth problems with rank regularization to promote sparsity in the optimal solution.
We show that our algorithms achieve a convergence rate of $\mathcal{O}(\frac{1}{t^2})$ for the singular values, which is exactly the same as Nesterov's optimal convergence rate for first-order methods on smooth convex problems.
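For context on the $\mathcal{O}(\frac{1}{t^2})$ claim, the classical accelerated-gradient (FISTA-type) guarantee for an $L$-smooth convex objective is, up to variant-dependent constants,

$f(x_t) - f^{\star} \;\le\; \frac{2L\,\|x_0 - x^{\star}\|_2^2}{(t+1)^2},$

which matches the $\Omega(1/t^2)$ lower bound for first-order methods on this problem class.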
arXiv Detail & Related papers (2023-07-04T16:55:41Z) - Accelerated First-Order Optimization under Nonlinear Constraints [73.2273449996098]
We exploit analogies between first-order algorithms for constrained optimization and non-smooth dynamical systems to design a new class of accelerated first-order algorithms.
An important property of these algorithms is that constraints are expressed in terms of velocities instead of positions.
arXiv Detail & Related papers (2023-02-01T08:50:48Z) - SCORE: Approximating Curvature Information under Self-Concordant
Regularization [0.0]
We propose a generalized Gauss-Newton with Self-Concordant Regularization (GGN-SCORE) algorithm that updates the minimization speed each time it receives a new input.
The proposed algorithm exploits the structure of the second-order information in the Hessian matrix, thereby reducing computational overhead.
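As a generic illustration of a (generalized) Gauss-Newton update, not of GGN-SCORE's specific update or its self-concordant regularization:

```python
import numpy as np

def damped_gauss_newton_step(J, r, damping=1e-6):
    """Generic damped Gauss-Newton step for a least-squares objective
    0.5 * ||r(x)||^2, given the residual r and its Jacobian J at the
    current point: solve (J^T J + damping * I) dx = -J^T r.

    The generalized Gauss-Newton idea replaces J^T J by J^T H_out J for
    the Hessian H_out of an outer loss; H_out = I here for simplicity."""
    n_params = J.shape[1]
    A = J.T @ J + damping * np.eye(n_params)
    return np.linalg.solve(A, -J.T @ r)
```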
arXiv Detail & Related papers (2021-12-14T13:03:04Z) - WARPd: A linearly convergent first-order method for inverse problems
with approximate sharpness conditions [0.0]
Sharpness conditions directly control the recovery performance of restart schemes for first-order methods.
We provide a first-order method: Weighted, Accelerated and Restarted Primal-dual (WARPd).
Under a generic approximate sharpness condition, WARPd achieves stable linear convergence to the desired vector.
We show how WARPd compares favorably with specialized state-of-the-art methods and is ideally suited for solving large-scale problems.
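For orientation, a schematic form of an approximate sharpness condition (notation mine; the precise statements in the WARPd paper and the main paper may differ) is a Hölder-type error bound with an additive perturbation:

$\alpha\, d(x, \hat{X})^{\beta} \;\le\; f(x) - \hat{f} + g,$

where $d(x, \hat{X})$ is the distance from $x$ to the (approximate) solution set, $\alpha > 0$ and $\beta \ge 1$ are typically unknown constants, and $g \ge 0$ is the perturbation; $g = 0$ recovers classical sharpness.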
arXiv Detail & Related papers (2021-10-24T13:19:41Z) - On Constraints in First-Order Optimization: A View from Non-Smooth
Dynamical Systems [99.59934203759754]
We introduce a class of first-order methods for smooth constrained optimization.
A distinctive feature of our approach is that projections or optimizations over the entire feasible set are avoided.
The resulting algorithmic procedure is simple to implement even when constraints are nonlinear.
arXiv Detail & Related papers (2021-07-17T11:45:13Z) - High Probability Complexity Bounds for Non-Smooth Stochastic Optimization with Heavy-Tailed Noise [51.31435087414348]
It is essential to theoretically guarantee that algorithms provide a small objective residual with high probability.
Existing methods for non-smooth convex optimization have complexity bounds with an undesirable dependence on the confidence level.
We propose novel stepsize rules for two methods with gradient clipping.
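Gradient clipping as used in such methods has the standard form below; the specific stepsize rules proposed in the cited paper are not reproduced here:

```python
import numpy as np

def clipped_sgd_step(x, stoch_grad, stepsize, clip_level):
    """One SGD step with gradient clipping: rescale the stochastic gradient
    so its norm does not exceed clip_level, then take a gradient step.
    Clipping limits the influence of heavy-tailed gradient noise."""
    g = stoch_grad(x)
    g_norm = np.linalg.norm(g)
    if g_norm > clip_level:
        g = g * (clip_level / g_norm)
    return x - stepsize * g
```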
arXiv Detail & Related papers (2021-06-10T17:54:21Z) - Regret minimization in stochastic non-convex learning via a
proximal-gradient approach [80.59047515124198]
Motivated by applications in machine learning and operations research, we study regret minimization in online constrained problems with first-order oracle feedback.
We develop a new prox-grad method with guarantees obtained via proximal complexity reduction techniques.
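For reference, the basic prox-grad building block looks as follows; this is the textbook step, not the specific method developed in the cited paper:

```python
import numpy as np

def prox_grad_step(x, grad_f, prox_g, stepsize):
    """One proximal-gradient step for minimizing f(x) + g(x): a gradient
    step on the smooth part f followed by the proximal operator of the
    non-smooth part g."""
    return prox_g(x - stepsize * grad_f(x), stepsize)

def soft_threshold(z, t, lam=0.1):
    """Proximal operator of g(x) = lam * ||x||_1 with step t."""
    return np.sign(z) * np.maximum(np.abs(z) - t * lam, 0.0)
```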
arXiv Detail & Related papers (2020-10-13T09:22:21Z) - DeepInit Phase Retrieval [10.385009647156407]
This paper shows how data-driven deep generative models can be utilized to solve challenging phase retrieval problems.
It shows that the hybrid approach is able to deliver high-quality reconstruction results at low sampling rates.
arXiv Detail & Related papers (2020-07-16T09:39:28Z)