A Generalization Result for Convergence in Learning-to-Optimize
- URL: http://arxiv.org/abs/2410.07704v1
- Date: Thu, 10 Oct 2024 08:17:04 GMT
- Title: A Generalization Result for Convergence in Learning-to-Optimize
- Authors: Michael Sucker, Peter Ochs
- Abstract summary: Conventional convergence guarantees in optimization are based on geometric arguments, which cannot be applied easily to learned algorithms.
We develop a probabilistic framework that resembles deterministic optimization and allows for transferring geometric arguments into learning-to-optimize.
To the best of our knowledge, we are the first to prove convergence of optimization algorithms in such a probabilistic framework.
- Score: 4.112909937203119
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Convergence in learning-to-optimize is hardly studied, because conventional convergence guarantees in optimization are based on geometric arguments, which cannot be applied easily to learned algorithms. Thus, we develop a probabilistic framework that resembles deterministic optimization and allows for transferring geometric arguments into learning-to-optimize. Our main theorem is a generalization result for parametric classes of potentially non-smooth, non-convex loss functions and establishes the convergence of learned optimization algorithms to stationary points with high probability. This can be seen as a statistical counterpart to the use of geometric safeguards to ensure convergence. To the best of our knowledge, we are the first to prove convergence of optimization algorithms in such a probabilistic framework.
Related papers
- Learning-to-Optimize with PAC-Bayesian Guarantees: Theoretical Considerations and Practical Implementation [4.239829789304117]
We use the PAC-Bayesian theory for the setting of learning-to-optimize.
We present the first framework to learn optimization algorithms with provable generalization guarantees.
Our learned algorithms provably outperform related ones derived from a (deterministic) worst-case analysis.
arXiv Detail & Related papers (2024-04-04T08:24:57Z)
- Linearization Algorithms for Fully Composite Optimization [61.20539085730636]
This paper studies first-order algorithms for solving fully composite optimization problems over convex compact sets.
We leverage the structure of the objective by handling its differentiable and non-differentiable components separately, linearizing only the smooth parts.
arXiv Detail & Related papers (2023-02-24T18:41:48Z)
- Asymptotic convergence of iterative optimization algorithms [1.6328866317851185]
This paper introduces a general framework for iterative optimization algorithms.
We prove that under appropriate assumptions, the rate of convergence can be lower bounded.
We provide the exact convergence rate.
arXiv Detail & Related papers (2023-02-24T09:58:56Z)
- Multivariate Systemic Risk Measures and Computation by Deep Learning Algorithms [63.03966552670014]
We discuss the key related theoretical aspects, with a particular focus on the fairness properties of primal optima and associated risk allocations.
The algorithms we provide allow for learning primals, optima for the dual representation and corresponding fair risk allocations.
arXiv Detail & Related papers (2023-02-02T22:16:49Z)
- PAC-Bayesian Learning of Optimization Algorithms [6.624726878647541]
We apply the PAC-Bayes theory to the setting of learning-to-optimize.
We learn optimization algorithms with provable generalization guarantees (PAC-bounds) and explicit trade-off between a high probability of convergence and a high convergence speed.
Our results rely on PAC-Bayes bounds for general, unbounded loss-functions based on exponential families.
arXiv Detail & Related papers (2022-10-20T09:16:36Z)
- Non-Convex Optimization with Certificates and Fast Rates Through Kernel Sums of Squares [68.8204255655161]
We consider potentially non-convex optimization problems.
In this paper, we propose an algorithm that achieves close to optimal a priori computational guarantees.
arXiv Detail & Related papers (2022-04-11T09:37:04Z)
- Recent Theoretical Advances in Non-Convex Optimization [56.88981258425256]
Motivated by recent increased interest in the analysis of optimization algorithms for non-convex optimization in deep networks and other problems in data analysis, we give an overview of recent theoretical results on optimization algorithms for non-convex optimization.
arXiv Detail & Related papers (2020-12-11T08:28:51Z)
- Bilevel Optimization: Convergence Analysis and Enhanced Design [63.64636047748605]
Bilevel optimization is a tool for many machine learning problems.
We propose a novel algorithm named stocBiO, which features a sample-efficient hypergradient estimator.
arXiv Detail & Related papers (2020-10-15T18:09:48Z)
- A Dynamical Systems Approach for Convergence of the Bayesian EM Algorithm [59.99439951055238]
We show how (discrete-time) Lyapunov stability theory can serve as a powerful tool to aid, or even lead, in the analysis (and potential design) of optimization algorithms that are not necessarily gradient-based.
The particular ML problem that this paper focuses on is that of parameter estimation in an incomplete-data Bayesian framework via the popular optimization algorithm known as maximum a posteriori expectation-maximization (MAP-EM).
We show that fast convergence (linear or quadratic) is achieved, which could have been difficult to unveil without our adopted systems and control (S&C) approach.
arXiv Detail & Related papers (2020-06-23T01:34:18Z)