PAC-Bayesian Learning of Optimization Algorithms
- URL: http://arxiv.org/abs/2210.11113v1
- Date: Thu, 20 Oct 2022 09:16:36 GMT
- Title: PAC-Bayesian Learning of Optimization Algorithms
- Authors: Michael Sucker and Peter Ochs
- Abstract summary: We apply the PAC-Bayes theory to the setting of learning-to-optimize.
We learn optimization algorithms with provable generalization guarantees (PAC-bounds) and explicit trade-off between a high probability of convergence and a high convergence speed.
Our results rely on PAC-Bayes bounds for general, unbounded loss-functions based on exponential families.
- Score: 6.624726878647541
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We apply the PAC-Bayes theory to the setting of learning-to-optimize. To the
best of our knowledge, we present the first framework to learn optimization
algorithms with provable generalization guarantees (PAC-bounds) and explicit
trade-off between a high probability of convergence and a high convergence
speed. Even in the limit case, where convergence is guaranteed, our learned
optimization algorithms provably outperform related algorithms based on a
(deterministic) worst-case analysis. Our results rely on PAC-Bayes bounds for
general, unbounded loss-functions based on exponential families. By
generalizing existing ideas, we reformulate the learning procedure into a
one-dimensional minimization problem and study the possibility to find a global
minimum, which enables the algorithmic realization of the learning procedure.
As a proof-of-concept, we learn hyperparameters of standard optimization
algorithms to empirically underline our theory.
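As a rough illustration of the proof-of-concept described in the abstract, the sketch below learns a single hyperparameter (the step size) of plain gradient descent over a distribution of random least-squares problems, using a Gaussian sampling distribution over the log step size. This is a minimal sketch under assumed details (problem class, one-dimensional grid search, all function names and constants are illustrative); it does not implement the paper's PAC-Bayesian bound or its exponential-family construction.

```python
# Hypothetical proof-of-concept sketch (not the authors' code): learn the
# step-size hyperparameter of plain gradient descent on a distribution of
# random least-squares problems. Names and constants here are assumptions.
import numpy as np

rng = np.random.default_rng(0)
DIM, STEPS, N_PROBLEMS, N_SAMPLES = 10, 30, 50, 20

def make_problem():
    """Draw a random least-squares problem f(x) = 0.5 * ||A x - b||^2."""
    A = rng.normal(size=(DIM, DIM)) / np.sqrt(DIM)
    b = rng.normal(size=DIM)
    return A, b

def run_gd(A, b, step, steps=STEPS):
    """Run plain gradient descent from zero and return the final objective."""
    x = np.zeros(DIM)
    for _ in range(steps):
        x = x - step * (A.T @ (A @ x - b))
    r = A @ x - b
    return 0.5 * float(r @ r)

def expected_final_loss(mu, sigma, problems):
    """Monte-Carlo estimate of the loss when log(step) ~ N(mu, sigma^2)."""
    totals = []
    for _ in range(N_SAMPLES):
        step = float(np.exp(rng.normal(mu, sigma)))
        totals.append(np.mean([run_gd(A, b, step) for A, b in problems]))
    return float(np.mean(totals))

problems = [make_problem() for _ in range(N_PROBLEMS)]
# Crude one-dimensional search over the mean of the sampling distribution;
# this only mimics the spirit of reducing learning to a one-dimensional
# problem, it is not the paper's actual reformulation.
grid = np.linspace(np.log(1e-3), np.log(1.0), 25)
best_mu = min(grid, key=lambda mu: expected_final_loss(mu, 0.1, problems))
print(f"learned log step size: {best_mu:.3f} (step ~ {np.exp(best_mu):.4f})")
```

In the paper itself, the hyperparameters are drawn from a learned distribution that comes with a PAC-Bayesian generalization guarantee and an explicit trade-off between convergence probability and convergence speed; the grid search above is only a stand-in for that learning procedure.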
Related papers
- Provably Faster Algorithms for Bilevel Optimization via Without-Replacement Sampling [96.47086913559289]
Gradient-based algorithms are widely used in bilevel optimization.
We introduce a without-replacement sampling-based algorithm that achieves a faster convergence rate.
We validate our algorithms over both synthetic and real-world applications.
arXiv Detail & Related papers (2024-11-07T17:05:31Z) - Learning via Surrogate PAC-Bayes [13.412960492870996]
PAC-Bayes learning is a comprehensive setting for studying the generalisation ability of learning algorithms.
We introduce a novel principled strategy for building an iterative learning algorithm via the optimisation of a sequence of surrogate training objectives.
On top of providing that generic recipe for learning via surrogate PAC-Bayes bounds, we contribute theoretical results establishing that iteratively optimising our surrogates implies the optimisation of the original generalisation bounds.
arXiv Detail & Related papers (2024-10-14T07:45:50Z) - A Generalization Result for Convergence in Learning-to-Optimize [4.112909937203119]
Conventional convergence guarantees in optimization are based on geometric arguments, which cannot be applied directly to learned algorithms.
To the best of our knowledge, we are the first to prove a generalization result for the convergence of learned optimization algorithms.
arXiv Detail & Related papers (2024-10-10T08:17:04Z) - Learning-to-Optimize with PAC-Bayesian Guarantees: Theoretical Considerations and Practical Implementation [4.239829789304117]
We apply PAC-Bayesian theory to the setting of learning-to-optimize.
We present the first framework to learn optimization algorithms with provable generalization guarantees.
Our learned algorithms provably outperform related ones derived from a (deterministic) worst-case analysis.
arXiv Detail & Related papers (2024-04-04T08:24:57Z) - Bayesian Design Principles for Frequentist Sequential Learning [11.421942894219901]
We develop a theory to optimize the frequentist regret for sequential learning problems.
We propose a novel optimization approach to generate "algorithmic beliefs" at each round.
We present a novel algorithm for multi-armed bandits that achieves the "best-of-all-worlds" empirical performance.
arXiv Detail & Related papers (2023-10-01T22:17:37Z) - Multivariate Systemic Risk Measures and Computation by Deep Learning
Algorithms [63.03966552670014]
We discuss the key related theoretical aspects, with a particular focus on the fairness properties of primal optima and associated risk allocations.
The algorithms we provide allow for learning primals, optima for the dual representation and corresponding fair risk allocations.
arXiv Detail & Related papers (2023-02-02T22:16:49Z) - Scalable PAC-Bayesian Meta-Learning via the PAC-Optimal Hyper-Posterior:
From Theory to Practice [54.03076395748459]
A central question in the meta-learning literature is how to regularize to ensure generalization to unseen tasks.
We present a generalization bound for meta-learning, which was first derived by Rothfuss et al.
We provide a theoretical analysis and empirical case study under which conditions and to what extent these guarantees for meta-learning improve upon PAC-Bayesian per-task learning bounds.
arXiv Detail & Related papers (2022-11-14T08:51:04Z) - Bézier Flow: a Surface-wise Gradient Descent Method for
Multi-objective Optimization [12.487037582320804]
We extend the stability of optimization algorithms in the sense of Probably Approximately Correct (PAC) learning.
We show that multi-objective optimization algorithms derived from a gradient descent-based single-objective optimization algorithm are PAC stable.
arXiv Detail & Related papers (2022-05-23T07:47:58Z) - Uniform-PAC Bounds for Reinforcement Learning with Linear Function
Approximation [92.3161051419884]
We study reinforcement learning with linear function approximation.
Existing algorithms only have high-probability regret and/or Probably Approximately Correct (PAC) sample complexity guarantees.
We propose a new algorithm called FLUTE, which enjoys uniform-PAC convergence to the optimal policy with high probability.
arXiv Detail & Related papers (2021-06-22T08:48:56Z) - Bilevel Optimization: Convergence Analysis and Enhanced Design [63.64636047748605]
Bilevel optimization is a tool for many machine learning problems.
We propose a novel sample-efficient gradient estimator named stoc-BiO.
arXiv Detail & Related papers (2020-10-15T18:09:48Z) - PACOH: Bayes-Optimal Meta-Learning with PAC-Guarantees [77.67258935234403]
We provide a theoretical analysis using the PAC-Bayesian framework and derive novel generalization bounds for meta-learning.
We develop a class of PAC-optimal meta-learning algorithms with performance guarantees and a principled meta-level regularization.
arXiv Detail & Related papers (2020-02-13T15:01:38Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of this information and is not responsible for any consequences of its use.