A unified recipe for deriving (time-uniform) PAC-Bayes bounds
- URL: http://arxiv.org/abs/2302.03421v5
- Date: Wed, 3 Jan 2024 18:32:00 GMT
- Title: A unified recipe for deriving (time-uniform) PAC-Bayes bounds
- Authors: Ben Chugg, Hongjian Wang, Aaditya Ramdas
- Abstract summary: We present a unified framework for deriving PAC-Bayesian generalization bounds.
Our bounds are anytime-valid (i.e., time-uniform), meaning that they hold at all stopping times.
- Score: 31.921092049934654
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We present a unified framework for deriving PAC-Bayesian generalization
bounds. Unlike most previous literature on this topic, our bounds are
anytime-valid (i.e., time-uniform), meaning that they hold at all stopping
times, not only for a fixed sample size. Our approach combines four tools in
the following order: (a) nonnegative supermartingales or reverse
submartingales, (b) the method of mixtures, (c) the Donsker-Varadhan formula
(or other convex duality principles), and (d) Ville's inequality. Our main
result is a PAC-Bayes theorem which holds for a wide class of discrete
stochastic processes. We show how this result implies time-uniform versions of
well-known classical PAC-Bayes bounds, such as those of Seeger, McAllester,
Maurer, and Catoni, in addition to many recent bounds. We also present several
novel bounds. Our framework also enables us to relax traditional assumptions;
in particular, we consider nonstationary loss functions and non-i.i.d. data. In
sum, we unify the derivation of past bounds and ease the search for future
bounds: one may simply check if our supermartingale or submartingale conditions
are met and, if so, be guaranteed a (time-uniform) PAC-Bayes bound.
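To make the four-step recipe concrete, here is a hedged sketch in standard notation (the symbols below are illustrative, not taken verbatim from the paper). Suppose that for each hypothesis $\theta$, the process $(M_t(\theta))_{t \ge 0}$ is a nonnegative supermartingale with $M_0(\theta) = 1$, and let $\pi$ be a prior over hypotheses.

```latex
% (b) Method of mixtures: averaging over the prior \pi preserves the
%     supermartingale property, so \bar M_t = E_{\theta \sim \pi}[M_t(\theta)]
%     is again a nonnegative supermartingale with \bar M_0 = 1.
% (d) Ville's inequality applied to \bar M_t gives a time-uniform guarantee:
\Pr\!\left( \exists t \ge 1 : \; E_{\theta \sim \pi}\big[M_t(\theta)\big] \ge 1/\delta \right) \le \delta .
% (c) The Donsker--Varadhan formula converts the prior average into a statement
%     holding simultaneously for all posteriors \rho:
\log E_{\theta \sim \pi}\big[M_t(\theta)\big]
  \;\ge\; \sup_{\rho} \Big( E_{\theta \sim \rho}\big[\log M_t(\theta)\big] - \mathrm{KL}(\rho \,\|\, \pi) \Big) .
% Combining the two displays: with probability at least 1 - \delta,
% simultaneously for all times t and all posteriors \rho,
%   E_{\theta \sim \rho}\big[\log M_t(\theta)\big]
%     \le \mathrm{KL}(\rho \,\|\, \pi) + \log(1/\delta) .
```

Instantiating $M_t(\theta)$ with a particular exponential process (step (a)) then recovers specific bounds such as Seeger's or Catoni's in time-uniform form.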
Related papers
- Better-than-KL PAC-Bayes Bounds [23.87003743389573]
We show that it is possible to achieve a strictly tighter bound with a novel and better-than-KL divergence.
Our result is first-of-its-kind in that existing PAC-Bayes bounds with non-KL divergences are not known to be strictly better than KL.
arXiv Detail & Related papers (2024-02-14T14:33:39Z)
- PAC-Bayes-Chernoff bounds for unbounded losses [9.987130158432755]
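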
We introduce a new PAC-Bayes oracle bound for unbounded losses that extends Cramér-Chernoff bounds to the PAC-Bayesian setting.
Our approach naturally leverages properties of Cramér-Chernoff bounds, such as exact optimization of the free parameter in many PAC-Bayes bounds.
arXiv Detail & Related papers (2024-01-02T10:58:54Z)
- Lower Bounds for Learning in Revealing POMDPs [88.23337313766355]
This paper studies the fundamental limits of reinforcement learning (RL) in the challenging partially observable setting.
For multi-step revealing POMDPs, we show that the latent state-space dependence is at least $\Omega(S^{1.5})$ in the sample complexity.
arXiv Detail & Related papers (2023-02-02T18:59:30Z)
- Bayesian decision-making under misspecified priors with applications to meta-learning [64.38020203019013]
Thompson sampling and other sequential decision-making algorithms are popular approaches to tackle explore/exploit trade-offs in contextual bandits.
We show that performance degrades gracefully with misspecified priors.
arXiv Detail & Related papers (2021-07-03T23:17:26Z)
- How Tight Can PAC-Bayes be in the Small Data Regime? [39.15172162668061]
We investigate how tight PAC-Bayes and test set bounds can be made for small datasets.
We show that PAC-Bayes bounds are surprisingly competitive with comparable, commonly used Chernoff test set bounds.
The sharpest test set bounds still lead to better guarantees on the generalisation error than the PAC-Bayes bounds we consider.
arXiv Detail & Related papers (2021-06-07T12:11:32Z)
- Relative Deviation Margin Bounds [55.22251993239944]
We give two types of learning bounds, both distribution-dependent and valid for general families, in terms of the Rademacher complexity.
We derive distribution-dependent generalization bounds for unbounded loss functions under the assumption of a finite moment.
arXiv Detail & Related papers (2020-06-26T12:37:17Z)
- A Limitation of the PAC-Bayes Framework [32.24251308425503]
We present a limitation of the PAC-Bayes framework.
We demonstrate an easy learning task that is not amenable to a PAC-Bayes analysis.
arXiv Detail & Related papers (2020-06-24T06:36:00Z)
- PAC-Bayes Analysis Beyond the Usual Bounds [16.76187007910588]
We focus on a learning model where the learner observes a finite set of training examples.
The learned data-dependent distribution is then used to make randomized predictions.
arXiv Detail & Related papers (2020-06-23T14:30:24Z)
- Free Energy Wells and Overlap Gap Property in Sparse PCA [81.64027805404483]
We study a variant of the sparse PCA (principal component analysis) problem in the "hard" regime.
We show bounds on the depth of free energy wells for various Gibbs measures naturally associated to the problem.
We prove that the Overlap Gap Property (OGP) holds in a significant part of the hard regime.
arXiv Detail & Related papers (2020-06-18T17:18:02Z)
- Best Arm Identification for Cascading Bandits in the Fixed Confidence Setting [81.70513857417106]
We design and analyze CascadeBAI, an algorithm for finding the best set of $K$ items.
An upper bound on the time complexity of CascadeBAI is derived by overcoming a crucial analytical challenge.
We show, through the derivation of a lower bound on the time complexity, that the performance of CascadeBAI is optimal in some practical regimes.
arXiv Detail & Related papers (2020-01-23T16:47:52Z)
This list is automatically generated from the titles and abstracts of the papers in this site.