How Tight Can PAC-Bayes be in the Small Data Regime?
- URL: http://arxiv.org/abs/2106.03542v1
- Date: Mon, 7 Jun 2021 12:11:32 GMT
- Title: How Tight Can PAC-Bayes be in the Small Data Regime?
- Authors: Andrew Y. K. Foong, Wessel P. Bruinsma, David R. Burt, Richard E.
Turner
- Abstract summary: We investigate how tight PAC-Bayes and test set bounds can be made for small datasets.
We show that PAC-Bayes bounds are surprisingly competitive with comparable, commonly used Chernoff test set bounds.
The sharpest test set bounds still lead to better guarantees on the generalisation error than the PAC-Bayes bounds we consider.
- Score: 39.15172162668061
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we investigate the question: Given a small number of
datapoints, for example N = 30, how tight can PAC-Bayes and test set bounds be
made? For such small datasets, test set bounds adversely affect generalisation
performance by discarding data. In this setting, PAC-Bayes bounds are
especially attractive, due to their ability to use all the data to
simultaneously learn a posterior and bound its generalisation risk. We focus on
the case of i.i.d. data with a bounded loss and consider the generic PAC-Bayes
theorem of Germain et al. (2009) and Bégin et al. (2016). While their theorem
is known to recover many existing PAC-Bayes bounds, it is unclear what the
tightest bound derivable from their framework is. Surprisingly, we show that
for a fixed learning algorithm and dataset, the tightest bound of this form
coincides with the tightest bound of the more restrictive family of bounds
considered in Catoni (2007). In contrast, in the more natural case of
distributions over datasets, we give examples (both analytic and numerical)
showing that the family of bounds in Catoni (2007) can be suboptimal. Within
the proof framework of Germain et al. (2009) and Bégin et al. (2016), we
establish a lower bound on the best bound achievable in expectation, which
recovers the Chernoff test set bound in the case when the posterior is equal to
the prior. Finally, to illustrate how tight these bounds can potentially be, we
study a synthetic one-dimensional classification task in which it is feasible
to meta-learn both the prior and the form of the bound to obtain the tightest
PAC-Bayes and test set bounds possible. We find that in this simple, controlled
scenario, PAC-Bayes bounds are surprisingly competitive with comparable,
commonly used Chernoff test set bounds. However, the sharpest test set bounds
still lead to better guarantees on the generalisation error than the PAC-Bayes
bounds we consider.
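To make the comparison concrete, here is a minimal numerical sketch (not the paper's code) of the two kinds of guarantee: a Chernoff-style test set bound, computed by exact binomial tail inversion, and the widely used PAC-Bayes-kl bound, which states that with probability at least $1 - \delta$, $\mathrm{kl}(\hat{L} \,\|\, L) \le (\mathrm{KL}(Q\|P) + \ln(2\sqrt{N}/\delta))/N$. The error count k = 3 and the divergence KL(Q||P) = 0.5 are assumed values chosen purely for illustration; a real test set bound would also be evaluated on held-out data rather than the training set.

```python
# Hedged sketch: comparing a Chernoff/binomial test set bound with the
# PAC-Bayes-kl bound at N = 30. All inputs are illustrative assumptions,
# not reproductions of the paper's experiments.
import numpy as np
from scipy.stats import beta

def kl_bernoulli(p, q):
    """kl(p || q) between Bernoulli(p) and Bernoulli(q), clamped away from 0/1."""
    eps = 1e-12
    p = min(max(p, eps), 1 - eps)
    q = min(max(q, eps), 1 - eps)
    return p * np.log(p / q) + (1 - p) * np.log((1 - p) / (1 - q))

def kl_inverse_upper(p_hat, c, tol=1e-9):
    """Largest q in [p_hat, 1] with kl(p_hat || q) <= c, found by bisection."""
    lo, hi = p_hat, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if kl_bernoulli(p_hat, mid) <= c:
            lo = mid
        else:
            hi = mid
    return lo

N, delta = 30, 0.05

# Test set bound: exact binomial tail inversion (Clopper-Pearson upper
# limit) after observing k errors on N points.
k = 3
test_set_bound = beta.ppf(1 - delta, k + 1, N - k)

# PAC-Bayes-kl bound: invert kl(emp_risk || true risk) <= rhs, where
# KL(Q||P) = 0.5 is an assumed posterior-prior divergence.
emp_risk, kl_q_p = k / N, 0.5
rhs = (kl_q_p + np.log(2 * np.sqrt(N) / delta)) / N
pac_bayes_bound = kl_inverse_upper(emp_risk, rhs)

print(f"Test set bound:  {test_set_bound:.3f}")
print(f"PAC-Bayes bound: {pac_bayes_bound:.3f}")
```

With these assumed numbers the binomial test set bound comes out tighter than the PAC-Bayes bound (roughly 0.26 versus 0.38), in line with the abstract's conclusion that the sharpest test set bounds still give the better guarantee.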
Related papers
- Better-than-KL PAC-Bayes Bounds [23.87003743389573]
We show that it is possible to achieve a strictly tighter bound with a novel and better-than-KL divergence.
Our result is the first of its kind, in that existing PAC-Bayes bounds with non-KL divergences are not known to be strictly better than the KL-based bound.
arXiv Detail & Related papers (2024-02-14T14:33:39Z)
- PAC-Bayes-Chernoff bounds for unbounded losses [9.987130158432755]
We introduce a new PAC-Bayes oracle bound for unbounded losses that extends Cramér-Chernoff bounds to the PAC-Bayesian setting.
Our approach naturally leverages properties of Cramér-Chernoff bounds, such as exact optimization of the free parameter in many PAC-Bayes bounds; a toy sketch of this kind of free-parameter optimization appears after the related-papers list below.
arXiv Detail & Related papers (2024-01-02T10:58:54Z)
- Closing the Gap Between the Upper Bound and the Lower Bound of Adam's Iteration Complexity [51.96093077151991]
We derive a new convergence guarantee for Adam, with only an $L$-smooth condition and a bounded noise variance assumption.
Our proof utilizes novel techniques to handle the entanglement between momentum and adaptive learning rate.
arXiv Detail & Related papers (2023-10-27T09:16:58Z)
- PAC Prediction Sets Under Label Shift [52.30074177997787]
Prediction sets capture uncertainty by predicting sets of labels rather than individual labels.
We propose a novel algorithm for constructing prediction sets with PAC guarantees in the label shift setting.
We evaluate our approach on five datasets.
arXiv Detail & Related papers (2023-10-19T17:57:57Z)
- A unified recipe for deriving (time-uniform) PAC-Bayes bounds [31.921092049934654]
We present a unified framework for deriving PAC-Bayesian generalization bounds.
Our bounds are anytime-valid (i.e., time-uniform), meaning that they hold at all stopping times.
arXiv Detail & Related papers (2023-02-07T12:11:59Z)
- Lower Bounds for Learning in Revealing POMDPs [88.23337313766355]
This paper studies the fundamental limits of reinforcement learning (RL) in the challenging partially observable setting.
For multi-step revealing POMDPs, we show that the latent state-space dependence is at least $\Omega(S^{1.5})$ in the sample complexity.
arXiv Detail & Related papers (2023-02-02T18:59:30Z)
- Geometry of Banach spaces: a new route towards Position Based Cryptography [65.51757376525798]
We study Position Based Quantum Cryptography (PBQC) from the perspective of geometric functional analysis and its connections with quantum games.
The main question we are interested in asks for the optimal amount of entanglement that a coalition of attackers have to share in order to compromise the security of any PBQC protocol.
We show that understanding the type properties of some more involved Banach spaces would allow us to drop these assumptions, leading to unconditional lower bounds on the resources needed to attack our protocol.
arXiv Detail & Related papers (2021-03-30T13:55:11Z)
- A Limitation of the PAC-Bayes Framework [32.24251308425503]
We present a limitation of the PAC-Bayes framework.
We demonstrate an easy learning task that is not amenable to a PAC-Bayes analysis.
arXiv Detail & Related papers (2020-06-24T06:36:00Z)
- PAC-Bayes Analysis Beyond the Usual Bounds [16.76187007910588]
We focus on a learning model where the learner observes a finite set of training examples.
The learned data-dependent distribution is then used to make randomized predictions.
arXiv Detail & Related papers (2020-06-23T14:30:24Z)
- Best Arm Identification for Cascading Bandits in the Fixed Confidence Setting [81.70513857417106]
We design and analyze CascadeBAI, an algorithm for finding the best set of $K$ items.
An upper bound on the time complexity of CascadeBAI is derived by overcoming a crucial analytical challenge.
We show, through the derivation of a lower bound on the time complexity, that the performance of CascadeBAI is optimal in some practical regimes.
arXiv Detail & Related papers (2020-01-23T16:47:52Z)
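As referenced in the PAC-Bayes-Chernoff entry above, here is a minimal sketch of free-parameter optimization for a linear, Catoni-style bound of the family discussed in the abstract. It uses the commonly cited PAC-Bayes-$\lambda$ form of Thiemann et al. (2017), $L(Q) \le \hat{L}(Q)/(1-\lambda/2) + (\mathrm{KL}(Q\|P) + \ln(2\sqrt{N}/\delta))/(N\lambda(1-\lambda/2))$, valid for each fixed $\lambda \in (0, 2)$. The empirical risk and KL(Q||P) values are assumptions for illustration, and a rigorous data-dependent choice of $\lambda$ requires a union bound over a grid.

```python
# Hedged sketch: optimizing the free parameter lambda in a Catoni-style
# (PAC-Bayes-lambda) bound. All inputs are illustrative assumptions.
import numpy as np

def lambda_bound(emp_risk, kl_q_p, n, delta, lam):
    # Complexity term shared by all lambda values.
    complexity = (kl_q_p + np.log(2 * np.sqrt(n) / delta)) / n
    return emp_risk / (1 - lam / 2) + complexity / (lam * (1 - lam / 2))

N, delta = 30, 0.05
emp_risk, kl_q_p = 0.1, 0.5  # assumed empirical risk and KL(Q||P)

# For each fixed lambda the bound holds with probability 1 - delta; here
# we simply scan a fine grid to locate the minimizer (a closed form exists).
lams = np.linspace(1e-3, 2 - 1e-3, 10_000)
vals = lambda_bound(emp_risk, kl_q_p, N, delta, lams)
best = lams[np.argmin(vals)]
print(f"optimal lambda ~ {best:.3f}, optimized bound ~ {vals.min():.3f}")
```

With these assumed numbers the optimized bound comes out around 0.58; the PAC-Bayes-$\lambda$ form is a relaxation of the kl bound in the earlier sketch, so some looseness relative to it is expected.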