PAC-Bayes-Chernoff bounds for unbounded losses
- URL: http://arxiv.org/abs/2401.01148v4
- Date: Wed, 30 Oct 2024 11:49:21 GMT
- Title: PAC-Bayes-Chernoff bounds for unbounded losses
- Authors: Ioar Casado, Luis A. Ortega, Aritz Pérez, Andrés R. Masegosa
- Abstract summary: We introduce a new PAC-Bayes oracle bound for unbounded losses that extends Cramér-Chernoff bounds to the PAC-Bayesian setting.
Our approach naturally leverages properties of Cramér-Chernoff bounds, such as exact optimization of the free parameter in many PAC-Bayes bounds.
- Score: 9.987130158432755
- License:
- Abstract: We introduce a new PAC-Bayes oracle bound for unbounded losses that extends Cramér-Chernoff bounds to the PAC-Bayesian setting. The proof technique relies on controlling the tails of certain random variables involving the Cramér transform of the loss. Our approach naturally leverages properties of Cramér-Chernoff bounds, such as exact optimization of the free parameter in many PAC-Bayes bounds. We highlight several applications of the main theorem. Firstly, we show that our bound recovers and generalizes previous results. Additionally, our approach allows working with richer assumptions that result in more informative and potentially tighter bounds. In this direction, we provide a general bound under a new *model-dependent* assumption from which we obtain bounds based on parameter norms and log-Sobolev inequalities. Notably, many of these bounds can be minimized to obtain distributions beyond the Gibbs posterior and provide novel theoretical coverage to existing regularization techniques.
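For context, a minimal background sketch (standard material, not reproduced from the paper; all notation below is ours) of the Cramér-Chernoff machinery the abstract refers to, together with a fixed-parameter PAC-Bayes bound of the kind whose free parameter the paper optimizes exactly:

```latex
% Standard background; notation ours. Cumulant generating function of the
% centered loss and its Cramér transform:
\[
  \psi(\lambda) = \log \mathbb{E}\!\left[ e^{\lambda(\ell - \mathbb{E}[\ell])} \right],
  \qquad
  \psi^{*}(t) = \sup_{\lambda > 0} \bigl\{ \lambda t - \psi(\lambda) \bigr\}.
\]
% The Chernoff bound controls the tail through the Cramér transform:
\[
  \Pr\bigl( \ell - \mathbb{E}[\ell] \ge t \bigr) \le e^{-\psi^{*}(t)}.
\]
% An Alquier-style PAC-Bayes bound with free parameter \lambda > 0: for each
% fixed \lambda, with probability at least 1 - \delta, simultaneously for all
% posteriors \rho,
\[
  \mathbb{E}_{\rho}[L(h)] \le \mathbb{E}_{\rho}[\hat{L}_{n}(h)]
  + \frac{ \mathrm{KL}(\rho \,\|\, \pi) + \log\frac{1}{\delta} + \psi_{\pi,n}(\lambda) }{ \lambda },
  \qquad
  \psi_{\pi,n}(\lambda) = \log \mathbb{E}_{\pi} \mathbb{E}_{S}\!\left[ e^{\lambda ( L(h) - \hat{L}_{n}(h) )} \right].
\]
```

Because such a bound holds only for each fixed λ, optimizing λ against the data-dependent KL term ordinarily costs a union bound over a grid; per the abstract, the Cramér-Chernoff route makes this optimization exact.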
Related papers
- Tighter Generalisation Bounds via Interpolation [16.74864438507713]
We present a recipe for deriving new PAC-Bayes generalisation bounds based on the $(f, \Gamma)$-divergence.
We also present PAC-Bayes generalisation bounds where we interpolate between a series of probability divergences.
arXiv Detail & Related papers (2024-02-07T18:55:22Z)
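Bounds of this family ultimately rest on a change-of-measure inequality; for the KL case this is the Donsker-Varadhan variational formula (standard background, which the $(f, \Gamma)$-divergence recipe above generalizes):

```latex
% Donsker-Varadhan change of measure (standard): for all measurable g with
% the expectations below finite,
\[
  \log \mathbb{E}_{\pi}\!\left[ e^{g(h)} \right]
  = \sup_{\rho} \bigl\{ \mathbb{E}_{\rho}[g(h)] - \mathrm{KL}(\rho \,\|\, \pi) \bigr\}
  \;\Longrightarrow\;
  \mathbb{E}_{\rho}[g(h)] \le \mathrm{KL}(\rho \,\|\, \pi)
  + \log \mathbb{E}_{\pi}\!\left[ e^{g(h)} \right].
\]
```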
- More PAC-Bayes bounds: From bounded losses, to losses with general tail behaviors, to anytime validity [27.87324770020133]
We present new high-probability PAC-Bayes bounds for different types of losses.
For losses with a bounded range, we recover a strengthened version of Catoni's bound that holds uniformly for all parameter values.
For losses with more general tail behaviors, we introduce two new parameter-free bounds.
arXiv Detail & Related papers (2023-06-21T12:13:46Z)
- A unified recipe for deriving (time-uniform) PAC-Bayes bounds [31.921092049934654]
We present a unified framework for deriving PAC-Bayesian generalization bounds.
Our bounds are anytime-valid (i.e., time-uniform), meaning that they hold at all stopping times.
arXiv Detail & Related papers (2023-02-07T12:11:59Z)
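The anytime validity referred to above is usually obtained by replacing Markov's inequality with Ville's inequality for nonnegative supermartingales (standard statement, not quoted from the paper):

```latex
% Ville's inequality: if (M_t) is a nonnegative supermartingale with M_0 = 1,
\[
  \Pr\!\left( \exists\, t \ge 0 : M_t \ge \tfrac{1}{\delta} \right) \le \delta ,
\]
% so a bound derived from M_t holds simultaneously at all stopping times.
```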
- On the Importance of Gradient Norm in PAC-Bayesian Bounds [92.82627080794491]
We propose a new generalization bound that exploits the contractivity of the log-Sobolev inequalities.
We empirically analyze the effect of this new loss-gradient norm term on different neural architectures.
arXiv Detail & Related papers (2022-10-12T12:49:20Z)
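For reference, the Gaussian log-Sobolev inequality behind such gradient-norm terms reads (standard form; the paper's precise assumptions may differ):

```latex
% Log-Sobolev inequality for the standard Gaussian measure \gamma on R^d:
\[
  \mathrm{Ent}_{\gamma}\!\left( f^{2} \right)
  := \mathbb{E}_{\gamma}\!\left[ f^{2} \log f^{2} \right]
     - \mathbb{E}_{\gamma}\!\left[ f^{2} \right] \log \mathbb{E}_{\gamma}\!\left[ f^{2} \right]
  \;\le\; 2\, \mathbb{E}_{\gamma}\!\left[ \| \nabla f \|^{2} \right].
\]
```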
- Bayesian decision-making under misspecified priors with applications to meta-learning [64.38020203019013]
Thompson sampling and other sequential decision-making algorithms are popular approaches to tackle explore/exploit trade-offs in contextual bandits.
We show that performance degrades gracefully with misspecified priors.
arXiv Detail & Related papers (2021-07-03T23:17:26Z)
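A minimal Thompson sampling sketch for Bernoulli bandits (illustrative only; the Beta prior below is a hypothetical stand-in and can be deliberately misspecified relative to the true arm means, which is the regime this paper analyzes):

```python
# Thompson sampling for Bernoulli bandits with a Beta prior. The `prior`
# hyperparameters are a hypothetical choice; setting them far from the true
# arm means simulates the misspecified-prior regime.
import numpy as np

def thompson_sampling(true_means, n_rounds=10_000, prior=(1.0, 1.0), rng=None):
    rng = rng or np.random.default_rng(0)
    k = len(true_means)
    alpha = np.full(k, prior[0])  # Beta posterior "successes" per arm
    beta = np.full(k, prior[1])   # Beta posterior "failures" per arm
    best, regret = max(true_means), 0.0
    for _ in range(n_rounds):
        arm = int(np.argmax(rng.beta(alpha, beta)))  # sample each posterior, play the argmax
        reward = rng.random() < true_means[arm]      # Bernoulli reward draw
        alpha[arm] += reward
        beta[arm] += 1 - reward
        regret += best - true_means[arm]
    return regret

print(thompson_sampling([0.3, 0.5, 0.7]))  # cumulative regret over 10k rounds
```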
- Tighter expected generalization error bounds via Wasserstein distance [23.52237892358981]
We introduce several expected generalization error bounds based on the Wasserstein distance.
We present full-dataset, single-letter, and random-subset bounds in both the standard setting and the randomized-subsample setting.
We show how various new bounds based on different information measures can be derived from the presented bounds.
arXiv Detail & Related papers (2021-01-22T20:13:59Z)
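The 1-Wasserstein distance used in these bounds admits the Kantorovich-Rubinstein dual form (standard fact), which is what makes it directly comparable to expectation-based generalization gaps:

```latex
% Kantorovich-Rubinstein duality for the 1-Wasserstein distance:
\[
  W_{1}(\mu, \nu) = \sup_{\| f \|_{\mathrm{Lip}} \le 1}
  \bigl\{ \mathbb{E}_{\mu}[ f ] - \mathbb{E}_{\nu}[ f ] \bigr\}.
\]
```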
- On Lower Bounds for Standard and Robust Gaussian Process Bandit Optimization [55.937424268654645]
We consider algorithm-independent lower bounds for the problem of black-box optimization of functions having a bounded norm.
We provide a novel proof technique for deriving lower bounds on the regret, with benefits including simplicity, versatility, and an improved dependence on the error probability.
arXiv Detail & Related papers (2020-08-20T03:48:14Z)
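The quantity being lower-bounded is the usual cumulative regret (standard definition; notation ours):

```latex
% Cumulative regret of queries x_1, ..., x_T against a black-box function f
% with bounded (RKHS) norm:
\[
  R_{T} = \sum_{t=1}^{T} \left( \max_{x \in \mathcal{X}} f(x) - f(x_t) \right).
\]
```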
- Relative Deviation Margin Bounds [55.22251993239944]
We give two types of learning bounds, both distribution-dependent and valid for general families, in terms of the Rademacher complexity.
We derive distribution-dependent generalization bounds for unbounded loss functions under the assumption of a finite moment.
arXiv Detail & Related papers (2020-06-26T12:37:17Z)
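For reference, the empirical Rademacher complexity in which these bounds are expressed (standard definition):

```latex
% Empirical Rademacher complexity of a class F on a sample S = (z_1, ..., z_n),
% with i.i.d. signs \sigma_i uniform on {-1, +1}:
\[
  \widehat{\mathfrak{R}}_{S}(\mathcal{F})
  = \mathbb{E}_{\sigma}\!\left[ \sup_{f \in \mathcal{F}}
    \frac{1}{n} \sum_{i=1}^{n} \sigma_i f(z_i) \right].
\]
```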
- Free Energy Wells and Overlap Gap Property in Sparse PCA [81.64027805404483]
We study a variant of the sparse PCA (principal component analysis) problem in the "hard" regime.
We show bounds on the depth of free energy wells for various Gibbs measures naturally associated to the problem.
We prove that the Overlap Gap Property (OGP) holds in a significant part of the hard regime.
arXiv Detail & Related papers (2020-06-18T17:18:02Z)
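The Gibbs measures in question take the standard Boltzmann form (background; notation ours, with H a Hamiltonian built from the sparse-PCA objective):

```latex
% Gibbs measure at inverse temperature \beta and the associated free energy:
\[
  \mu_{\beta}(\sigma) = \frac{ e^{ -\beta H(\sigma) } }{ Z_{\beta} },
  \qquad
  Z_{\beta} = \sum_{\sigma} e^{ -\beta H(\sigma) },
  \qquad
  F_{\beta} = -\frac{1}{\beta} \log Z_{\beta}.
\]
```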
- PAC-Bayes unleashed: generalisation bounds with unbounded losses [12.078257783674923]
We present new PAC-Bayesian generalisation bounds for learning problems with unbounded loss functions.
This extends the relevance and applicability of the PAC-Bayes learning framework.
arXiv Detail & Related papers (2020-06-12T15:55:46Z)
- Lower bounds in multiple testing: A framework based on derandomized proxies [107.69746750639584]
This paper introduces an analysis strategy based on derandomization, illustrated by applications to various concrete models.
We provide numerical simulations of some of these lower bounds, and show a close relation to the actual performance of the Benjamini-Hochberg (BH) algorithm.
arXiv Detail & Related papers (2020-05-07T19:59:51Z)
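A minimal sketch of the Benjamini-Hochberg step-up procedure itself (standard algorithm; illustrative, not the paper's derandomized-proxy analysis):

```python
# Benjamini-Hochberg step-up procedure: returns a boolean mask of rejected
# hypotheses at FDR level q.
import numpy as np

def benjamini_hochberg(pvals, q=0.05):
    p = np.asarray(pvals, dtype=float)
    m = p.size
    order = np.argsort(p)                       # indices of p-values, ascending
    thresholds = q * (np.arange(1, m + 1) / m)  # step-up thresholds k*q/m
    below = p[order] <= thresholds
    if not below.any():
        return np.zeros(m, dtype=bool)          # nothing rejected
    k = np.max(np.nonzero(below)[0])            # largest k with p_(k) <= k*q/m
    reject = np.zeros(m, dtype=bool)
    reject[order[: k + 1]] = True               # reject the k+1 smallest p-values
    return reject

print(benjamini_hochberg([0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.3, 1.0]))
```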