Tighter Generalisation Bounds via Interpolation
- URL: http://arxiv.org/abs/2402.05101v1
- Date: Wed, 7 Feb 2024 18:55:22 GMT
- Title: Tighter Generalisation Bounds via Interpolation
- Authors: Paul Viallard, Maxime Haddouche, Umut Şimşekli, Benjamin Guedj
- Abstract summary: We present a recipe for deriving new PAC-Bayes generalisation bounds based on the $(f, \Gamma)$-divergence.
We also present PAC-Bayes generalisation bounds where we interpolate between a series of probability divergences.
- Score: 16.74864438507713
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: This paper contains a recipe for deriving new PAC-Bayes generalisation bounds
based on the $(f, \Gamma)$-divergence, and, in addition, presents PAC-Bayes
generalisation bounds where we interpolate between a series of probability
divergences (including but not limited to KL, Wasserstein, and total
variation), making the best out of many worlds depending on the posterior
distribution's properties. We explore the tightness of these bounds and connect
them to earlier results from statistical learning, which arise as special cases. We
also instantiate our bounds as training objectives, yielding non-trivial
guarantees and practical performance.
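For context, here is the classical KL-based PAC-Bayes bound that such recipes generalise (the McAllester/Maurer form for losses in $[0,1]$; a standard statement in our notation, not a formula quoted from the paper). With prior $\pi$, posterior $\rho$, sample $S$ of size $n$, true risk $R$ and empirical risk $\hat{R}_S$, with probability at least $1-\delta$ over $S$, simultaneously for all $\rho$:
$$\mathbb{E}_{h \sim \rho}[R(h)] \;\le\; \mathbb{E}_{h \sim \rho}[\hat{R}_S(h)] + \sqrt{\frac{\mathrm{KL}(\rho \,\|\, \pi) + \ln(2\sqrt{n}/\delta)}{2n}}.$$
The interpolation idea is, roughly, to replace $\mathrm{KL}(\rho \,\|\, \pi)$ with an $(f, \Gamma)$-divergence, a family that spans f-divergences (such as KL) at one end and integral probability metrics (such as total variation and Wasserstein) at the other, so the complexity term can follow whichever divergence is finite or smallest for the posterior at hand.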
Related papers
- Information Capacity Regret Bounds for Bandits with Mediator Feedback [55.269551124587224]
We introduce the policy set capacity as an information-theoretic measure for the complexity of the policy set.
Adopting the classical EXP4 algorithm, we provide new regret bounds depending on the policy set capacity.
For a selection of policy set families, we prove nearly-matching lower bounds, scaling similarly with the capacity.
arXiv Detail & Related papers (2024-02-15T19:18:47Z)
- PAC-Bayes-Chernoff bounds for unbounded losses [9.987130158432755]
We introduce a new PAC-Bayes oracle bound for unbounded losses that extends Cramér-Chernoff bounds to the PAC-Bayesian setting.
Our approach naturally leverages properties of Cramér-Chernoff bounds, such as exact optimization of the free parameter in many PAC-Bayes bounds.
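To illustrate the free parameter in question, a standard $\lambda$-parameterised PAC-Bayes bound (an Alquier-style form for losses in $[0,1]$; a sketch in our notation, not a formula from the paper) reads: for any fixed $\lambda > 0$, with probability at least $1-\delta$,
$$\mathbb{E}_{h \sim \rho}[R(h)] \;\le\; \mathbb{E}_{h \sim \rho}[\hat{R}_S(h)] + \frac{\mathrm{KL}(\rho \,\|\, \pi) + \ln(1/\delta)}{\lambda} + \frac{\lambda}{8n}.$$
The unconstrained minimiser $\lambda^* = \sqrt{8n(\mathrm{KL}(\rho\|\pi) + \ln(1/\delta))}$ would give a $\sqrt{(\mathrm{KL}(\rho\|\pi) + \ln(1/\delta))/(2n)}$ rate, but $\lambda$ must be fixed before seeing the data, so exact optimisation is normally paid for with a union bound over a grid of $\lambda$ values; avoiding that cost is the point of the Cramér-Chernoff approach.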
arXiv Detail & Related papers (2024-01-02T10:58:54Z)
- A unified framework for information-theoretic generalization bounds [8.04975023021212]
This paper presents a general methodology for deriving information-theoretic generalization bounds for learning algorithms.
The main technical tool is a probabilistic decorrelation lemma based on a change of measure and a relaxation of Young's inequality in $L_{\psi_p}$ Orlicz spaces.
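For reference, the Orlicz norms involved are defined in the standard way (a textbook definition, not specific to this paper): with $\psi_p(x) = e^{x^p} - 1$ for $p \ge 1$,
$$\|X\|_{\psi_p} = \inf\big\{ c > 0 : \mathbb{E}[\psi_p(|X|/c)] \le 1 \big\},$$
so that finite $\|X\|_{\psi_1}$ corresponds to sub-exponential tails and finite $\|X\|_{\psi_2}$ to sub-Gaussian tails.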
arXiv Detail & Related papers (2023-05-18T15:36:20Z)
- Instance-Dependent Generalization Bounds via Optimal Transport [51.71650746285469]
Existing generalization bounds fail to explain crucial factors that drive the generalization of modern neural networks.
We derive instance-dependent generalization bounds that depend on the local Lipschitz regularity of the learned prediction function in the data space.
We empirically analyze our generalization bounds for neural networks, showing that the bound values are meaningful and capture the effect of popular regularization methods during training.
arXiv Detail & Related papers (2022-11-02T16:39:42Z)
- On the Importance of Gradient Norm in PAC-Bayesian Bounds [92.82627080794491]
We propose a new generalization bound that exploits the contractivity of the log-Sobolev inequalities.
We empirically analyze the effect of this new loss-gradient norm term on different neural architectures.
arXiv Detail & Related papers (2022-10-12T12:49:20Z)
- Integral Probability Metrics PAC-Bayes Bounds [22.534592918803813]
We present a PAC-Bayes-style generalization bound which enables the replacement of the KL divergence with a variety of Integral Probability Metrics (IPMs).
A notable feature of the obtained bounds is that they naturally interpolate, recovering classical uniform convergence bounds in the worst case.
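For reference, the IPM induced by a function class $\Gamma$ is defined (standardly, not specifically in this paper) as
$$\gamma_\Gamma(\rho, \pi) = \sup_{g \in \Gamma} \big| \mathbb{E}_{h \sim \rho}[g(h)] - \mathbb{E}_{h \sim \pi}[g(h)] \big|;$$
taking $\Gamma$ to be the 1-Lipschitz functions gives the Wasserstein-1 distance, while taking functions bounded by 1 gives (up to a factor of 2) the total variation distance, which is what allows a single bound template to swap in different divergences.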
arXiv Detail & Related papers (2022-07-01T18:22:44Z)
- Controlling Multiple Errors Simultaneously with a PAC-Bayes Bound [21.273964864852612]
We provide the first PAC-Bayes bound capable of providing rich information by bounding the Kullback-Leibler divergence between the empirical and true probabilities of a set of M error types.
Our bound is especially useful in cases where the severity of different misclassifications may change over time.
arXiv Detail & Related papers (2022-02-11T11:35:21Z)
- Bayesian decision-making under misspecified priors with applications to meta-learning [64.38020203019013]
Thompson sampling and other sequential decision-making algorithms are popular approaches to tackle explore/exploit trade-offs in contextual bandits.
We show that performance degrades gracefully with misspecified priors.
arXiv Detail & Related papers (2021-07-03T23:17:26Z)
- Learning Stochastic Majority Votes by Minimizing a PAC-Bayes Generalization Bound [15.557653926558638]
We investigate a counterpart of majority votes over finite ensembles of classifiers, and study its generalization properties.
We instantiate it with Dirichlet distributions: this allows for a closed-form and differentiable expression for the expected risk.
The resulting majority vote learning algorithm achieves state-of-the-art accuracy and benefits from (non-vacuous) tight bounds.
arXiv Detail & Related papers (2021-06-23T16:57:23Z)
- An Online Learning Approach to Interpolation and Extrapolation in Domain Generalization [53.592597682854944]
We recast generalization over sub-groups as an online game between a player minimizing risk and an adversary presenting new test distributions.
We show that ERM is provably minimax-optimal for both tasks.
arXiv Detail & Related papers (2021-02-25T19:06:48Z)
- Relative Deviation Margin Bounds [55.22251993239944]
We give two types of learning bounds, both distribution-dependent and valid for general families, in terms of the Rademacher complexity.
We derive distribution-dependent generalization bounds for unbounded loss functions under the assumption of a finite moment.
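For reference, the empirical Rademacher complexity of a class $F$ on a sample $S = (x_1, \dots, x_n)$ is the standard quantity
$$\hat{\mathfrak{R}}_S(F) = \mathbb{E}_{\sigma}\Big[ \sup_{f \in F} \frac{1}{n} \sum_{i=1}^{n} \sigma_i f(x_i) \Big],$$
where the $\sigma_i$ are independent uniform $\pm 1$ signs; margin bounds of this type typically scale as $\hat{\mathfrak{R}}_S(F)/\theta$ for a margin parameter $\theta$.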
arXiv Detail & Related papers (2020-06-26T12:37:17Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.