Leveraging PAC-Bayes Theory and Gibbs Distributions for Generalization
Bounds with Complexity Measures
- URL: http://arxiv.org/abs/2402.13285v1
- Date: Mon, 19 Feb 2024 10:15:11 GMT
- Title: Leveraging PAC-Bayes Theory and Gibbs Distributions for Generalization
Bounds with Complexity Measures
- Authors: Paul Viallard, Rémi Emonet, Amaury Habrard, Emilie Morvant,
Valentina Zantedeschi
- Abstract summary: We leverage the framework of disintegrated PAC-Bayes bounds to derive a general generalization bound instantiable with arbitrary complexity measures.
Our bound stands in probability jointly over the hypothesis and the learning sample, which allows the complexity to be adapted to the generalization gap.
- Score: 14.408127155170774
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In statistical learning theory, a generalization bound usually involves a
complexity measure imposed by the considered theoretical framework. This limits
the scope of such bounds, as other forms of capacity measures or
regularizations are used in algorithms. In this paper, we leverage the
framework of disintegrated PAC-Bayes bounds to derive a general generalization
bound instantiable with arbitrary complexity measures. One trick to prove such
a result involves considering a commonly used family of distributions: the
Gibbs distributions. Our bound stands in probability jointly over the
hypothesis and the learning sample, which allows the complexity to be adapted
to the generalization gap as it can be customized to fit both the hypothesis
class and the task.
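As a rough illustration of the abstract's central ingredient, the following minimal Python sketch builds a Gibbs distribution over a small finite hypothesis class, with the density weighted by a user-chosen complexity measure, and draws a single hypothesis from it, as in disintegrated PAC-Bayes analyses where the guarantee holds jointly for the sampled hypothesis and the learning sample. The hypothesis class, sample, complexity measure, and temperature beta below are all illustrative assumptions and do not reproduce the paper's actual bound.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setup: a finite hypothesis class of threshold classifiers on [0, 1]
# and a small binary-classification sample S (all names are illustrative).
thresholds = np.linspace(0.0, 1.0, 51)          # hypothesis class H
x = rng.uniform(0.0, 1.0, size=100)             # learning sample S
y = (x > 0.42).astype(int)                      # noiseless labels for the example

def empirical_risk(t):
    """0-1 empirical risk of the threshold classifier h_t(x) = 1[x > t]."""
    return np.mean((x > t).astype(int) != y)

# A user-chosen complexity measure mu(h, S); here an arbitrary illustrative
# choice (distance of the threshold from 1/2), standing in for whatever
# capacity measure or regularizer one wants to plug in.
def complexity(t):
    return np.abs(t - 0.5)

# Gibbs distribution over H: rho(h) proportional to exp(-beta * mu(h, S)).
beta = 10.0
weights = np.exp(-beta * np.array([complexity(t) for t in thresholds]))
gibbs = weights / weights.sum()

# Draw a single hypothesis from the Gibbs posterior, as in disintegrated
# PAC-Bayes analyses where the statement holds for the sampled hypothesis.
h = rng.choice(thresholds, p=gibbs)
print(f"sampled threshold: {h:.2f}, empirical risk: {empirical_risk(h):.3f}, "
      f"complexity term: {complexity(h):.3f}")
```

Choosing a larger beta concentrates the Gibbs distribution on low-complexity hypotheses, which is the mechanism that lets the complexity measure shape which hypothesis the guarantee is stated for.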
Related papers
- Class-wise Generalization Error: an Information-Theoretic Analysis [22.877440350595222]
We study the class-generalization error, which quantifies the generalization performance of each individual class.
We empirically validate our proposed bounds in different neural networks and show that they accurately capture the complex class-generalization error behavior.
arXiv Detail & Related papers (2024-01-05T17:05:14Z)
- Generalization Guarantees via Algorithm-dependent Rademacher Complexity [33.408679482592426]
We propose a measure to control generalization error, which is the empirical Rademacher complexity of an algorithm- and data-dependent hypothesis class; a minimal Monte Carlo sketch of the standard empirical Rademacher complexity appears after this list.
We obtain novel bounds based on the finite fractal dimension, which (a) extend previous fractal-type bounds from continuous to finite hypothesis classes, and (b) avoid a mutual information term.
arXiv Detail & Related papers (2023-07-04T18:37:38Z)
- A unified framework for information-theoretic generalization bounds [8.04975023021212]
This paper presents a general methodology for deriving information-theoretic generalization bounds for learning algorithms.
The main technical tool is a probabilistic decorrelation lemma based on a change of measure and a relaxation of Young's inequality in $L_{\psi_p}$ Orlicz spaces.
arXiv Detail & Related papers (2023-05-18T15:36:20Z)
- On the Importance of Gradient Norm in PAC-Bayesian Bounds [92.82627080794491]
We propose a new generalization bound that exploits the contractivity of the log-Sobolev inequalities.
We empirically analyze the effect of this new loss-gradient norm term on different neural architectures.
arXiv Detail & Related papers (2022-10-12T12:49:20Z)
- Chained Generalisation Bounds [26.043342234937747]
We derive upper bounds for the expected generalisation error of supervised learning algorithms by means of the chaining technique.
We establish a duality between generalisation bounds based on the regularity of the loss function, and their chained counterparts.
arXiv Detail & Related papers (2022-03-02T09:34:36Z)
- An Online Learning Approach to Interpolation and Extrapolation in Domain Generalization [53.592597682854944]
We recast generalization over sub-groups as an online game between a player minimizing risk and an adversary presenting new test distributions.
We show that ERM is provably minimax-optimal for both tasks.
arXiv Detail & Related papers (2021-02-25T19:06:48Z)
- Dimension Free Generalization Bounds for Non Linear Metric Learning [61.193693608166114]
We provide uniform generalization bounds for two regimes: the sparse regime and a non-sparse regime.
We show that by relying on a different, new property of the solutions, it is still possible to provide dimension free generalization guarantees.
arXiv Detail & Related papers (2021-02-07T14:47:00Z)
- In Search of Robust Measures of Generalization [79.75709926309703]
We develop bounds on generalization error, optimization error, and excess risk.
When evaluated empirically, most of these bounds are numerically vacuous.
We argue that generalization measures should instead be evaluated within the framework of distributional robustness.
arXiv Detail & Related papers (2020-10-22T17:54:25Z)
- Generalized Sliced Distances for Probability Distributions [47.543990188697734]
We introduce a broad family of probability metrics, coined as Generalized Sliced Probability Metrics (GSPMs).
GSPMs are rooted in the generalized Radon transform and come with a unique geometric interpretation; a sketch of the plain sliced Wasserstein distance, a simple sliced metric of this kind, appears after this list.
We consider GSPM-based gradient flows for generative modeling applications and show that under mild assumptions, the gradient flow converges to the global optimum.
arXiv Detail & Related papers (2020-02-28T04:18:00Z)
- Generalised Lipschitz Regularisation Equals Distributional Robustness [47.44261811369141]
We give a very general equality result regarding the relationship between distributional robustness and regularisation.
We show a new result explicating the connection between adversarial learning and distributional robustness.
arXiv Detail & Related papers (2020-02-11T04:19:43Z)
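The entry above on algorithm-dependent Rademacher complexity refers to the following sketch: a standard Monte Carlo estimate of the empirical Rademacher complexity of a small finite hypothesis class. The data, hypothesis class, and number of sign draws are illustrative assumptions; the sketch shows the textbook quantity, not that paper's algorithm- and data-dependent construction.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy data and a finite hypothesis class of threshold classifiers with
# outputs in {-1, +1} (all names are illustrative).
x = rng.uniform(0.0, 1.0, size=50)
thresholds = np.linspace(0.0, 1.0, 21)
predictions = np.where(x[None, :] > thresholds[:, None], 1.0, -1.0)  # |H| x n

def empirical_rademacher(preds, n_draws=2000):
    """Monte Carlo estimate of the empirical Rademacher complexity
    R_S(H) = E_sigma[ sup_{h in H} (1/n) sum_i sigma_i h(x_i) ]."""
    n = preds.shape[1]
    total = 0.0
    for _ in range(n_draws):
        sigma = rng.choice([-1.0, 1.0], size=n)   # Rademacher signs
        total += np.max(preds @ sigma) / n        # sup over the finite class
    return total / n_draws

print(f"estimated empirical Rademacher complexity: {empirical_rademacher(predictions):.3f}")
```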
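Likewise, for the entry on Generalized Sliced Probability Metrics, the sketch below estimates the plain sliced 1-Wasserstein distance, whose slices are ordinary linear projections; the GSPM family generalizes this by replacing the linear projection with a generalized Radon transform. The point clouds, the number of projections, and the equal-sample-size shortcut for the one-dimensional distance are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)

def sliced_wasserstein_1(X, Y, n_projections=500):
    """Monte Carlo estimate of the sliced 1-Wasserstein distance between two
    empirical distributions in R^d, using linear projections as slices."""
    d = X.shape[1]
    total = 0.0
    for _ in range(n_projections):
        theta = rng.normal(size=d)
        theta /= np.linalg.norm(theta)        # random direction on the sphere
        px, py = np.sort(X @ theta), np.sort(Y @ theta)
        total += np.mean(np.abs(px - py))     # 1D W1 between sorted projections
    return total / n_projections

# Two toy point clouds with the same number of points (required by the
# sorted-projection shortcut used above).
X = rng.normal(loc=0.0, size=(200, 3))
Y = rng.normal(loc=1.0, size=(200, 3))
print(f"estimated sliced W1: {sliced_wasserstein_1(X, Y):.3f}")
```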