Robust PAC$^m$: Training Ensemble Models Under Misspecification and
Outliers
- URL: http://arxiv.org/abs/2203.01859v3
- Date: Sun, 23 Apr 2023 15:12:44 GMT
- Title: Robust PAC$^m$: Training Ensemble Models Under Misspecification and
Outliers
- Authors: Matteo Zecchin, Sangwoo Park, Osvaldo Simeone, Marios Kountouris,
David Gesbert
- Abstract summary: PAC-Bayes theory demonstrates that the free energy criterion minimized by Bayesian learning is a bound on the generalization error for Gibbs predictors.
This work presents a novel robust free energy criterion that combines the generalized logarithm score function with PAC$^m$ ensemble bounds.
- Score: 46.38465729190199
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Standard Bayesian learning is known to have suboptimal generalization
capabilities under misspecification and in the presence of outliers. PAC-Bayes
theory demonstrates that the free energy criterion minimized by Bayesian
learning is a bound on the generalization error for Gibbs predictors (i.e., for
single models drawn at random from the posterior) under the assumption of
sampling distributions uncontaminated by outliers. This viewpoint provides a
justification for the limitations of Bayesian learning when the model is
misspecified, requiring ensembling, and when data is affected by outliers. In
recent work, PAC-Bayes bounds -- referred to as PAC$^m$ -- were derived to
introduce free energy metrics that account for the performance of ensemble
predictors, obtaining enhanced performance under misspecification. This work
presents a novel robust free energy criterion that combines the generalized
logarithm score function with PAC$^m$ ensemble bounds. The proposed free energy
training criterion produces predictive distributions that are able to
concurrently counteract the detrimental effects of misspecification -- with
respect to both likelihood and prior distribution -- and outliers.
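As a rough illustration of the training criterion described in the abstract, the following is a minimal sketch assuming the generalized (t-)logarithm $\log_t(x) = (x^{1-t}-1)/(1-t)$ and an $m$-sample ensemble likelihood averaged over models drawn from an approximate posterior; the function names, the data-term scaling, and the omission of the KL regularizer are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def log_t(x, t):
    # Generalized (t-)logarithm; recovers the natural log as t -> 1.
    if np.isclose(t, 1.0):
        return np.log(x)
    return (x ** (1.0 - t) - 1.0) / (1.0 - t)

def robust_ensemble_data_term(likelihoods, t):
    # Hypothetical sketch, not the paper's exact criterion.
    # likelihoods: array of shape (n, m) holding p(x_i | theta_j) for m models
    # theta_1, ..., theta_m drawn from the (approximate) posterior.
    # The ensemble predictive averages over the m models, and the generalized
    # logarithm replaces the standard log score to limit the influence of outliers.
    ensemble = likelihoods.mean(axis=1)      # (1/m) * sum_j p(x_i | theta_j)
    return -log_t(ensemble, t).mean()        # negative generalized-log score, averaged over data

# Toy usage: n = 4 data points, m = 3 posterior samples; the third point is
# poorly explained by all models (e.g., an outlier).
toy = np.array([[0.90, 0.80, 0.85],
                [0.70, 0.75, 0.60],
                [0.01, 0.02, 0.015],
                [0.95, 0.90, 0.92]])
print(robust_ensemble_data_term(toy, t=0.9))
```

In a complete free energy objective this data term would be combined with a Kullback-Leibler regularizer between the posterior and the prior, as in standard PAC-Bayes criteria; the relative weighting used in the paper is not reproduced here.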
Related papers
- Theory on Score-Mismatched Diffusion Models and Zero-Shot Conditional Samplers [49.97755400231656]
We present the first performance guarantee with explicit dimensional dependencies for general score-mismatched diffusion samplers.
We show that score mismatches result in a distributional bias between the target and sampling distributions, proportional to the accumulated mismatch between the target and training distributions.
This result can be directly applied to zero-shot conditional samplers for any conditional model, irrespective of measurement noise.
arXiv Detail & Related papers (2024-10-17T16:42:12Z) - Bayesian vs. PAC-Bayesian Deep Neural Network Ensembles [7.883369697332076]
We argue that neither the sampling nor the weighting in a Bayes ensemble are particularly well-suited for increasing generalization performance.
We show that state-of-the-art Bayes ensembles from the literature, despite being computationally demanding, do not improve over simple uniformly weighted deep ensembles.
arXiv Detail & Related papers (2024-06-08T13:19:18Z) - Rejection via Learning Density Ratios [50.91522897152437]
Classification with rejection emerges as a learning paradigm which allows models to abstain from making predictions.
We propose a different distributional perspective, where we seek to find an idealized data distribution which maximizes a pretrained model's performance.
Our framework is tested empirically over clean and noisy datasets.
arXiv Detail & Related papers (2024-05-29T01:32:17Z) - Model-Based Uncertainty in Value Functions [89.31922008981735]
We focus on characterizing the variance over values induced by a distribution over MDPs.
Previous work upper bounds the posterior variance over values by solving a so-called uncertainty Bellman equation.
We propose a new uncertainty Bellman equation whose solution converges to the true posterior variance over values.
arXiv Detail & Related papers (2023-02-24T09:18:27Z) - Sequential prediction under log-loss and misspecification [47.66467420098395]
We consider the question of sequential prediction under the log-loss in terms of cumulative regret (the standard regret quantity is sketched after this list).
We show that cumulative regrets in the well-specified and misspecified cases coincide asymptotically.
We provide an $o(1)$ characterization of the distribution-free or PAC regret.
arXiv Detail & Related papers (2021-01-29T20:28:23Z) - PAC-Bayes Analysis Beyond the Usual Bounds [16.76187007910588]
We focus on a learning model where the learner observes a finite set of training examples.
The learned data-dependent distribution is then used to make randomized predictions.
arXiv Detail & Related papers (2020-06-23T14:30:24Z) - De-randomized PAC-Bayes Margin Bounds: Applications to Non-convex and
Non-smooth Predictors [21.59277717031637]
We present a family of de-randomized PAC-Bayes margin bounds for deterministic non-convex and non-smooth predictors, e.g., ReLU-nets.
We also present empirical results of our bounds over changing training set size and randomness in labels.
arXiv Detail & Related papers (2020-02-23T17:54:07Z) - Distributionally Robust Bayesian Quadrature Optimization [60.383252534861136]
We study BQO under distributional uncertainty in which the underlying probability distribution is unknown except for a limited set of its i.i.d. samples.
A standard BQO approach maximizes the Monte Carlo estimate of the true expected objective given the fixed sample set.
We propose a novel posterior sampling based algorithm, namely distributionally robust BQO (DRBQO) for this purpose.
arXiv Detail & Related papers (2020-01-19T12:00:33Z)
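For the sequential prediction entry above, the cumulative regret under log-loss takes, in its standard form, the shape below; this is a sketch with assumed notation, and the paper's precise worst-case and distribution-free (PAC) variants may be defined differently.
\[
\mathrm{Reg}_n(q) \;=\; \sum_{t=1}^{n} \log\frac{1}{q\left(X_t \mid X^{t-1}\right)} \;-\; \inf_{\theta \in \Theta} \sum_{t=1}^{n} \log\frac{1}{p_\theta\left(X_t \mid X^{t-1}\right)},
\]
where $q$ is the learner's sequential predictive distribution and $\{p_\theta\}_{\theta \in \Theta}$ is the reference model class; misspecification means the data-generating distribution need not belong to this class.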