A Note on High-Probability versus In-Expectation Guarantees of
Generalization Bounds in Machine Learning
- URL: http://arxiv.org/abs/2010.02576v1
- Date: Tue, 6 Oct 2020 09:41:35 GMT
- Title: A Note on High-Probability versus In-Expectation Guarantees of
Generalization Bounds in Machine Learning
- Authors: Alexander Mey
- Abstract summary: Statistical machine learning theory often tries to give generalization guarantees of machine learning models.
Statements made about the performance of machine learning models have to take the sampling process into account.
We show how one may transform one type of statement into the other.
- Score: 95.48744259567837
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Statistical machine learning theory often tries to give generalization
guarantees of machine learning models. Those models are naturally subject to
some fluctuation, as they are based on a data sample. If we are unlucky and
gather a sample that is not representative of the underlying distribution,
we cannot expect to construct a reliable machine learning model. Following
that, statements made about the performance of machine learning models have to
take the sampling process into account. The two common approaches for that are
to generate statements that hold either in high-probability, or in-expectation,
over the random sampling process. In this short note we show how one may
transform one type of statement into the other. As a technical novelty, we
address the case of unbounded loss functions, where we use a fairly new
assumption called the witness condition.
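As a concrete illustration of the abstract's claim (the notation $\Delta$, $B$, $\delta$ is ours, not the paper's), one direction of the transformation, from an in-expectation bound to a high-probability bound, follows from Markov's inequality for a nonnegative generalization gap, while the reverse direction integrates the tail probability:

```latex
% Sketch, assuming a nonnegative generalization gap \Delta(S) \ge 0
% over the random sample S. From in-expectation to high-probability,
% via Markov's inequality:
\mathbb{E}_S\!\left[\Delta(S)\right] \le B
\quad\Longrightarrow\quad
\Pr_S\!\left[\Delta(S) \ge \tfrac{B}{\delta}\right] \le \delta
\qquad \text{for any } \delta \in (0,1].
% Conversely, a high-probability tail bound can be integrated
% to recover an in-expectation bound:
\mathbb{E}_S\!\left[\Delta(S)\right]
= \int_0^{\infty} \Pr_S\!\left[\Delta(S) \ge t\right]\, dt.
```

The paper's contribution for unbounded losses (the witness condition) refines this kind of argument; the display above is only the textbook starting point.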
Related papers
- Estimating the Probabilities of Rare Outputs in Language Models [8.585890569162267]
We study low probability estimation in the context of argmax sampling from small transformer language models.
We find that importance sampling outperforms activation extrapolation, but both outperform naive sampling.
We argue that new methods for low probability estimation are needed to provide stronger guarantees about worst-case performance.
arXiv Detail & Related papers (2024-10-17T04:31:18Z)
- Fair Generalized Linear Mixed Models [0.0]
Fairness in machine learning aims to ensure that biases in the data and model inaccuracies do not lead to discriminatory decisions.
We present an algorithm that can handle both problems simultaneously.
arXiv Detail & Related papers (2024-05-15T11:42:41Z)
- SimPro: A Simple Probabilistic Framework Towards Realistic Long-Tailed Semi-Supervised Learning [49.94607673097326]
We propose a highly adaptable framework, designated as SimPro, which does not rely on any predefined assumptions about the distribution of unlabeled data.
Our framework, grounded in a probabilistic model, innovatively refines the expectation-maximization algorithm.
Our method showcases consistent state-of-the-art performance across diverse benchmarks and data distribution scenarios.
arXiv Detail & Related papers (2024-02-21T03:39:04Z)
- User-defined Event Sampling and Uncertainty Quantification in Diffusion Models for Physical Dynamical Systems [49.75149094527068]
We show that diffusion models can be adapted to make predictions and provide uncertainty quantification for chaotic dynamical systems.
We develop a probabilistic approximation scheme for the conditional score function which converges to the true distribution as the noise level decreases.
We are able to sample conditionally on nonlinear user-defined events at inference time, and the samples match data statistics even when drawn from the tails of the distribution.
arXiv Detail & Related papers (2023-06-13T03:42:03Z)
- Missing Value Knockoffs [0.0]
A recently introduced framework, model-x knockoffs, provides that to a wide range of models but lacks support for datasets with missing values.
We show that posterior sampled imputation allows reusing existing knockoff samplers in the presence of missing values.
We also demonstrate how jointly imputing and sampling knockoffs can reduce the computational complexity.
arXiv Detail & Related papers (2022-02-26T04:05:31Z)
- Beyond Trivial Counterfactual Explanations with Diverse Valuable Explanations [64.85696493596821]
In computer vision applications, generative counterfactual methods indicate how to perturb a model's input to change its prediction.
We propose a counterfactual method that learns a perturbation in a disentangled latent space that is constrained using a diversity-enforcing loss.
Our model improves the success rate of producing high-quality valuable explanations when compared to previous state-of-the-art methods.
arXiv Detail & Related papers (2021-03-18T12:57:34Z)
- Identifying Wrongly Predicted Samples: A Method for Active Learning [6.976600214375139]
We propose a simple sample selection criterion that moves beyond uncertainty.
We show state-of-the-art results and better rates at identifying wrongly predicted samples.
arXiv Detail & Related papers (2020-10-14T09:00:42Z)
- Understanding Classifier Mistakes with Generative Models [88.20470690631372]
Deep neural networks are effective on supervised learning tasks, but have been shown to be brittle.
In this paper, we leverage generative models to identify and characterize instances where classifiers fail to generalize.
Our approach is agnostic to class labels from the training set which makes it applicable to models trained in a semi-supervised way.
arXiv Detail & Related papers (2020-10-05T22:13:21Z)
- Good Classifiers are Abundant in the Interpolating Regime [64.72044662855612]
We develop a methodology to compute precisely the full distribution of test errors among interpolating classifiers.
We find that test errors tend to concentrate around a small typical value $\varepsilon^*$, which deviates substantially from the test error of the worst-case interpolating model.
Our results show that the usual style of analysis in statistical learning theory may not be fine-grained enough to capture the good generalization performance observed in practice.
arXiv Detail & Related papers (2020-06-22T21:12:31Z)
- Query Training: Learning a Worse Model to Infer Better Marginals in Undirected Graphical Models with Hidden Variables [11.985433487639403]
Probabilistic graphical models (PGMs) provide a compact representation of knowledge that can be queried in a flexible way.
We introduce query training (QT), a mechanism to learn a PGM that is optimized for the approximate inference algorithm that will be paired with it.
We demonstrate experimentally that QT can be used to learn a challenging 8-connected grid Markov random field with hidden variables.
arXiv Detail & Related papers (2020-06-11T20:34:32Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences.