Decision Theoretic Bootstrapping
- URL: http://arxiv.org/abs/2103.09982v1
- Date: Thu, 18 Mar 2021 02:00:24 GMT
- Title: Decision Theoretic Bootstrapping
- Authors: Peyman Tavallali, Hamed Hamze Bajgiran, Danial J. Esaid, Houman Owhadi
- Abstract summary: The design and testing of supervised machine learning models combine two fundamental distributions: the training data distribution and the testing data distribution.
We present a general decision-theoretic bootstrapping solution to this problem.
- Score: 1.988145627448243
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The design and testing of supervised machine learning models combine two
fundamental distributions: (1) the training data distribution and (2) the
testing data distribution. Although these two distributions are identical and
identifiable when the data set is infinite, they are imperfectly known (and
possibly distinct) when the data is finite (and possibly corrupted), and this
uncertainty must be taken into account for robust Uncertainty Quantification
(UQ). We present a general decision-theoretic bootstrapping solution to this
problem: (1) partition the available data into a training subset and a UQ
subset (2) take $m$ subsampled subsets of the training set and train $m$ models
(3) partition the UQ set into $n$ sorted subsets and take a random fraction of
them to define $n$ corresponding empirical distributions $\mu_{j}$ (4) consider
the adversarial game where Player I selects a model $i\in\left\{
1,\ldots,m\right\} $, Player II selects the UQ distribution $\mu_{j}$ and
Player I receives a loss defined by evaluating the model $i$ against data
points sampled from $\mu_{j}$ (5) identify optimal mixed strategies
(probability distributions over models and UQ distributions) for both players.
These randomized optimal mixed strategies provide optimal model mixtures and UQ
estimates given the adversarial uncertainty of the training and testing
distributions represented by the game. The proposed approach provides (1) some
degree of robustness to distributional shift in both the distribution of
training data and that of the testing data (2) conditional probability
distributions on the output space forming aleatory representations of the
uncertainty on the output as a function of the input variable.
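The five-step procedure above culminates in solving a finite zero-sum matrix game between the $m$ trained models and the $n$ UQ distributions. A minimal sketch of steps (4)-(5), assuming the per-model, per-distribution losses have already been evaluated into a loss matrix; the `fictitious_play` solver and the toy numbers are illustrative stand-ins, not code from the paper:

```python
# Sketch of steps (4)-(5): approximate the optimal mixed strategies of the
# adversarial game by fictitious play. loss[i][j] stands in for the empirical
# loss of model i evaluated on data points sampled from UQ distribution mu_j.

def fictitious_play(loss, iters=20000):
    """Player I (rows, models) minimizes; Player II (columns, UQ sets) maximizes.
    Returns approximate optimal mixtures p (over models), q (over UQ
    distributions), and the game value."""
    m, n = len(loss), len(loss[0])
    row_counts = [0] * m        # how often each model was a best response
    col_counts = [0] * n        # how often each UQ distribution was a best response
    row_play, col_play = 0, 0   # current pure best responses
    for _ in range(iters):
        row_counts[row_play] += 1
        col_counts[col_play] += 1
        # Player I best-responds (minimum loss) to Player II's empirical mixture.
        row_payoff = [sum(loss[i][j] * col_counts[j] for j in range(n))
                      for i in range(m)]
        row_play = min(range(m), key=row_payoff.__getitem__)
        # Player II best-responds (maximum loss) to Player I's empirical mixture.
        col_payoff = [sum(loss[i][j] * row_counts[i] for i in range(m))
                      for j in range(n)]
        col_play = max(range(n), key=col_payoff.__getitem__)
    p = [c / iters for c in row_counts]   # optimal mixture over the m models
    q = [c / iters for c in col_counts]   # optimal mixture over the n UQ sets
    value = sum(p[i] * loss[i][j] * q[j]
                for i in range(m) for j in range(n))
    return p, q, value

# Toy 2x2 game: each model does well on one UQ subset and poorly on the other,
# so the optimal strategy hedges with a mixture rather than a single model.
loss = [[0.9, 0.1],
        [0.2, 0.8]]
p, q, v = fictitious_play(loss)  # p ~ [3/7, 4/7], q ~ [1/2, 1/2], v ~ 0.5
```

The resulting `p` is the optimal model mixture of step (5); the game value `v` is the loss guarantee under the adversarial uncertainty the game encodes. In practice one would solve the matrix game exactly by linear programming rather than by fictitious play.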
Related papers
- Rejection via Learning Density Ratios [50.91522897152437]
Classification with rejection emerges as a learning paradigm which allows models to abstain from making predictions.
We propose a different distributional perspective, where we seek to find an idealized data distribution which maximizes a pretrained model's performance.
Our framework is tested empirically over clean and noisy datasets.
arXiv Detail & Related papers (2024-05-29T01:32:17Z)
- Universal Batch Learning Under The Misspecification Setting [4.772817128620037]
We consider the problem of universal batch learning in a misspecification setting with log-loss.
We derive the optimal universal learner, a mixture over the set of the data generating distributions, and get a closed form expression for the min-max regret.
arXiv Detail & Related papers (2024-05-12T11:16:05Z)
- Probabilistic Contrastive Learning for Long-Tailed Visual Recognition [78.70453964041718]
Long-tailed distributions frequently emerge in real-world data, where a large number of minority categories contain a limited number of samples.
Recent investigations have revealed that supervised contrastive learning exhibits promising potential in alleviating the data imbalance.
We propose a novel probabilistic contrastive (ProCo) learning algorithm that estimates the data distribution of the samples from each class in the feature space.
arXiv Detail & Related papers (2024-03-11T13:44:49Z)
- Training Implicit Generative Models via an Invariant Statistical Loss [3.139474253994318]
Implicit generative models have the capability to learn arbitrary complex data distributions.
On the downside, training requires telling apart real data from artificially-generated ones using adversarial discriminators.
We develop a discriminator-free method for training one-dimensional (1D) generative implicit models.
arXiv Detail & Related papers (2024-02-26T09:32:28Z)
- Testable Learning with Distribution Shift [9.036777309376697]
We define a new model called testable learning with distribution shift.
We obtain provably efficient algorithms for certifying the performance of a classifier on a test distribution.
We give several positive results for learning concept classes such as halfspaces, intersections of halfspaces, and decision trees.
arXiv Detail & Related papers (2023-11-25T23:57:45Z)
- Dr. FERMI: A Stochastic Distributionally Robust Fair Empirical Risk Minimization Framework [12.734559823650887]
In the presence of distribution shifts, fair machine learning models may behave unfairly on test data.
Existing algorithms require full access to the data and cannot be used when only small batches are available.
This paper proposes the first distributionally robust fairness framework with convergence guarantees that do not require knowledge of the causal graph.
arXiv Detail & Related papers (2023-09-20T23:25:28Z)
- Distribution Shift Inversion for Out-of-Distribution Prediction [57.22301285120695]
We propose a portable Distribution Shift Inversion algorithm for Out-of-Distribution (OoD) prediction.
We show that our method provides a general performance gain when plugged into a wide range of commonly used OoD algorithms.
arXiv Detail & Related papers (2023-06-14T08:00:49Z)
- Diagnosing Model Performance Under Distribution Shift [9.143551270841858]
Prediction models can perform poorly when deployed to target distributions different from the training distribution.
Our approach decomposes the performance drop into terms for 1) an increase in harder but frequently seen examples from training, 2) changes in the relationship between features and outcomes, and 3) poor performance on examples infrequent or unseen during training.
arXiv Detail & Related papers (2023-03-03T15:27:16Z)
- Stochastic Approximation Approaches to Group Distributionally Robust Optimization and Beyond [89.72693227960274]
This paper investigates group distributionally robust optimization (GDRO) with the goal of learning a model that performs well over $m$ different distributions.
To reduce the number of samples in each round from $m$ to 1, we cast GDRO as a two-player game, where one player conducts stochastic mirror descent and the other executes an online algorithm for non-oblivious multi-armed bandits.
In the second scenario, we propose to optimize the average top-$k$ risk instead of the maximum risk, thereby mitigating the sensitivity to the single worst-performing distribution.
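The average top-$k$ risk mentioned above is simple to state concretely: instead of penalizing only the single largest per-distribution risk, the objective averages the $k$ largest ones. A small illustrative computation (the numbers are made up; any per-distribution loss estimates would do):

```python
# Illustrative average top-k risk over m per-distribution risks: a softer
# objective than the maximum risk, since no single distribution dominates it.

def average_top_k_risk(risks, k):
    """Mean of the k largest risks; k = 1 recovers max(risks)."""
    return sum(sorted(risks, reverse=True)[:k]) / k

risks = [0.30, 0.10, 0.95, 0.25]          # hypothetical risks on m = 4 distributions
max_risk = max(risks)                     # 0.95: the standard GDRO objective
topk = average_top_k_risk(risks, k=2)     # (0.95 + 0.30) / 2 = 0.625
```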
arXiv Detail & Related papers (2023-02-18T09:24:15Z)
- Distributional Reinforcement Learning via Moment Matching [54.16108052278444]
We formulate a method that learns a finite set of statistics from each return distribution via neural networks.
Our method can be interpreted as implicitly matching all orders of moments between a return distribution and its Bellman target.
Experiments on the suite of Atari games show that our method outperforms the standard distributional RL baselines.
arXiv Detail & Related papers (2020-07-24T05:18:17Z)
- Learning while Respecting Privacy and Robustness to Distributional Uncertainties and Adversarial Data [66.78671826743884]
The distributionally robust optimization framework is considered for training a parametric model.
The objective is to endow the trained model with robustness against adversarially manipulated input data.
Proposed algorithms offer robustness with little overhead.
arXiv Detail & Related papers (2020-07-07T18:25:25Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.