Learning from a Biased Sample
- URL: http://arxiv.org/abs/2209.01754v1
- Date: Mon, 5 Sep 2022 04:19:16 GMT
- Title: Learning from a Biased Sample
- Authors: Roshni Sahoo, Lihua Lei, Stefan Wager
- Abstract summary: We propose a method for learning a decision rule that minimizes the worst-case risk incurred under a family of test distributions.
We give statistical guarantees for learning a robust model using the method of sieves and propose a deep learning algorithm whose loss function captures our target.
- Score: 5.162622771922123
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The empirical risk minimization approach to data-driven decision making
assumes that we can learn a decision rule from training data drawn under the
same conditions as the ones we want to deploy it under. However, in a number of
settings, we may be concerned that our training sample is biased, and that some
groups (characterized by either observable or unobservable attributes) may be
under- or over-represented relative to the general population; and in this
setting empirical risk minimization over the training set may fail to yield
rules that perform well at deployment. Building on concepts from
distributionally robust optimization and sensitivity analysis, we propose a
method for learning a decision rule that minimizes the worst-case risk incurred
under a family of test distributions whose conditional distributions of
outcomes $Y$ given covariates $X$ differ from the conditional training
distribution by at most a constant factor, and whose covariate distributions
are absolutely continuous with respect to the covariate distribution of the
training data. We apply a result of Rockafellar and Uryasev to show that this
problem is equivalent to an augmented convex risk minimization problem. We give
statistical guarantees for learning a robust model using the method of sieves
and propose a deep learning algorithm whose loss function captures our
robustness target. We empirically validate our proposed method in simulations
and a case study with the MIMIC-III dataset.
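The abstract invokes a result of Rockafellar and Uryasev to turn the worst-case risk into an augmented convex minimization. The classic instance of that duality is the CVaR reformulation: when test distributions may reweight the training distribution by a likelihood ratio of at most 1/alpha, the worst-case expected loss equals min over eta of eta + E[(loss - eta)_+]/alpha. The sketch below is illustrative only, not the paper's exact augmented objective (which bounds the conditional shift of Y given X); the function name and the grid search over eta are choices made here for clarity.

```python
import numpy as np

def cvar_rockafellar_uryasev(losses, alpha):
    """Worst-case expected loss over reweightings whose likelihood
    ratio is bounded by 1/alpha, via the Rockafellar-Uryasev formula:
        CVaR_alpha = min_eta  eta + E[(loss - eta)_+] / alpha.
    The minimum over eta is attained at a quantile of the loss
    distribution, so searching the observed loss values suffices."""
    losses = np.asarray(losses, dtype=float)
    objective = lambda eta: eta + np.mean(np.maximum(losses - eta, 0.0)) / alpha
    return min(objective(eta) for eta in np.unique(losses))

# Sanity check: with alpha = k/n, CVaR is the mean of the k largest losses.
losses = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0])
robust_risk = cvar_rockafellar_uryasev(losses, alpha=0.2)
# mean of the two largest losses, i.e. 9.5 (up to float rounding)
```

Because the reformulated objective is a plain expectation over the training sample, it can be minimized with stochastic gradients, which is what makes the deep learning algorithm mentioned in the abstract feasible.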
Related papers
- Distributionally Robust Skeleton Learning of Discrete Bayesian Networks [9.46389554092506]
We consider the problem of learning the exact skeleton of general discrete Bayesian networks from potentially corrupted data.
We propose to optimize the most adverse risk over a family of distributions within bounded Wasserstein distance or KL divergence to the empirical distribution.
We present efficient algorithms and show the proposed methods are closely related to the standard regularized regression approach.
arXiv Detail & Related papers (2023-11-10T15:33:19Z)
- Conformal Inference for Invariant Risk Minimization [12.049545417799125]
The application of machine learning models can be significantly impeded by the occurrence of distributional shifts.
One way to tackle this problem is to use invariant learning, such as invariant risk minimization (IRM), to acquire an invariant representation.
This paper develops methods for obtaining distribution-free prediction regions to describe uncertainty estimates for invariant representations.
arXiv Detail & Related papers (2023-05-22T03:48:38Z)
- Distributionally Robust Models with Parametric Likelihood Ratios [123.05074253513935]
Three simple ideas allow us to train models with DRO using a broader class of parametric likelihood ratios.
We find that models trained with the resulting parametric adversaries are consistently more robust to subpopulation shifts when compared to other DRO approaches.
arXiv Detail & Related papers (2022-04-13T12:43:12Z)
- Risk Consistent Multi-Class Learning from Label Proportions [64.0125322353281]
This study addresses a multiclass learning from label proportions (MCLLP) setting in which training instances are provided in bags.
Most existing MCLLP methods impose bag-wise constraints on the prediction of instances or assign them pseudo-labels.
A risk-consistent method is proposed for instance classification using the empirical risk minimization framework.
arXiv Detail & Related papers (2022-03-24T03:49:04Z)
- Approximate Regions of Attraction in Learning with Decision-Dependent Distributions [11.304363655760513]
We analyze repeated risk minimization as the trajectories of the gradient flows of performative risk minimization.
We provide conditions to characterize the region of attraction for the various equilibria in this setting.
We introduce the notion of performative alignment, which provides a geometric condition on the convergence of repeated risk minimization to performative risk minimizers.
arXiv Detail & Related papers (2021-06-30T18:38:08Z)
- Distributional Reinforcement Learning via Moment Matching [54.16108052278444]
We formulate a method that learns a finite set of statistics from each return distribution via neural networks.
Our method can be interpreted as implicitly matching all orders of moments between a return distribution and its Bellman target.
Experiments on the suite of Atari games show that our method outperforms the standard distributional RL baselines.
arXiv Detail & Related papers (2020-07-24T05:18:17Z)
- A One-step Approach to Covariate Shift Adaptation [82.01909503235385]
A default assumption in many machine learning scenarios is that the training and test samples are drawn from the same probability distribution.
We propose a novel one-step approach that jointly learns the predictive model and the associated weights in one optimization.
arXiv Detail & Related papers (2020-07-08T11:35:47Z)
- Learning while Respecting Privacy and Robustness to Distributional Uncertainties and Adversarial Data [66.78671826743884]
The distributionally robust optimization framework is considered for training a parametric model.
The objective is to endow the trained model with robustness against adversarially manipulated input data.
Proposed algorithms offer robustness with little overhead.
arXiv Detail & Related papers (2020-07-07T18:25:25Z)
- Learning Diverse Representations for Fast Adaptation to Distribution Shift [78.83747601814669]
We present a method for learning multiple models, incorporating an objective that pressures each to learn a distinct way to solve the task.
We demonstrate our framework's ability to facilitate rapid adaptation to distribution shift.
arXiv Detail & Related papers (2020-06-12T12:23:50Z)
- Principled learning method for Wasserstein distributionally robust optimization with local perturbations [21.611525306059985]
Wasserstein distributionally robust optimization (WDRO) attempts to learn a model that minimizes the local worst-case risk in the vicinity of the empirical data distribution.
We propose a minimizer based on a novel approximation theorem and provide the corresponding risk consistency results.
Our results show that the proposed method achieves significantly higher accuracy than baseline models on noisy datasets.
arXiv Detail & Related papers (2020-06-05T09:32:37Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences of its use.