Why resampling outperforms reweighting for correcting sampling bias with
stochastic gradients
- URL: http://arxiv.org/abs/2009.13447v3
- Date: Fri, 27 Aug 2021 16:28:01 GMT
- Title: Why resampling outperforms reweighting for correcting sampling bias with
stochastic gradients
- Authors: Jing An, Lexing Ying, Yuhua Zhu
- Abstract summary: Training machine learning models on biased data sets requires correction techniques to compensate for the bias.
We consider two commonly-used techniques, resampling and reweighting, that rebalance the proportions of the subgroups to maintain the desired objective function.
- Score: 10.860844636412862
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: A data set sampled from a certain population is biased if the subgroups of
the population are sampled at proportions that are significantly different from
their underlying proportions. Training machine learning models on biased data
sets requires correction techniques to compensate for the bias. We consider two
commonly-used techniques, resampling and reweighting, that rebalance the
proportions of the subgroups to maintain the desired objective function. Though
the two are statistically equivalent, resampling has been observed to outperform
reweighting when combined with stochastic gradient algorithms. By analyzing
illustrative examples, we explain the reason behind this phenomenon using tools
from dynamical stability and stochastic asymptotics. We also present
experiments from regression, classification, and off-policy prediction to
demonstrate that this is a general phenomenon. We argue that it is imperative
to consider the objective function design and the optimization algorithm
together when addressing sampling bias.
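To make the contrast concrete, here is a minimal numpy sketch (illustrative only; the one-parameter least-squares setup and names such as `reweight_step` are assumptions, not the paper's experiments) implementing both corrections for a two-subgroup problem. Both gradient estimators are unbiased for the same balanced objective, but reweighting multiplies rare-group gradients by a large factor, which is where the stability gap comes from.
```python
import numpy as np

rng = np.random.default_rng(0)

# Biased data: subgroup A is over-sampled (90%) relative to subgroup B (10%),
# but the target objective weights the two subgroups equally (50/50).
n_a, n_b = 9000, 1000
x_a = rng.normal(size=n_a); y_a = x_a + 0.1 * rng.normal(size=n_a)
x_b = rng.normal(size=n_b); y_b = -x_b + 0.1 * rng.normal(size=n_b)
groups = [(x_a, y_a), (x_b, y_b)]
empirical = np.array([0.9, 0.1])   # proportions in the biased data
target = np.array([0.5, 0.5])      # desired proportions

def grad(w, x, y):
    # Gradient of the squared loss 0.5*(w*x - y)^2 in the scalar parameter w.
    return (w * x - y) * x

def sgd(step, lr=0.05, iters=20000):
    w = 0.0
    for _ in range(iters):
        w -= lr * step(w)
    return w

def reweight_step(w):
    # Draw from the biased data, then scale the gradient by target/empirical;
    # rare-group gradients get multiplied by 5, inflating the step variance.
    g = rng.choice(2, p=empirical)
    x, y = groups[g]
    i = rng.integers(len(x))
    return (target[g] / empirical[g]) * grad(w, x[i], y[i])

def resample_step(w):
    # Draw the group at its *target* rate; no weight enters the gradient.
    g = rng.choice(2, p=target)
    x, y = groups[g]
    i = rng.integers(len(x))
    return grad(w, x[i], y[i])

# Both estimate the same balanced objective (optimum w = 0), but the
# reweighted iterates fluctuate more near the optimum.
print("reweighting:", sgd(reweight_step))
print("resampling: ", sgd(resample_step))
```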
Related papers
- Optimal Downsampling for Imbalanced Classification with Generalized Linear Models [6.14486033794703]
We study optimal downsampling for imbalanced classification using generalized linear models (GLMs).
We propose a pseudo-likelihood estimator and study its asymptotic normality in the context of increasingly imbalanced populations.
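As a hedged illustration of the baseline this work refines, the sketch below downsamples the majority class and applies the standard intercept (offset) correction for logistic regression; the paper's pseudo-likelihood estimator is a more refined alternative, and all constants here are made up.
```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Imbalanced binary data: roughly 2% positives.
n = 100_000
X = rng.normal(size=(n, 3))
logits = X @ np.array([1.0, -1.0, 0.5]) - 4.0
y = rng.random(n) < 1.0 / (1.0 + np.exp(-logits))

# Keep all positives, downsample negatives at rate r.
r = 0.05
keep = y | (rng.random(n) < r)
Xs, ys = X[keep], y[keep].astype(int)

clf = LogisticRegression(max_iter=1000).fit(Xs, ys)

# Downsampling negatives by r inflates the intercept by log(1/r);
# subtracting it back recovers calibrated probabilities.
intercept_corrected = clf.intercept_ - np.log(1.0 / r)
print(clf.coef_, intercept_corrected)
```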
arXiv Detail & Related papers (2024-10-11T17:08:13Z)
- NETS: A Non-Equilibrium Transport Sampler [15.58993313831079]
We propose an algorithm, termed the Non-Equilibrium Transport Sampler (NETS)
NETS can be viewed as a variant of annealed importance sampling (AIS) based on Jarzynski's equality, with the stochastic dynamics augmented by a learned drift term.
We show that this drift is the minimizer of a variety of objective functions, all of which can be estimated in an unbiased fashion.
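For orientation, here is a minimal sketch of the vanilla AIS scheme that NETS builds on, using unadjusted Langevin transitions and Jarzynski weight accumulation; the learned drift term that defines NETS itself is omitted, and the 1-D double-well target is an assumption.
```python
import numpy as np

rng = np.random.default_rng(0)

def u0(x):            # base energy: standard Gaussian
    return 0.5 * x**2
def u1(x):            # target energy: bimodal double well
    return 0.25 * x**4 - 2.0 * x**2
def grad_u(x, beta):  # gradient of the interpolated energy (1-b)*u0 + b*u1
    return (1 - beta) * x + beta * (x**3 - 4.0 * x)

n, steps, eps = 10_000, 200, 0.01
x = rng.normal(size=n)              # exact samples from the base
logw = np.zeros(n)                  # Jarzynski work accumulator
betas = np.linspace(0.0, 1.0, steps + 1)
for b_prev, b in zip(betas[:-1], betas[1:]):
    # Jarzynski increment: minus the energy change from switching b_prev -> b.
    logw += (b - b_prev) * (u0(x) - u1(x))
    # One unadjusted Langevin step targeting the new intermediate density.
    x = x - eps * grad_u(x, b) + np.sqrt(2 * eps) * rng.normal(size=n)

# Self-normalized importance weights for target expectations.
w = np.exp(logw - logw.max()); w /= w.sum()
print("E[x^2] under target ~", np.sum(w * x**2))
```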
arXiv Detail & Related papers (2024-10-03T17:35:38Z)
- Differentiable Pareto-Smoothed Weighting for High-Dimensional Heterogeneous Treatment Effect Estimation [0.6906005491572401]
We develop a numerically robust estimator based on weighted representation learning.
Our experimental results show that by effectively correcting the weight values, our proposed method outperforms the existing ones.
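As a crude, hedged stand-in for the paper's differentiable Pareto smoothing, the sketch below simply truncates inverse-propensity weights at a high quantile; both approaches tame the heavy right tail of the weight distribution at the cost of a small bias.
```python
import numpy as np

def truncate_weights(w, q=0.99):
    """Clip weights at their q-th quantile.

    A simple stand-in for Pareto smoothing: both stabilize
    heavy-tailed importance weights at the cost of a small bias.
    """
    cap = np.quantile(w, q)
    return np.minimum(w, cap)

# Example: inverse-propensity weights from estimated treatment probabilities.
rng = np.random.default_rng(0)
propensity = rng.uniform(0.01, 0.99, size=1000)
w = 1.0 / propensity
w_stable = truncate_weights(w)
print(w.max(), w_stable.max())
```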
arXiv Detail & Related papers (2024-04-26T15:34:04Z)
- Revisiting the Dataset Bias Problem from a Statistical Perspective [72.94990819287551]
We study the "dataset bias" problem from a statistical standpoint.
We identify the main cause of the problem as the strong correlation between a class attribute $u$ and a non-class attribute $b$.
We propose to mitigate dataset bias by either weighting the objective of each sample $n$ by $\frac{1}{p(u_n \mid b_n)}$ or sampling that sample with probability proportional to $\frac{1}{p(u_n \mid b_n)}$.
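A minimal sketch of this correction for discrete attributes, assuming $p(u_n \mid b_n)$ is estimated from empirical counts (the data-generating constants are illustrative):
```python
import numpy as np

rng = np.random.default_rng(0)

# Discrete class attribute u and non-class (bias) attribute b per sample;
# b agrees with u 90% of the time, i.e. they are strongly correlated.
n = 10_000
u = rng.integers(0, 2, size=n)
b = np.where(rng.random(n) < 0.9, u, 1 - u)

# Empirical conditional p(u | b) from a 2x2 contingency table.
counts = np.zeros((2, 2))
np.add.at(counts, (b, u), 1)
p_u_given_b = counts[b, u] / counts[b].sum(axis=1)

# Correction 1: per-sample loss weights 1 / p(u_n | b_n).
weights = 1.0 / p_u_given_b

# Correction 2: resampling with probability proportional to the same quantity.
probs = weights / weights.sum()
batch = rng.choice(n, size=128, p=probs)
```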
arXiv Detail & Related papers (2024-02-05T22:58:06Z)
- IBADR: an Iterative Bias-Aware Dataset Refinement Framework for Debiasing NLU models [52.03761198830643]
We propose IBADR, an Iterative Bias-Aware dataset Refinement framework.
We first train a shallow model to quantify the bias degree of samples in the pool.
Then, we pair each sample with a bias indicator representing its bias degree, and use these extended samples to train a sample generator.
In this way, the generator can effectively learn the correspondence between bias indicators and samples.
arXiv Detail & Related papers (2023-11-01T04:50:38Z)
- Boosting Differentiable Causal Discovery via Adaptive Sample Reweighting [62.23057729112182]
Differentiable score-based causal discovery methods learn a directed acyclic graph from observational data.
We propose ReScore, a model-agnostic framework that boosts causal discovery performance by dynamically learning adaptive sample weights for a reweighted score function.
arXiv Detail & Related papers (2023-03-06T14:49:59Z)
- Feature-Level Debiased Natural Language Understanding [86.8751772146264]
Existing natural language understanding (NLU) models often rely on dataset biases to achieve high performance on specific datasets.
We propose debiasing contrastive learning (DCT) to mitigate biased latent features while accounting for the dynamic nature of bias, which existing methods neglect.
DCT outperforms state-of-the-art baselines on out-of-distribution datasets while maintaining in-distribution performance.
arXiv Detail & Related papers (2022-12-11T06:16:14Z)
- Learning to Re-weight Examples with Optimal Transport for Imbalanced Classification [74.62203971625173]
Imbalanced data pose challenges for deep learning based classification models.
One of the most widely-used approaches for tackling imbalanced data is re-weighting.
We propose a novel re-weighting method based on optimal transport (OT) from a distributional point of view.
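The sketch below is a hedged, semi-relaxed entropic-OT toy version of this idea (not the paper's formulation): each point of a small balanced "meta" set spreads its mass over the imbalanced training set in proportion to a transport kernel, and the resulting row masses serve as per-example weights.
```python
import numpy as np

rng = np.random.default_rng(0)

# Imbalanced training set (95% class 0) and a small balanced meta set.
X_train = np.vstack([rng.normal(0, 1, (950, 2)), rng.normal(4, 1, (50, 2))])
X_meta = np.vstack([rng.normal(0, 1, (50, 2)), rng.normal(4, 1, (50, 2))])

# Pairwise squared-Euclidean cost, rescaled for numerical stability.
M = ((X_train[:, None, :] - X_meta[None, :, :]) ** 2).sum(-1)
M /= M.max()

# Semi-relaxed entropic OT: only the meta-side marginal is constrained.
# Each meta point's uniform mass is spread over training points in
# proportion to exp(-cost/reg); row sums become the learned weights.
reg = 0.05
K = np.exp(-M / reg)
P = K / K.sum(axis=0, keepdims=True) / len(X_meta)  # columns sum to 1/len(meta)
weights = P.sum(axis=1) * len(X_train)              # mean weight ~ 1

print(weights[:950].mean(), weights[950:].mean())   # minority class weighted up
```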
arXiv Detail & Related papers (2022-08-05T01:23:54Z)
- Mitigating Dataset Bias by Using Per-sample Gradient [9.290757451344673]
We propose PGD (Per-sample Gradient-based Debiasing), that comprises three steps: training a model on uniform batch sampling, setting the importance of each sample in proportion to the norm of the sample gradient, and training the model using importance-batch sampling.
Compared with existing baselines on various synthetic and real-world datasets, the proposed method achieves state-of-the-art accuracy for the classification task.
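A minimal numpy sketch of the three steps for logistic regression, where the per-sample gradient norm has the closed form $|\sigma(w^\top x_i) - y_i|\,\|x_i\|$ (the toy data and constants are assumptions, not the paper's setup):
```python
import numpy as np

rng = np.random.default_rng(0)

n, d = 5000, 2
X = rng.normal(size=(n, d))
y = (X[:, 0] > 0).astype(float)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train(w, sample_probs, lr=0.1, steps=2000, batch=64):
    # SGD on the logistic loss with batches drawn from sample_probs.
    for _ in range(steps):
        idx = rng.choice(n, size=batch, p=sample_probs)
        p = sigmoid(X[idx] @ w)
        w -= lr * X[idx].T @ (p - y[idx]) / batch
    return w

# Step 1: train with uniform batch sampling.
w = train(np.zeros(d), np.full(n, 1.0 / n))

# Step 2: per-sample gradient norm of the logistic loss, ||(p_i - y_i) x_i||.
g_norm = np.abs(sigmoid(X @ w) - y) * np.linalg.norm(X, axis=1)

# Step 3: retrain with importance-batch sampling proportional to the norms.
probs = g_norm / g_norm.sum()
w = train(np.zeros(d), probs)
```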
arXiv Detail & Related papers (2022-05-31T11:41:02Z)
- Heavy-tailed Streaming Statistical Estimation [58.70341336199497]
We consider the task of heavy-tailed statistical estimation given streaming $p$-dimensional samples.
We design a clipped stochastic gradient descent algorithm and provide an improved analysis under a more nuanced condition on the noise of the stochastic gradients.
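Gradient clipping itself is simple; the sketch below applies it to streaming mean estimation under heavy-tailed noise (the Student-t noise and the clipping threshold are illustrative, not the paper's setting):
```python
import numpy as np

rng = np.random.default_rng(0)

def clip(g, c):
    """Rescale the gradient so its norm never exceeds the threshold c."""
    norm = np.linalg.norm(g)
    return g if norm <= c else g * (c / norm)

# Streaming mean estimation with heavy-tailed Student-t samples.
true_mean = np.array([1.0, -2.0])
theta = np.zeros(2)
for t in range(1, 100_001):
    x = true_mean + rng.standard_t(2.5, size=2)  # one streaming sample
    g = theta - x                                # grad of 0.5*||theta - x||^2
    theta -= (1.0 / t) * clip(g, c=5.0)          # clipped SGD step

print(theta)  # close to true_mean despite the heavy-tailed noise
```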
arXiv Detail & Related papers (2021-08-25T21:30:27Z)
- Bayesian analysis of the prevalence bias: learning and predicting from imbalanced data [10.659348599372944]
This paper lays the theoretical and computational framework for training models, and for prediction, in the presence of prevalence bias.
It offers a principled alternative to heuristic training losses and complements test-time procedures based on selecting an operating point from summary curves.
It integrates seamlessly into the current paradigm of (deep) learning via backpropagation, and naturally with Bayesian models.
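One concrete consequence of modeling prevalence explicitly is the standard test-time prior-shift correction, sketched below (a textbook Bayes-rule adjustment, not the paper's full framework):
```python
import numpy as np

def correct_prevalence(probs, train_prior, test_prior):
    """Re-weight predicted class probabilities for a prevalence shift.

    Bayes' rule: p_test(y|x) is proportional to
    p_train(y|x) * pi_test(y) / pi_train(y).
    """
    adjusted = probs * (np.asarray(test_prior) / np.asarray(train_prior))
    return adjusted / adjusted.sum(axis=-1, keepdims=True)

# Model trained on 50/50 rebalanced data, deployed where prevalence is 2%.
probs = np.array([[0.3, 0.7]])  # p_train(y|x) for one case
print(correct_prevalence(probs, [0.5, 0.5], [0.98, 0.02]))
```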
arXiv Detail & Related papers (2021-07-31T14:36:33Z)