BiasEnsemble: Revisiting the Importance of Amplifying Bias for Debiasing
- URL: http://arxiv.org/abs/2205.14594v1
- Date: Sun, 29 May 2022 07:55:06 GMT
- Title: BiasEnsemble: Revisiting the Importance of Amplifying Bias for Debiasing
- Authors: Jungsoo Lee, Jeonghoon Park, Daeyoung Kim, Juyoung Lee, Edward Choi,
Jaegul Choo
- Abstract summary: "Debiasing" aims to train a classifier to be less susceptible to dataset bias.
$f_B$ is trained to focus on bias-aligned samples while $f_D$ is mainly trained with bias-conflicting samples.
We propose BiasEnsemble, a novel biased sample selection method that removes bias-conflicting samples when constructing the training set for $f_B$.
- Score: 31.665352191081357
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In image classification, "debiasing" aims to train a classifier to be less
susceptible to dataset bias, the strong correlation between peripheral
attributes of data samples and a target class. For example, even if the frog
class in the dataset mainly consists of frog images with a swamp background
(i.e., bias-aligned samples), a debiased classifier should be able to correctly
classify a frog at a beach (i.e., bias-conflicting samples). Recent debiasing
approaches commonly use two components for debiasing, a biased model $f_B$ and
a debiased model $f_D$. $f_B$ is trained to focus on bias-aligned samples while
$f_D$ is mainly trained with bias-conflicting samples by concentrating on
samples which $f_B$ fails to learn, leading $f_D$ to be less susceptible to the
dataset bias. While the state-of-the-art debiasing techniques have aimed to
better train $f_D$, we focus on training $f_B$, an overlooked component until
now. Our empirical analysis reveals that removing the bias-conflicting samples
from the training set for $f_B$ is important for improving the debiasing
performance of $f_D$. This is because the bias-conflicting samples act as
noisy samples for amplifying the bias of $f_B$. To this end, we
propose BiasEnsemble, a novel biased sample selection method that removes the
bias-conflicting samples by leveraging additional biased models to construct a
bias-amplified dataset for training $f_B$. Our simple yet effective approach
can be directly applied to existing reweighting-based debiasing approaches,
yielding consistent performance boosts and achieving state-of-the-art
performance on both synthetic and real-world datasets.
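As a rough illustration of this selection step, the sketch below (PyTorch) trains several biased models with a generalized cross-entropy (GCE) loss, a common choice in this literature for amplifying bias, and keeps only the samples that every model classifies correctly. The unanimous-agreement rule, hyperparameters, and helper names are illustrative assumptions, not the paper's reference implementation.

```python
import torch
from torch.utils.data import DataLoader, Subset

def generalized_ce(logits, targets, q=0.7):
    """Generalized cross-entropy (GCE) loss; it emphasizes easy samples,
    making a model latch onto bias-aligned patterns."""
    p_y = torch.softmax(logits, dim=1).gather(1, targets.unsqueeze(1)).squeeze(1)
    return ((1.0 - p_y.clamp_min(1e-8).pow(q)) / q).mean()

def build_bias_amplified_set(dataset, make_model, num_models=5, epochs=2):
    """Briefly train several biased models with GCE, then keep only the
    samples all of them classify correctly (treated as bias-aligned)."""
    loader = DataLoader(dataset, batch_size=64, shuffle=True)
    models = [make_model() for _ in range(num_models)]
    for model in models:
        opt = torch.optim.Adam(model.parameters(), lr=1e-3)
        for _ in range(epochs):
            for x, y in loader:
                opt.zero_grad()
                generalized_ce(model(x), y).backward()
                opt.step()
    keep = []
    with torch.no_grad():
        for m in models:
            m.eval()
        for i in range(len(dataset)):
            x, y = dataset[i]
            preds = [m(x.unsqueeze(0)).argmax(dim=1).item() for m in models]
            if all(p == int(y) for p in preds):  # unanimous agreement
                keep.append(i)
    return Subset(dataset, keep)  # bias-amplified training set for f_B
```

The filtered subset would then replace the full training set when training $f_B$ in a reweighting-based debiasing pipeline, leaving the training of $f_D$ unchanged.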
Related papers
- CosFairNet:A Parameter-Space based Approach for Bias Free Learning [1.9116784879310025]
Deep neural networks trained on biased data often inadvertently learn unintended inference rules.
We introduce a novel approach to address bias directly in the model's parameter space, preventing its propagation across layers.
We show enhanced classification accuracy and debiasing effectiveness across various synthetic and real-world datasets.
arXiv Detail & Related papers (2024-10-19T13:06:40Z)
- Revisiting the Dataset Bias Problem from a Statistical Perspective [72.94990819287551]
We study the "dataset bias" problem from a statistical standpoint.
We identify the main cause of the problem as the strong correlation between a class attribute u and a non-class attribute b.
We propose to mitigate dataset bias by either weighting the objective of each sample $n$ by $\frac{1}{p(u_n \mid b_n)}$ or by sampling each sample with a weight proportional to $\frac{1}{p(u_n \mid b_n)}$.
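A minimal sketch of the weighting variant (NumPy), assuming the class attribute $u$ and bias attribute $b$ are observed as integer labels and $p(u \mid b)$ is estimated by simple counting; the paper may estimate it differently:

```python
import numpy as np

def inverse_conditional_weights(class_labels, bias_labels, eps=1e-6):
    """Weight sample n by 1 / p(u_n | b_n), with p estimated from label counts."""
    u, b = np.asarray(class_labels), np.asarray(bias_labels)
    weights = np.empty(len(u), dtype=float)
    for bias_value in np.unique(b):
        mask = b == bias_value
        # Empirical p(u | b = bias_value) over samples sharing this bias attribute.
        classes, counts = np.unique(u[mask], return_counts=True)
        p_u_given_b = dict(zip(classes, counts / mask.sum()))
        for i in np.where(mask)[0]:
            weights[i] = 1.0 / (p_u_given_b[u[i]] + eps)
    return weights

# Rare (class, bias) combinations (a frog at a beach) receive large weights;
# dominant combinations (a frog in a swamp) receive small ones.
weights = inverse_conditional_weights([0, 0, 0, 1], [0, 0, 0, 0])
```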
arXiv Detail & Related papers (2024-02-05T22:58:06Z)
- IBADR: an Iterative Bias-Aware Dataset Refinement Framework for Debiasing NLU models [52.03761198830643]
We propose IBADR, an Iterative Bias-Aware dataset Refinement framework.
We first train a shallow model to quantify the bias degree of samples in the pool.
Then, we pair each sample with a bias indicator representing its bias degree, and use these extended samples to train a sample generator.
In this way, the generator can effectively learn the correspondence between bias indicators and samples.
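To make the pairing step concrete, here is a schematic sketch under two assumptions not stated in the summary: the bias degree is the shallow model's confidence in the gold label, and the indicator is a bucketed special token prepended to the input text.

```python
import torch

def attach_bias_indicators(samples, shallow_model, num_buckets=5):
    """Pair each (text, label) sample with a bias-indicator token derived
    from how confidently a shallow model predicts its gold label."""
    extended = []
    with torch.no_grad():
        for text, label in samples:
            logits = shallow_model(text)  # assumed: maps a text string to logits
            bias_degree = torch.softmax(logits, dim=-1)[label].item()
            bucket = min(int(bias_degree * num_buckets), num_buckets - 1)
            extended.append((f"<bias_{bucket}> {text}", label))
    return extended  # training data for an indicator-conditioned generator
```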
arXiv Detail & Related papers (2023-11-01T04:50:38Z)
- From Fake to Real: Pretraining on Balanced Synthetic Images to Prevent Spurious Correlations in Image Recognition [64.59093444558549]
We propose a simple, easy-to-implement, two-step training pipeline that we call From Fake to Real.
By training on real and synthetic data separately, FFR does not expose the model to the statistical differences between real and synthetic data.
Our experiments show that FFR improves worst-group accuracy over the state of the art by up to 20% across three datasets.
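A minimal two-phase sketch of such a pipeline (PyTorch), assuming a group-balanced synthetic loader already exists; the optimizer, schedule, and loss are illustrative choices rather than the paper's recipe:

```python
import torch

def train_from_fake_to_real(model, synthetic_loader, real_loader,
                            epochs_per_phase=5, lr=1e-3):
    """Phase 1: pretrain on group-balanced synthetic images.
    Phase 2: fine-tune on real images; the two are never mixed in a batch."""
    loss_fn = torch.nn.CrossEntropyLoss()
    for loader in (synthetic_loader, real_loader):
        opt = torch.optim.Adam(model.parameters(), lr=lr)
        for _ in range(epochs_per_phase):
            for x, y in loader:
                opt.zero_grad()
                loss_fn(model(x), y).backward()
                opt.step()
    return model
```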
arXiv Detail & Related papers (2023-08-08T19:52:28Z)
- Echoes: Unsupervised Debiasing via Pseudo-bias Labeling in an Echo Chamber [17.034228910493056]
This paper presents experimental analyses revealing that the existing biased models overfit to bias-conflicting samples in the training data.
We propose a straightforward and effective method called Echoes, which trains a biased model and a target model, each with a different training strategy.
Our approach achieves superior debiasing results compared to the existing baselines on both synthetic and real-world datasets.
arXiv Detail & Related papers (2023-05-06T13:13:18Z)
- Feature-Level Debiased Natural Language Understanding [86.8751772146264]
Existing natural language understanding (NLU) models often rely on dataset biases to achieve high performance on specific datasets.
We propose debiasing contrastive learning (DCT) to mitigate biased latent features and to capture the dynamic nature of bias, which prior methods neglect.
DCT outperforms state-of-the-art baselines on out-of-distribution datasets while maintaining in-distribution performance.
arXiv Detail & Related papers (2022-12-11T06:16:14Z)
- Bias Mimicking: A Simple Sampling Approach for Bias Mitigation [57.17709477668213]
We introduce a new class-conditioned sampling method: Bias Mimicking.
Bias Mimicking improves the accuracy of sampling methods on underrepresented groups by 3% over four benchmarks.
arXiv Detail & Related papers (2022-09-30T17:33:00Z)
- Learning Debiased Representation via Disentangled Feature Augmentation [19.348340314001756]
This paper presents an empirical analysis revealing that training with "diverse" bias-conflicting samples is crucial for debiasing.
We propose a novel feature-level data augmentation technique in order to synthesize diverse bias-conflicting samples.
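The core augmentation can be sketched as a feature swap, assuming the representation is already disentangled into intrinsic and bias parts; how that disentanglement is learned is the substance of the paper and is omitted here:

```python
import torch

def swap_bias_features(z_intrinsic, z_bias):
    """Recombine each sample's intrinsic features with another random sample's
    bias features, synthesizing bias-conflicting feature pairs."""
    perm = torch.randperm(z_bias.size(0))
    return torch.cat([z_intrinsic, z_bias[perm]], dim=1)

# z_intrinsic, z_bias: (batch, d) outputs of two disentangled encoders.
augmented = swap_bias_features(torch.randn(8, 16), torch.randn(8, 16))
```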
arXiv Detail & Related papers (2021-07-03T08:03:25Z)
- Towards Robustifying NLI Models Against Lexical Dataset Biases [94.79704960296108]
This paper explores both data-level and model-level debiasing methods to robustify models against lexical dataset biases.
First, we debias the dataset through data augmentation and enhancement, but show that the model bias cannot be fully removed via this method.
The second approach employs a bag-of-words sub-model to capture the features that are likely to exploit the bias and prevents the original model from learning these biased features.
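One common way to realize this, sketched here under the assumption of an additive logit combination (which may differ from the paper's exact mechanism): train the main model jointly with the bag-of-words sub-model so lexical shortcuts are absorbed by the sub-model, then discard the sub-model at test time.

```python
import torch
import torch.nn as nn

class BiasAwareEnsemble(nn.Module):
    """Main model trained jointly with a shallow bag-of-words sub-model whose
    logits absorb lexical shortcuts; the sub-model is dropped at inference."""
    def __init__(self, main_model, bow_model):
        super().__init__()
        self.main_model = main_model  # full NLI model
        self.bow_model = bow_model    # shallow bag-of-words classifier

    def forward(self, x_main, x_bow):
        # Train with cross-entropy on the combined logits; at test time,
        # call self.main_model(x_main) alone.
        return self.main_model(x_main) + self.bow_model(x_bow)
```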
arXiv Detail & Related papers (2020-05-10T17:56:10Z)