Unbiased Supervised Contrastive Learning
- URL: http://arxiv.org/abs/2211.05568v4
- Date: Thu, 4 May 2023 08:56:16 GMT
- Title: Unbiased Supervised Contrastive Learning
- Authors: Carlo Alberto Barbano, Benoit Dufumier, Enzo Tartaglione, Marco Grangetto, Pietro Gori
- Abstract summary: In this work, we tackle the problem of learning representations that are robust to biases.
We first present a margin-based theoretical framework that allows us to clarify why recent contrastive losses can fail when dealing with biased data.
We derive a novel formulation of the supervised contrastive loss (epsilon-SupInfoNCE), providing more accurate control of the minimal distance between positive and negative samples.
Thanks to our theoretical framework, we also propose FairKL, a new debiasing regularization loss, that works well even with extremely biased data.
- Score: 10.728852691100338
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Many datasets are biased, namely they contain easy-to-learn features that are
highly correlated with the target class only in the dataset but not in the true
underlying distribution of the data. For this reason, learning unbiased models
from biased data has become a very relevant research topic in recent years.
In this work, we tackle the problem of learning representations that are robust
to biases. We first present a margin-based theoretical framework that allows us
to clarify why recent contrastive losses (InfoNCE, SupCon, etc.) can fail when
dealing with biased data. Based on that, we derive a novel formulation of the
supervised contrastive loss (epsilon-SupInfoNCE), providing more accurate
control of the minimal distance between positive and negative samples.
Furthermore, thanks to our theoretical framework, we also propose FairKL, a new
debiasing regularization loss, that works well even with extremely biased data.
We validate the proposed losses on standard vision datasets including CIFAR10,
CIFAR100, and ImageNet, and we assess the debiasing capability of FairKL with
epsilon-SupInfoNCE, reaching state-of-the-art performance on a number of biased
datasets, including real instances of biases in the wild.
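The abstract describes a supervised contrastive loss with an explicit margin between positive and negative similarities but does not give its formula. The sketch below is one plausible way such a margin can enter an InfoNCE-style loss, written only to illustrate the idea: each positive similarity competes against a copy of itself shifted down by epsilon, plus every negative similarity. The function name `margin_sup_infonce`, the `temperature` parameter, the use of cosine similarity, and the exact placement of `epsilon` are all assumptions, not the authors' published formulation.

```python
import math


def margin_sup_infonce(anchor, positives, negatives, epsilon=0.1, temperature=0.1):
    """Illustrative margin-based supervised contrastive loss for one anchor.

    Assumed form (hypothetical, for illustration only):
        loss = mean over positives of
               -log( exp(s_pos) / (exp(s_pos - epsilon) + sum_j exp(s_neg_j)) )
    where s = cosine_similarity / temperature.
    """

    def cosine(u, v):
        dot = sum(a * b for a, b in zip(u, v))
        norm_u = math.sqrt(sum(a * a for a in u))
        norm_v = math.sqrt(sum(b * b for b in v))
        return dot / (norm_u * norm_v)

    neg_sims = [cosine(anchor, n) / temperature for n in negatives]
    total = 0.0
    for p in positives:
        s_pos = cosine(anchor, p) / temperature
        # Denominator terms: the margin-shifted positive plus all negatives.
        terms = [s_pos - epsilon] + neg_sims
        # Numerically stable log-sum-exp of the denominator terms.
        m = max(terms)
        log_denominator = m + math.log(sum(math.exp(t - m) for t in terms))
        total += log_denominator - s_pos
    return total / len(positives)


# Well-separated negatives give a lower loss than negatives that collapse
# onto the anchor.
well_separated = margin_sup_infonce([1.0, 0.0], [[1.0, 0.0]], [[-1.0, 0.0]])
collapsed = margin_sup_infonce([1.0, 0.0], [[1.0, 0.0]], [[1.0, 0.0]])
```

Under this assumed form, the loss approaches `-epsilon` once every negative similarity sits far below the positive one, so `epsilon` sets how much separation the loss keeps rewarding.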
Related papers
- Model Debiasing by Learnable Data Augmentation [19.625915578646758]
This paper proposes a novel 2-stage learning pipeline featuring a data augmentation strategy able to regularize the training.
Experiments on synthetic and realistic biased datasets show state-of-the-art classification accuracy, outperforming competing methods.
arXiv Detail & Related papers (2024-08-09T09:19:59Z)
- AIM: Attributing, Interpreting, Mitigating Data Unfairness [40.351282126410545]
Existing fair machine learning (FairML) research has predominantly focused on mitigating discriminative bias in the model prediction.
We investigate a novel research problem: discovering samples that reflect biases/prejudices from the training data.
We propose practical algorithms for measuring and countering sample bias.
arXiv Detail & Related papers (2024-06-13T05:21:10Z)
- Revisiting the Dataset Bias Problem from a Statistical Perspective [72.94990819287551]
We study the "dataset bias" problem from a statistical standpoint.
We identify the main cause of the problem as the strong correlation between a class attribute u and a non-class attribute b.
We propose to mitigate dataset bias either by weighting the objective of each sample n by 1/p(u_n|b_n) or by sampling that sample with a weight proportional to 1/p(u_n|b_n).
arXiv Detail & Related papers (2024-02-05T22:58:06Z)
- Marginal Debiased Network for Fair Visual Recognition [59.05212866862219]
We propose a novel marginal debiased network (MDN) to learn debiased representations.
Our MDN can achieve remarkable performance on under-represented samples.
arXiv Detail & Related papers (2024-01-04T08:57:09Z)
- Feature-Level Debiased Natural Language Understanding [86.8751772146264]
Existing natural language understanding (NLU) models often rely on dataset biases to achieve high performance on specific datasets.
We propose debiasing contrastive learning (DCT) to mitigate biased latent features and account for the dynamic nature of bias.
DCT outperforms state-of-the-art baselines on out-of-distribution datasets while maintaining in-distribution performance.
arXiv Detail & Related papers (2022-12-11T06:16:14Z)
- D-BIAS: A Causality-Based Human-in-the-Loop System for Tackling Algorithmic Bias [57.87117733071416]
We propose D-BIAS, a visual interactive tool that embodies a human-in-the-loop AI approach for auditing and mitigating social biases.
A user can detect the presence of bias against a group by identifying unfair causal relationships in the causal network.
For each interaction, say weakening/deleting a biased causal edge, the system uses a novel method to simulate a new (debiased) dataset.
arXiv Detail & Related papers (2022-08-10T03:41:48Z)
- Unsupervised Learning of Unbiased Visual Representations [10.871587311621974]
Deep neural networks are known for their inability to learn robust representations when biases exist in the dataset.
We propose a fully unsupervised debiasing framework, consisting of three steps.
We employ state-of-the-art supervised debiasing techniques to obtain an unbiased model.
arXiv Detail & Related papers (2022-04-26T10:51:50Z)
- General Greedy De-bias Learning [163.65789778416172]
We propose a General Greedy De-bias learning framework (GGD), which greedily trains the biased models and the base model like gradient descent in functional space.
GGD can learn a more robust base model under the settings of both task-specific biased models with prior knowledge and self-ensemble biased model without prior knowledge.
arXiv Detail & Related papers (2021-12-20T14:47:32Z)
- Does Data Repair Lead to Fair Models? Curating Contextually Fair Data To Reduce Model Bias [10.639605996067534]
Contextual information is a valuable cue for Deep Neural Networks (DNNs) to learn better representations and improve accuracy.
In COCO, many object categories have a much higher co-occurrence with men compared to women, which can bias a DNN's prediction in favor of men.
We introduce a data repair algorithm using the coefficient of variation, which can curate fair and contextually balanced data for a protected class.
arXiv Detail & Related papers (2021-10-20T06:00:03Z)
- AutoDebias: Learning to Debias for Recommendation [43.84313723394282]
We propose AutoDebias, which leverages another (small) set of uniform data to optimize the debiasing parameters.
We derive the generalization bound for AutoDebias and prove its ability to acquire the appropriate debiasing strategy.
arXiv Detail & Related papers (2021-05-10T08:03:48Z)
- Towards Robustifying NLI Models Against Lexical Dataset Biases [94.79704960296108]
This paper explores both data-level and model-level debiasing methods to robustify models against lexical dataset biases.
First, we debias the dataset through data augmentation and enhancement, but show that the model bias cannot be fully removed via this method.
The second approach employs a bag-of-words sub-model to capture the features that are likely to exploit the bias and prevents the original model from learning these biased features.
arXiv Detail & Related papers (2020-05-10T17:56:10Z)
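The entry "Revisiting the Dataset Bias Problem from a Statistical Perspective" above describes weighting each sample n by the inverse of p(u_n|b_n), where u is the class attribute and b is the non-class (bias) attribute. The summary does not say how that probability is estimated, so the sketch below assumes the simplest option: empirical frequencies counted over discrete labels. The function name `inverse_conditional_weights` is hypothetical.

```python
from collections import Counter


def inverse_conditional_weights(class_labels, bias_labels):
    """Return one weight per sample, equal to 1 / p(u_n | b_n).

    p(u | b) is estimated from co-occurrence counts in the given data;
    this estimator is an assumption made for illustration.
    """
    pair_counts = Counter(zip(class_labels, bias_labels))
    bias_counts = Counter(bias_labels)
    weights = []
    for u, b in zip(class_labels, bias_labels):
        p_u_given_b = pair_counts[(u, b)] / bias_counts[b]
        weights.append(1.0 / p_u_given_b)
    return weights


# Toy data: class 0 co-occurs mostly with bias value 0, class 1 with bias value 1.
classes = [0, 0, 0, 1, 1, 1, 1, 0]
biases = [0, 0, 0, 0, 1, 1, 1, 1]
weights = inverse_conditional_weights(classes, biases)
```

Bias-conflicting samples (the rare class within a bias group) receive larger weights than bias-aligned ones, which is the effect the weighting is meant to have. The same weights could be used as sampling probabilities after normalization.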