A study on the distribution of social biases in self-supervised learning visual models
- URL: http://arxiv.org/abs/2203.01854v1
- Date: Thu, 3 Mar 2022 17:03:21 GMT
- Title: A study on the distribution of social biases in self-supervised learning visual models
- Authors: Kirill Sirotkin, Pablo Carballeira, Marcos Escudero-Viñolo
- Abstract summary: Self-Supervised Learning (SSL) wrongly appears as an efficient and bias-free solution, as it does not require labelled data.
We show that there is a correlation between the type of the SSL model and the number of biases that it incorporates.
We conclude that a careful SSL model selection process can reduce the number of social biases in the deployed model.
- Score: 1.8692254863855964
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deep neural networks are efficient at learning the data distribution if it is
sufficiently sampled. However, they can be strongly biased by non-relevant
factors implicitly incorporated in the training data. These include operational
biases, such as ineffective or uneven data sampling, but also ethical concerns,
as social biases may be implicitly present, even inadvertently, in the
training data, or explicitly defined in unfair training schedules. In tasks
that impact human processes, the learning of social biases may produce
discriminatory, unethical and untrustworthy consequences. It is often assumed
that social biases stem from supervised learning on labelled data, and thus,
Self-Supervised Learning (SSL) wrongly appears as an efficient and bias-free
solution, as it does not require labelled data. However, it was recently proven
that a popular SSL method also incorporates biases. In this paper, we study the
biases of a varied set of SSL visual models, trained on ImageNet data, using
a method and dataset designed by psychological experts to measure social
biases. We show that there is a correlation between the type of the SSL model
and the number of biases that it incorporates. Furthermore, the results also
suggest that this number does not strictly depend on the model's accuracy and
changes throughout the network. Finally, we conclude that a careful SSL model
selection process can reduce the number of social biases in the deployed model,
whilst keeping high performance.
Related papers
- Towards the Mitigation of Confirmation Bias in Semi-supervised Learning: a Debiased Training Perspective [6.164100243945264]
Semi-supervised learning (SSL) commonly exhibits confirmation bias, where models disproportionately favor certain classes.
We introduce TaMatch, a unified framework for debiased training in SSL.
We show that TaMatch significantly outperforms existing state-of-the-art methods across a range of challenging image classification tasks.
arXiv Detail & Related papers (2024-09-26T21:50:30Z)
- On the social bias of speech self-supervised models [45.787612513520386]
Social bias in SSL models can perpetuate injustice by automating discriminatory patterns and reinforcing inequitable systems.
We probe how various factors, such as model architecture, size, and training methodologies, influence the propagation of social bias within these models.
Our findings reveal that employing techniques such as row-pruning and training wider, shallower models can effectively mitigate social bias within SSL models (see the sketch below).
arXiv Detail & Related papers (2024-06-07T15:07:07Z)
- Evaluating Fairness in Self-supervised and Supervised Models for Sequential Data [10.626503137418636]
Self-supervised learning (SSL) has become the de facto training paradigm of large models.
This study explores the impact of pre-training and fine-tuning strategies on fairness.
arXiv Detail & Related papers (2024-01-03T09:31:43Z)
- Fast Model Debias with Machine Unlearning [54.32026474971696]
Deep neural networks might behave in a biased manner in many real-world scenarios.
Existing debiasing methods suffer from high costs in bias labeling or model re-training.
We propose a fast model debiasing framework (FMD) which offers an efficient approach to identify, evaluate and remove biases.
arXiv Detail & Related papers (2023-10-19T08:10:57Z)
- Progressive Feature Adjustment for Semi-supervised Learning from Pretrained Models [39.42802115580677]
Semi-supervised learning (SSL) can leverage both labeled and unlabeled data to build a predictive model.
Recent literature suggests that naively applying state-of-the-art SSL with a pretrained model fails to unleash the full potential of training data.
We propose to use pseudo-labels from the unlabelled data to update the feature extractor in a way that is less sensitive to incorrect labels (a sketch follows this entry).
arXiv Detail & Related papers (2023-09-09T01:57:14Z)
- D-BIAS: A Causality-Based Human-in-the-Loop System for Tackling Algorithmic Bias [57.87117733071416]
We propose D-BIAS, a visual interactive tool that embodies human-in-the-loop AI approach for auditing and mitigating social biases.
A user can detect the presence of bias against a group by identifying unfair causal relationships in the causal network.
For each interaction, such as weakening or deleting a biased causal edge, the system uses a novel method to simulate a new (debiased) dataset (see the sketch after this entry).
arXiv Detail & Related papers (2022-08-10T03:41:48Z)
- Statistical discrimination in learning agents [64.78141757063142]
Statistical discrimination emerges in agent policies as a function of both the bias in the training population and the agent architecture.
We show that less discrimination emerges with agents that use recurrent neural networks, and when their training environment has less bias.
arXiv Detail & Related papers (2021-10-21T18:28:57Z)
- Self-supervised Learning is More Robust to Dataset Imbalance [65.84339596595383]
We investigate self-supervised learning under dataset imbalance.
Off-the-shelf self-supervised representations are already more robust to class imbalance than supervised representations.
We devise a re-weighted regularization technique that consistently improves the SSL representation quality on imbalanced datasets.
arXiv Detail & Related papers (2021-10-11T06:29:56Z)
- Learning from Failure: Training Debiased Classifier from Biased Classifier [76.52804102765931]
We show that neural networks learn to rely on spurious correlation only when it is "easier" to learn than the desired knowledge.
We propose a failure-based debiasing scheme by training a pair of neural networks simultaneously (a sketch follows this list).
Our method significantly improves the training of the network against various types of biases in both synthetic and real-world datasets.
arXiv Detail & Related papers (2020-07-06T07:20:29Z)