On the social bias of speech self-supervised models
- URL: http://arxiv.org/abs/2406.04997v1
- Date: Fri, 7 Jun 2024 15:07:07 GMT
- Title: On the social bias of speech self-supervised models
- Authors: Yi-Cheng Lin, Tzu-Quan Lin, Hsi-Che Lin, Andy T. Liu, Hung-yi Lee
- Abstract summary: Social bias in SSL models can perpetuate injustice by automating discriminatory patterns and reinforcing inequitable systems.
We probe how various factors, such as model architecture, size, and training methodologies, influence the propagation of social bias within these models.
Our findings reveal that employing techniques such as row-pruning and training wider, shallower models can effectively mitigate social bias within SSL models.
- Score: 45.787612513520386
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Self-supervised learning (SSL) speech models have achieved remarkable performance in various tasks, yet their biased outcomes, especially those affecting marginalized groups, raise significant concerns. Social bias refers to the phenomenon where algorithms potentially amplify disparate properties between social groups present in the data used for training. Bias in SSL models can perpetuate injustice by automating discriminatory patterns and reinforcing inequitable systems. This work reveals that prevalent SSL models inadvertently acquire biased associations. We probe how various factors, such as model architecture, size, and training methodologies, influence the propagation of social bias within these models. Finally, we explore the efficacy of debiasing SSL models through regularization techniques, specifically via model compression. Our findings reveal that employing techniques such as row-pruning and training wider, shallower models can effectively mitigate social bias within SSL models.
Related papers
- Social Debiasing for Fair Multi-modal LLMs [55.8071045346024]
Multi-modal Large Language Models (MLLMs) have advanced significantly, offering powerful vision-language understanding capabilities.
However, these models often inherit severe social biases from their training datasets, leading to unfair predictions based on attributes like race and gender.
This paper addresses the issue of social biases in MLLMs by i) introducing a comprehensive Counterfactual dataset with Multiple Social Concepts (CMSC) and ii) proposing an Anti-Stereotype Debiasing strategy (ASD).
arXiv Detail & Related papers (2024-08-13T02:08:32Z)
- The Devil is in the Neurons: Interpreting and Mitigating Social Biases in Pre-trained Language Models [78.69526166193236]
Pre-trained Language models (PLMs) have been acknowledged to contain harmful information, such as social biases.
We propose Social Bias Neurons to accurately pinpoint units (i.e., neurons) in a language model that can be attributed to undesirable behavior, such as social bias.
As measured by prior metrics from StereoSet, our model achieves a higher degree of fairness while maintaining language modeling ability with low cost.
arXiv Detail & Related papers (2024-06-14T15:41:06Z)
- Using Self-supervised Learning Can Improve Model Fairness [10.028637666224093]
Self-supervised learning (SSL) has become the de facto training paradigm of large models.
This study explores the impact of pre-training and fine-tuning strategies on fairness.
We introduce a fairness assessment framework for SSL, comprising five stages: defining dataset requirements, pre-training, fine-tuning with gradual unfreezing, assessing representation similarity conditioned on demographics, and establishing domain-specific evaluation processes.
arXiv Detail & Related papers (2024-06-04T14:38:30Z)
- Evaluating Fairness in Self-supervised and Supervised Models for Sequential Data [10.626503137418636]
Self-supervised learning (SSL) has become the de facto training paradigm of large models.
This study explores the impact of pre-training and fine-tuning strategies on fairness.
arXiv Detail & Related papers (2024-01-03T09:31:43Z)
- Understanding the Effect of Model Compression on Social Bias in Large Language Models [12.289003145872481]
Large Language Models (LLMs) trained with self-supervision on vast corpora of web text fit to the social biases of that text.
We study the impact of model compression via quantization and knowledge distillation on measures of social bias in LLMs.
arXiv Detail & Related papers (2023-12-09T20:04:20Z)
- Fast Model Debias with Machine Unlearning [54.32026474971696]
Deep neural networks might behave in a biased manner in many real-world scenarios.
Existing debiasing methods suffer from high costs in bias labeling or model re-training.
We propose a fast model debiasing framework (FMD) which offers an efficient approach to identify, evaluate and remove biases.
arXiv Detail & Related papers (2023-10-19T08:10:57Z)
- Non-Invasive Fairness in Learning through the Lens of Data Drift [88.37640805363317]
We show how to improve the fairness of Machine Learning models without altering the data or the learning algorithm.
We use a simple but key insight: the divergence of trends between different populations, and, consequently, between a learned model and minority populations, is analogous to data drift.
We explore two strategies (model-splitting and reweighing) to resolve this drift, aiming to improve the overall conformance of models to the underlying data.
arXiv Detail & Related papers (2023-03-30T17:30:42Z)
- How robust are pre-trained models to distribution shift? [82.08946007821184]
We show how spurious correlations affect the performance of popular self-supervised learning (SSL) and auto-encoder (AE) based models.
We develop a novel evaluation scheme with the linear head trained on out-of-distribution (OOD) data, to isolate the performance of the pre-trained models from a potential bias of the linear head used for evaluation.
arXiv Detail & Related papers (2022-06-17T16:18:28Z)
- A study on the distribution of social biases in self-supervised learning visual models [1.8692254863855964]
Self-Supervised Learning (SSL) wrongly appears as an efficient and bias-free solution, as it does not require labelled data.
We show that there is a correlation between the type of the SSL model and the number of biases that it incorporates.
We conclude that a careful SSL model selection process can reduce the number of social biases in the deployed model.
arXiv Detail & Related papers (2022-03-03T17:03:21Z)