Bias Amplification: Large Language Models as Increasingly Biased Media
- URL: http://arxiv.org/abs/2410.15234v2
- Date: Mon, 17 Feb 2025 07:49:14 GMT
- Title: Bias Amplification: Large Language Models as Increasingly Biased Media
- Authors: Ze Wang, Zekun Wu, Jeremy Zhang, Xin Guan, Navya Jain, Skylar Lu, Saloni Gupta, Adriano Koshiyama
- Abstract summary: We study the progressive reinforcement of preexisting social biases in Large Language Models (LLMs). Our findings reveal a progressively increasing right-leaning bias. A mechanistic interpretation identifies distinct sets of neurons responsible for model collapse and bias amplification, suggesting they arise from different underlying mechanisms.
- Score: 12.376194654498383
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Model collapse, a phenomenon where models degrade in performance due to indiscriminate use of synthetic data, is well studied. However, its role in bias amplification, the progressive reinforcement of preexisting social biases in Large Language Models (LLMs), remains underexplored. In this paper, we formally define the conditions for bias amplification and demonstrate through statistical simulations that bias can intensify even in the absence of sampling errors, the primary driver of model collapse. Empirically, we investigate political bias amplification in GPT2 using a custom-built benchmark for sentence continuation tasks. Our findings reveal a progressively increasing right-leaning bias. Furthermore, we evaluate three mitigation strategies, Overfitting, Preservation, and Accumulation, and show that bias amplification persists even when model collapse is mitigated. Finally, a mechanistic interpretation identifies distinct sets of neurons responsible for model collapse and bias amplification, suggesting they arise from different underlying mechanisms.
Related papers
- Exploring Bias in over 100 Text-to-Image Generative Models [49.60774626839712]
We investigate bias trends in text-to-image generative models over time, focusing on the increasing availability of models through open platforms like Hugging Face.
We assess bias across three key dimensions: (i) distribution bias, (ii) generative hallucination, and (iii) generative miss-rate.
Our findings indicate that artistic and style-transferred models exhibit significant bias, whereas foundation models, benefiting from broader training distributions, are becoming progressively less biased.
arXiv Detail & Related papers (2025-03-11T03:40:44Z)
- An Effective Theory of Bias Amplification [18.648588509429167]
Machine learning models may capture and amplify biases present in data, leading to disparate test performance across social groups.
We propose a precise analytical theory in the context of ridge regression with and without random projections, where the former models neural networks in a simplified regime.
Our theory offers a unified and rigorous explanation of machine learning bias, providing insights into phenomena such as bias amplification and minority-group bias.
arXiv Detail & Related papers (2024-10-07T08:43:22Z)
- Bias Begets Bias: The Impact of Biased Embeddings on Diffusion Models [0.0]
Text-to-Image (TTI) systems have come under increased scrutiny for social biases.
We investigate embedding spaces as a source of bias for TTI models.
We find that biased multimodal embeddings like CLIP can result in lower alignment scores for representationally balanced TTI models.
arXiv Detail & Related papers (2024-09-15T01:09:55Z)
- Editable Fairness: Fine-Grained Bias Mitigation in Language Models [52.66450426729818]
We propose a novel debiasing approach, Fairness Stamp (FAST), which enables fine-grained calibration of individual social biases.
FAST surpasses state-of-the-art baselines with superior debiasing performance.
This highlights the potential of fine-grained debiasing strategies to achieve fairness in large language models.
arXiv Detail & Related papers (2024-08-07T17:14:58Z)
- A U-turn on Double Descent: Rethinking Parameter Counting in Statistical Learning [68.76846801719095]
We explain why double descent appears exactly when and where it does, and show that its location is not inherently tied to the threshold p=n.
This provides a resolution to tensions between double descent and statistical intuition.
arXiv Detail & Related papers (2023-10-29T12:05:39Z)
- Bias in Pruned Vision Models: In-Depth Analysis and Countermeasures [93.17009514112702]
Pruning, setting a significant subset of the parameters of a neural network to zero, is one of the most popular methods of model compression.
Despite existing evidence that pruned models can exhibit bias, the relationship between neural network pruning and induced bias is not well understood.
arXiv Detail & Related papers (2023-04-25T07:42:06Z)
- Extracting or Guessing? Improving Faithfulness of Event Temporal Relation Extraction [87.04153383938969]
We improve the faithfulness of TempRel extraction models from two perspectives.
The first perspective is to extract relations based genuinely on the contextual description.
The second perspective is to provide proper uncertainty estimation.
arXiv Detail & Related papers (2022-10-10T19:53:13Z)
- A Systematic Study of Bias Amplification [16.245943270343343]
Recent research suggests that predictions made by machine-learning models can amplify biases present in the training data.
We perform the first systematic, controlled study into when and how bias amplification occurs.
arXiv Detail & Related papers (2022-01-27T18:04:24Z)
- General Greedy De-bias Learning [163.65789778416172]
We propose a General Greedy De-bias learning framework (GGD), which greedily trains the biased models and the base model like gradient descent in functional space.
GGD can learn a more robust base model under the settings of both task-specific biased models with prior knowledge and self-ensemble biased model without prior knowledge.
arXiv Detail & Related papers (2021-12-20T14:47:32Z)
- Learning Debiased Models with Dynamic Gradient Alignment and Bias-conflicting Sample Mining [39.00256193731365]
Deep neural networks notoriously suffer from dataset biases which are detrimental to model robustness, generalization and fairness.
We propose a two-stage debiasing scheme to combat intractable unknown biases.
arXiv Detail & Related papers (2021-11-25T14:50:10Z)
- A Generative Approach for Mitigating Structural Biases in Natural Language Inference [24.44419010439227]
In this work, we reformulate the NLI task as a generative task, where a model is conditioned on the biased subset of the input and the label.
We show that this approach is highly robust to large amounts of bias.
We find that generative models are difficult to train and they generally perform worse than discriminative baselines.
arXiv Detail & Related papers (2021-08-31T17:59:45Z)
- Improving Robustness by Augmenting Training Sentences with Predicate-Argument Structures [62.562760228942054]
Existing approaches to improve robustness against dataset biases mostly focus on changing the training objective.
We propose to augment the input sentences in the training data with their corresponding predicate-argument structures.
We show that without targeting a specific bias, our sentence augmentation improves the robustness of transformer models against multiple biases.
arXiv Detail & Related papers (2020-10-23T16:22:05Z)
- Mitigating Gender Bias Amplification in Distribution by Posterior Regularization [75.3529537096899]
We investigate the gender bias amplification issue from the distribution perspective.
We propose a bias mitigation approach based on posterior regularization.
Our study sheds light on understanding bias amplification.
arXiv Detail & Related papers (2020-05-13T11:07:10Z)