A Trip Towards Fairness: Bias and De-Biasing in Large Language Models
- URL: http://arxiv.org/abs/2305.13862v2
- Date: Tue, 29 Aug 2023 13:55:13 GMT
- Title: A Trip Towards Fairness: Bias and De-Biasing in Large Language Models
- Authors: Leonardo Ranaldi, Elena Sofia Ruzzetti, Davide Venditti, Dario
Onorati, Fabio Massimo Zanzotto
- Abstract summary: Cheap-to-Build Very Large-Language Models (CtB-LLMs) with affordable training are emerging as the next big revolution in natural language processing and understanding.
In this paper, we performed an extensive investigation of bias in three families of CtB-LLMs.
We show that debiasing techniques are effective and usable.
- Score: 1.987426401990999
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Cheap-to-Build Very Large-Language Models (CtB-LLMs) with affordable training
are emerging as the next big revolution in natural language processing and
understanding. These CtB-LLMs are democratizing access to trainable Very
Large-Language Models (VLLMs) and, thus, may represent the building blocks of
many NLP systems solving downstream tasks. Hence, any bias in CtB-LLMs, small or
large, may cause widespread harm. In this paper, we performed an extensive
investigation of bias in three families of CtB-LLMs and showed that debiasing
techniques are effective and usable. Indeed, according to current tests, the
LLaMA and OPT families exhibit significant bias in gender, race, religion,
and profession. In contrast to analyses of other LLMs, we found that
bias depends not on the number of parameters but on perplexity. Finally,
debiasing OPT with LoRA reduces bias by up to 4.12 points on the
normalized stereotype score.
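
The abstract links bias to perplexity and reports a normalized stereotype score. A common way such scores are computed is to probe the model with stereotypical/anti-stereotypical minimal pairs and check which variant receives the higher likelihood (lower perplexity). The sketch below illustrates only that style of measurement; the paper's actual benchmark, sentence pairs, and normalization are not reproduced here, and the checkpoint name and example pairs are assumptions.

```python
# Minimal sketch of a perplexity-based stereotype score (CrowS-Pairs style).
# The checkpoint and the minimal pairs below are illustrative assumptions,
# not the benchmark used in the paper.
import math
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "facebook/opt-1.3b"  # assumption: any OPT/LLaMA-family checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
model.eval()

def perplexity(sentence: str) -> float:
    """Token-level perplexity of a sentence under the causal LM."""
    enc = tokenizer(sentence, return_tensors="pt")
    with torch.no_grad():
        out = model(**enc, labels=enc["input_ids"])
    return math.exp(out.loss.item())

# Hypothetical minimal pairs: (stereotypical variant, anti-stereotypical variant).
pairs = [
    ("The nurse said she would be late.", "The nurse said he would be late."),
    ("The engineer said he would be late.", "The engineer said she would be late."),
]

# Fraction of pairs where the model prefers (assigns lower perplexity to)
# the stereotypical variant; 50 means no measurable preference.
stereo_preferred = sum(perplexity(s) < perplexity(a) for s, a in pairs)
stereotype_score = 100.0 * stereo_preferred / len(pairs)
print(f"stereotype score: {stereotype_score:.2f} (50 = unbiased)")
```

The abstract also reports debiasing OPT with LoRA. Below is a minimal sketch of attaching LoRA adapters with the PEFT library; the rank, target modules, and other hyperparameters are assumptions rather than the paper's actual configuration.

```python
# Minimal sketch of a LoRA setup for a debiasing fine-tuning pass on OPT.
# Training data, objective, and hyperparameters here are assumptions.
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

base = AutoModelForCausalLM.from_pretrained("facebook/opt-1.3b")
lora_cfg = LoraConfig(
    r=8,                                   # low-rank dimension (hypothetical value)
    lora_alpha=16,
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],   # OPT self-attention projections
    task_type="CAUSAL_LM",
)
model = get_peft_model(base, lora_cfg)
model.print_trainable_parameters()
# The adapter-augmented model would then be fine-tuned (e.g., on
# counter-stereotypical text) and re-scored with the metric above to
# measure the reduction in stereotype score.
```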
Related papers
- Distance between Relevant Information Pieces Causes Bias in Long-Context LLMs [50.40165119718928]
LongPiBench is a benchmark designed to assess positional bias involving multiple pieces of relevant information.
These experiments reveal that while most current models are robust against the "lost in the middle" issue, there exist significant biases related to the spacing of relevant information pieces.
arXiv Detail & Related papers (2024-10-18T17:41:19Z)
- A Multi-LLM Debiasing Framework [85.17156744155915]
Large Language Models (LLMs) are powerful tools with the potential to benefit society immensely, yet they have demonstrated biases that perpetuate societal inequalities.
Recent research has shown a growing interest in multi-LLM approaches, which have been demonstrated to be effective in improving the quality of reasoning.
We propose a novel multi-LLM debiasing framework aimed at reducing bias in LLMs.
arXiv Detail & Related papers (2024-09-20T20:24:50Z)
- Unboxing Occupational Bias: Grounded Debiasing of LLMs with U.S. Labor Data [9.90951705988724]
Large Language Models (LLMs) are prone to inheriting and amplifying societal biases.
LLM bias can have far-reaching consequences, leading to unfair practices and exacerbating social inequalities.
arXiv Detail & Related papers (2024-08-20T23:54:26Z)
- Promoting Equality in Large Language Models: Identifying and Mitigating the Implicit Bias based on Bayesian Theory [29.201402717025335]
Large language models (LLMs) are trained on extensive text corpora, which inevitably include biased information.
We have formally defined the implicit bias problem and developed an innovative framework for bias removal based on Bayesian theory.
arXiv Detail & Related papers (2024-08-20T07:40:12Z)
- REFINE-LM: Mitigating Language Model Stereotypes via Reinforcement Learning [18.064064773660174]
We introduce REFINE-LM, a debiasing method that uses reinforcement learning to handle different types of biases without any fine-tuning.
By training a simple model on top of the word probability distribution of a LM, our bias reinforcement learning method enables model debiasing without human annotations.
Experiments conducted on a wide range of models, including several LMs, show that our method significantly reduces stereotypical biases while preserving LM performance.
arXiv Detail & Related papers (2024-08-18T14:08:31Z)
- BiasDPO: Mitigating Bias in Language Models through Direct Preference Optimization [0.0]
Large Language Models (LLMs) have become pivotal in advancing natural language processing, yet their potential to perpetuate biases poses significant concerns.
This paper introduces a new framework employing Direct Preference Optimization (DPO) to mitigate gender, racial, and religious biases in English text.
By developing a loss function that favors less biased over biased completions, our approach cultivates a preference for respectful and non-discriminatory language (see the DPO sketch after this list).
arXiv Detail & Related papers (2024-07-18T22:32:20Z)
- Pride and Prejudice: LLM Amplifies Self-Bias in Self-Refinement [75.7148545929689]
Large language models (LLMs) improve their performance through self-feedback on certain tasks while degrading on others.
We formally define an LLM's self-bias: the tendency to favor its own generation.
We analyze six LLMs on translation, constrained text generation, and mathematical reasoning tasks.
arXiv Detail & Related papers (2024-02-18T03:10:39Z)
- Self-Debiasing Large Language Models: Zero-Shot Recognition and Reduction of Stereotypes [73.12947922129261]
We leverage the zero-shot capabilities of large language models to reduce stereotyping.
We show that self-debiasing can significantly reduce the degree of stereotyping across nine different social groups.
We hope this work opens inquiry into other zero-shot techniques for bias mitigation.
arXiv Detail & Related papers (2024-02-03T01:40:11Z)
- GPTBIAS: A Comprehensive Framework for Evaluating Bias in Large Language Models [83.30078426829627]
Large language models (LLMs) have gained popularity and are being widely adopted by a large user community.
The existing evaluation methods have many constraints, and their results exhibit a limited degree of interpretability.
We propose a bias evaluation framework named GPTBIAS that leverages the high performance of LLMs to assess bias in models.
arXiv Detail & Related papers (2023-12-11T12:02:14Z)
- Exploring the Jungle of Bias: Political Bias Attribution in Language Models via Dependency Analysis [86.49858739347412]
Large Language Models (LLMs) have sparked intense debate regarding the prevalence of bias in these models and its mitigation.
We propose a prompt-based method for the extraction of confounding and mediating attributes which contribute to the decision process.
We find that the observed disparate treatment can at least in part be attributed to confounding and mediating attributes and model misalignment.
arXiv Detail & Related papers (2023-11-15T00:02:25Z)
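
For the BiasDPO entry above, the described mechanism matches the standard Direct Preference Optimization objective, with the less biased completion treated as the preferred one. The sketch below shows only that generic DPO loss; BiasDPO's actual training data, prompts, and hyperparameters are not given here, and the beta value and toy log-probabilities are assumptions.

```python
# Minimal sketch of the generic DPO loss applied to bias mitigation:
# "chosen" = less biased completion, "rejected" = more biased completion.
import torch
import torch.nn.functional as F

def dpo_loss(policy_chosen_logps: torch.Tensor,
             policy_rejected_logps: torch.Tensor,
             ref_chosen_logps: torch.Tensor,
             ref_rejected_logps: torch.Tensor,
             beta: float = 0.1) -> torch.Tensor:
    """Per-example DPO loss; each argument is the summed log-probability of a
    completion under the trainable policy or the frozen reference model."""
    chosen_rewards = beta * (policy_chosen_logps - ref_chosen_logps)
    rejected_rewards = beta * (policy_rejected_logps - ref_rejected_logps)
    # Push the policy to rank the less biased completion above the biased one.
    return -F.logsigmoid(chosen_rewards - rejected_rewards).mean()

# Toy usage with made-up log-probabilities for one (less biased, more biased) pair.
loss = dpo_loss(torch.tensor([-12.3]), torch.tensor([-10.1]),
                torch.tensor([-12.0]), torch.tensor([-10.0]))
print(loss.item())
```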
This list is automatically generated from the titles and abstracts of the papers on this site.