Towards detecting unanticipated bias in Large Language Models
- URL: http://arxiv.org/abs/2404.02650v1
- Date: Wed, 3 Apr 2024 11:25:20 GMT
- Title: Towards detecting unanticipated bias in Large Language Models
- Authors: Anna Kruspe,
- Abstract summary: Large Language Models (LLMs) have exhibited fairness issues similar to those in previous machine learning systems.
This research focuses on analyzing and quantifying these biases in training data and their impact on the decisions of these models.
- Score: 1.4589372436314496
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Over the last year, Large Language Models (LLMs) like ChatGPT have become widely available and have exhibited fairness issues similar to those in previous machine learning systems. Current research is primarily focused on analyzing and quantifying these biases in training data and their impact on the decisions of these models, alongside developing mitigation strategies. This research largely targets well-known biases related to gender, race, ethnicity, and language. However, it is clear that LLMs are also affected by other, less obvious implicit biases. The complex and often opaque nature of these models makes detecting such biases challenging, yet this is crucial due to their potential negative impact in various applications. In this paper, we explore new avenues for detecting these unanticipated biases in LLMs, focusing specifically on Uncertainty Quantification and Explainable AI methods. These approaches aim to assess the certainty of model decisions and to make the internal decision-making processes of LLMs more transparent, thereby identifying and understanding biases that are not immediately apparent. Through this research, we aim to contribute to the development of fairer and more transparent AI systems.
Related papers
- Investigating Implicit Bias in Large Language Models: A Large-Scale Study of Over 50 LLMs [0.0]
Large Language Models (LLMs) are being adopted across a wide range of tasks.
Recent research indicates that LLMs can harbor implicit biases even when they pass explicit bias evaluations.
This study highlights that newer or larger language models do not automatically exhibit reduced bias.
arXiv Detail & Related papers (2024-10-13T03:43:18Z) - Cognitive Biases in Large Language Models for News Recommendation [68.90354828533535]
This paper explores the potential impact of cognitive biases on large language models (LLMs) based news recommender systems.
We discuss strategies to mitigate these biases through data augmentation, prompt engineering and learning algorithms aspects.
arXiv Detail & Related papers (2024-10-03T18:42:07Z) - AI Can Be Cognitively Biased: An Exploratory Study on Threshold Priming in LLM-Based Batch Relevance Assessment [37.985947029716016]
Large language models (LLMs) have shown advanced understanding capabilities but may inherit human biases from their training data.
We investigated whether LLMs are influenced by the threshold priming effect in relevance judgments.
arXiv Detail & Related papers (2024-09-24T12:23:15Z) - A Multi-LLM Debiasing Framework [85.17156744155915]
Large Language Models (LLMs) are powerful tools with the potential to benefit society immensely, yet, they have demonstrated biases that perpetuate societal inequalities.
Recent research has shown a growing interest in multi-LLM approaches, which have been demonstrated to be effective in improving the quality of reasoning.
We propose a novel multi-LLM debiasing framework aimed at reducing bias in LLMs.
arXiv Detail & Related papers (2024-09-20T20:24:50Z) - Unboxing Occupational Bias: Grounded Debiasing of LLMs with U.S. Labor Data [9.90951705988724]
Large Language Models (LLM) are prone to inheriting and amplifying societal biases.
LLM bias can have far-reaching consequences, leading to unfair practices and exacerbating social inequalities.
arXiv Detail & Related papers (2024-08-20T23:54:26Z) - The African Woman is Rhythmic and Soulful: An Investigation of Implicit Biases in LLM Open-ended Text Generation [3.9945212716333063]
Implicit biases are significant because they influence the decisions made by Large Language Models (LLMs)
Traditionally, explicit bias tests or embedding-based methods are employed to detect bias, but these approaches can overlook more nuanced, implicit forms of bias.
We introduce two novel psychological-inspired methodologies to reveal and measure implicit biases through prompt-based and decision-making tasks.
arXiv Detail & Related papers (2024-07-01T13:21:33Z) - Investigating Bias in LLM-Based Bias Detection: Disparities between LLMs and Human Perception [13.592532358127293]
We investigate the presence and nature of bias within Large Language Models (LLMs)
We probe whether LLMs exhibit biases, particularly in political bias prediction and text continuation tasks.
We propose debiasing strategies, including prompt engineering and model fine-tuning.
arXiv Detail & Related papers (2024-03-22T00:59:48Z) - Debiasing Multimodal Large Language Models [61.6896704217147]
Large Vision-Language Models (LVLMs) have become indispensable tools in computer vision and natural language processing.
Our investigation reveals a noteworthy bias in the generated content, where the output is primarily influenced by the underlying Large Language Models (LLMs) prior to the input image.
To rectify these biases and redirect the model's focus toward vision information, we introduce two simple, training-free strategies.
arXiv Detail & Related papers (2024-03-08T12:35:07Z) - Exploring the Jungle of Bias: Political Bias Attribution in Language Models via Dependency Analysis [86.49858739347412]
Large Language Models (LLMs) have sparked intense debate regarding the prevalence of bias in these models and its mitigation.
We propose a prompt-based method for the extraction of confounding and mediating attributes which contribute to the decision process.
We find that the observed disparate treatment can at least in part be attributed to confounding and mitigating attributes and model misalignment.
arXiv Detail & Related papers (2023-11-15T00:02:25Z) - Fast Model Debias with Machine Unlearning [54.32026474971696]
Deep neural networks might behave in a biased manner in many real-world scenarios.
Existing debiasing methods suffer from high costs in bias labeling or model re-training.
We propose a fast model debiasing framework (FMD) which offers an efficient approach to identify, evaluate and remove biases.
arXiv Detail & Related papers (2023-10-19T08:10:57Z) - Individual Explanations in Machine Learning Models: A Survey for
Practitioners [69.02688684221265]
The use of sophisticated statistical models that influence decisions in domains of high societal relevance is on the rise.
Many governments, institutions, and companies are reluctant to their adoption as their output is often difficult to explain in human-interpretable ways.
Recently, the academic literature has proposed a substantial amount of methods for providing interpretable explanations to machine learning models.
arXiv Detail & Related papers (2021-04-09T01:46:34Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.