Different Bias Under Different Criteria: Assessing Bias in LLMs with a Fact-Based Approach
- URL: http://arxiv.org/abs/2411.17338v1
- Date: Tue, 26 Nov 2024 11:32:43 GMT
- Title: Different Bias Under Different Criteria: Assessing Bias in LLMs with a Fact-Based Approach
- Authors: Changgeon Ko, Jisu Shin, Hoyun Song, Jeongyeon Seo, Jong C. Park
- Abstract summary: Large language models (LLMs) often reflect real-world biases, leading to efforts to mitigate these effects.
We introduce a novel metric to assess bias using fact-based criteria and real-world statistics.
- Score: 7.969162168078149
- Abstract: Large language models (LLMs) often reflect real-world biases, leading to efforts to mitigate these effects and make the models unbiased. Achieving this goal requires defining clear criteria for an unbiased state, with any deviation from these criteria considered biased. Some studies define an unbiased state as equal treatment across diverse demographic groups, aiming for balanced outputs from LLMs. However, differing perspectives on equality and the importance of pluralism make it challenging to establish a universal standard. Alternatively, other approaches propose using fact-based criteria for more consistent and objective evaluations, though these methods have not yet been fully applied to LLM bias assessments. Thus, there is a need for a metric with objective criteria that offers a distinct perspective from equality-based approaches. Motivated by this need, we introduce a novel metric to assess bias using fact-based criteria and real-world statistics. In this paper, we conducted a human survey demonstrating that humans tend to perceive LLM outputs more positively when they align closely with real-world demographic distributions. Evaluating various LLMs with our proposed metric reveals that model bias varies depending on the criteria used, highlighting the need for multi-perspective assessment.
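The abstract does not spell out the exact form of the proposed metric, but the core idea is comparing the demographic distribution of LLM outputs against real-world statistics. The sketch below illustrates one plausible instantiation of such a fact-based score using total variation distance; the function name, group labels, and numbers are illustrative assumptions, not the paper's actual implementation.

```python
# A minimal sketch of a fact-based bias score, assuming the metric compares the
# demographic distribution of LLM outputs with real-world statistics via
# total variation distance. All names and numbers here are illustrative.

from collections import Counter

def fact_based_bias(model_outputs: list[str], real_world_dist: dict[str, float]) -> float:
    """Return a score in [0, 1]: 0 means the model's demographic
    distribution matches the real-world statistics exactly."""
    counts = Counter(model_outputs)
    total = sum(counts.values())
    model_dist = {group: counts.get(group, 0) / total for group in real_world_dist}
    # Total variation distance between the two distributions.
    return 0.5 * sum(abs(model_dist[g] - real_world_dist[g]) for g in real_world_dist)

# Hypothetical example: real-world occupational gender statistics vs. model completions.
real_world = {"male": 0.47, "female": 0.53}
outputs = ["male"] * 80 + ["female"] * 20
print(fact_based_bias(outputs, real_world))  # ~0.33: outputs deviate from real-world statistics
```

Under this reading, a model whose output distribution tracks real-world statistics scores near zero even if the groups are not treated equally, which is exactly the distinct perspective from equality-based criteria that the paper argues for.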
Related papers
- Actions Speak Louder than Words: Agent Decisions Reveal Implicit Biases in Language Models [10.565316815513235]
Large language models (LLMs) may still exhibit implicit biases when simulating human behavior.
We show that state-of-the-art LLMs exhibit significant sociodemographic disparities in nearly all simulations.
When comparing our findings to real-world disparities reported in empirical studies, we find that the biases we uncovered are directionally aligned but markedly amplified.
arXiv Detail & Related papers (2025-01-29T05:21:31Z)
- Diverging Preferences: When do Annotators Disagree and do Models Know? [92.24651142187989]
We develop a taxonomy of disagreement sources spanning 10 categories across four high-level classes.
We find that the majority of disagreements are at odds with standard reward modeling approaches.
We develop methods for identifying diverging preferences to mitigate their influence on evaluation and training.
arXiv Detail & Related papers (2024-10-18T17:32:22Z)
- Investigating Implicit Bias in Large Language Models: A Large-Scale Study of Over 50 LLMs [0.0]
Large Language Models (LLMs) are being adopted across a wide range of tasks.
Recent research indicates that LLMs can harbor implicit biases even when they pass explicit bias evaluations.
This study highlights that newer or larger language models do not automatically exhibit reduced bias.
arXiv Detail & Related papers (2024-10-13T03:43:18Z)
- GenderCARE: A Comprehensive Framework for Assessing and Reducing Gender Bias in Large Language Models [73.23743278545321]
Large language models (LLMs) have exhibited remarkable capabilities in natural language generation, but have also been observed to magnify societal biases.
GenderCARE is a comprehensive framework that encompasses innovative Criteria, bias Assessment, Reduction techniques, and Evaluation metrics.
arXiv Detail & Related papers (2024-08-22T15:35:46Z)
- CEB: Compositional Evaluation Benchmark for Fairness in Large Language Models [58.57987316300529]
Large Language Models (LLMs) are increasingly deployed to handle various natural language processing (NLP) tasks.
To evaluate the biases exhibited by LLMs, researchers have recently proposed a variety of datasets.
We propose CEB, a Compositional Evaluation Benchmark that covers different types of bias across different social groups and tasks.
arXiv Detail & Related papers (2024-07-02T16:31:37Z)
- The African Woman is Rhythmic and Soulful: An Investigation of Implicit Biases in LLM Open-ended Text Generation [3.9945212716333063]
Implicit biases are significant because they influence the decisions made by Large Language Models (LLMs).
Traditionally, explicit bias tests or embedding-based methods are employed to detect bias, but these approaches can overlook more nuanced, implicit forms of bias.
We introduce two novel psychological-inspired methodologies to reveal and measure implicit biases through prompt-based and decision-making tasks.
arXiv Detail & Related papers (2024-07-01T13:21:33Z)
- Bias in Language Models: Beyond Trick Tests and Toward RUTEd Evaluation [49.3814117521631]
Standard benchmarks of bias and fairness in large language models (LLMs) measure the association between social attributes implied in user prompts and short responses.
We develop analogous RUTEd evaluations from three contexts of real-world use.
We find that standard bias metrics have no significant correlation with the more realistic bias metrics.
arXiv Detail & Related papers (2024-02-20T01:49:15Z)
- Exploring Value Biases: How LLMs Deviate Towards the Ideal [57.99044181599786]
Large Language Models (LLMs) are deployed in a wide range of applications, and their responses have an increasing social impact.
We show that value bias is strong in LLMs across different categories, similar to the results found in human studies.
arXiv Detail & Related papers (2024-02-16T18:28:43Z)
- Measuring Implicit Bias in Explicitly Unbiased Large Language Models [14.279977138893846]
Large language models (LLMs) can pass explicit social bias tests but still harbor implicit biases.
We introduce two new measures of bias: LLM Implicit Bias, a prompt-based method for revealing implicit bias; and LLM Decision Bias, a strategy to detect subtle discrimination in decision-making tasks.
Using these measures, we found pervasive stereotype biases mirroring those in society in 8 value-aligned models across 4 social categories.
arXiv Detail & Related papers (2024-02-06T15:59:23Z)
- GPTBIAS: A Comprehensive Framework for Evaluating Bias in Large Language Models [83.30078426829627]
Large language models (LLMs) have gained popularity and are being widely adopted by a large user community.
The existing evaluation methods have many constraints, and their results exhibit a limited degree of interpretability.
We propose a bias evaluation framework named GPTBIAS that leverages the high performance of LLMs to assess bias in models.
arXiv Detail & Related papers (2023-12-11T12:02:14Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences of its use.