A Group Fairness Lens for Large Language Models
- URL: http://arxiv.org/abs/2312.15478v1
- Date: Sun, 24 Dec 2023 13:25:15 GMT
- Title: A Group Fairness Lens for Large Language Models
- Authors: Guanqun Bi, Lei Shen, Yuqiang Xie, Yanan Cao, Tiangang Zhu, Xiaodong He
- Abstract summary: Large language models can perpetuate biases and unfairness when deployed in social media contexts.
We propose evaluating LLM biases from a group fairness lens using a novel hierarchical schema characterizing diverse social groups.
We pioneer a novel chain-of-thought method GF-Think to mitigate biases of LLMs from a group fairness perspective.
- Score: 34.0579082699443
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The rapid advancement of large language models has revolutionized various
applications but also raised crucial concerns about their potential to
perpetuate biases and unfairness when deployed in social media contexts.
Evaluating LLMs' potential biases and fairness has therefore become crucial; however, existing
methods rely on limited prompts focusing on just a few groups and lack a
comprehensive categorical perspective. In this paper, we propose evaluating LLM
biases from a group fairness lens using a novel hierarchical schema
characterizing diverse social groups. Specifically, we construct a dataset,
GFair, encapsulating target-attribute combinations across multiple dimensions.
In addition, we introduce statement organization, a new open-ended text
generation task, to uncover complex biases in LLMs. Extensive evaluations of
popular LLMs reveal inherent safety concerns. To mitigate these biases from a
group fairness perspective, we pioneer a novel chain-of-thought method, GF-Think.
Experimental results demonstrate its efficacy in mitigating bias in LLMs to
achieve fairness.
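As a rough illustration of how a chain-of-thought debiasing pass of this kind could be implemented, the sketch below wraps a candidate statement in an explicit step-by-step fairness check before asking the model to revise it. The prompt wording and the `build_gf_think_prompt` / `query_llm` helpers are illustrative assumptions; the abstract does not reproduce the actual GF-Think prompt or the GFair schema.

```python
# Minimal sketch (hypothetical) of a chain-of-thought debiasing pass in the
# spirit of GF-Think. The prompt wording and the `query_llm` callable are
# illustrative assumptions, not the paper's actual method.

def build_gf_think_prompt(statement: str, target_group: str, attribute: str) -> str:
    """Wrap a candidate statement in a step-by-step group-fairness check."""
    return (
        "Reason step by step before answering.\n"
        f"Statement: {statement}\n"
        f"1. Identify the social group being described: {target_group}.\n"
        f"2. Identify the attribute attached to that group: {attribute}.\n"
        "3. Decide whether attaching this attribute to this group relies on a "
        "stereotype rather than evidence about individuals.\n"
        "4. If it does, rewrite the statement so it treats all groups fairly; "
        "otherwise keep it unchanged.\n"
        "Answer with the revised statement only."
    )

def debias_with_gf_think(statement: str, target_group: str, attribute: str, query_llm) -> str:
    """Run one debiasing pass; `query_llm` is any text-in/text-out LLM call."""
    return query_llm(build_gf_think_prompt(statement, target_group, attribute))
```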
Related papers
- Justice or Prejudice? Quantifying Biases in LLM-as-a-Judge [84.34545223897578]
Despite the excellence of LLM-as-a-Judge methods in many domains, their potential issues remain under-explored, undermining their reliability and the scope of their utility.
We identify 12 key potential biases and propose a new automated bias quantification framework, CALM, which quantifies and analyzes each type of bias in LLM-as-a-Judge.
Our work highlights the need for stakeholders to address these issues and reminds users to exercise caution in LLM-as-a-Judge applications.
arXiv Detail & Related papers (2024-10-03T17:53:30Z) - A Multi-LLM Debiasing Framework [85.17156744155915]
Large Language Models (LLMs) are powerful tools with the potential to benefit society immensely, yet they have demonstrated biases that perpetuate societal inequalities.
Recent research has shown a growing interest in multi-LLM approaches, which have been demonstrated to be effective in improving the quality of reasoning.
We propose a novel multi-LLM debiasing framework aimed at reducing bias in LLMs.
arXiv Detail & Related papers (2024-09-20T20:24:50Z) - Fairness in Large Language Models in Three Hours [2.443957114877221]
This tutorial provides a systematic overview of recent advances in the literature concerning fairness in large language models.
The concept of fairness in LLMs is then explored, summarizing the strategies for evaluating bias and the algorithms designed to promote fairness.
arXiv Detail & Related papers (2024-08-02T03:44:14Z) - CEB: Compositional Evaluation Benchmark for Fairness in Large Language Models [58.57987316300529]
Large Language Models (LLMs) are increasingly deployed to handle various natural language processing (NLP) tasks.
To evaluate the biases exhibited by LLMs, researchers have recently proposed a variety of datasets.
We propose CEB, a Compositional Evaluation Benchmark that covers different types of bias across different social groups and tasks.
arXiv Detail & Related papers (2024-07-02T16:31:37Z) - Inducing Group Fairness in LLM-Based Decisions [12.368678951470162]
Group fairness in Prompting Large Language Models (LLMs) is a well-studied problem.
We show that prompt-based classifiers may lead to unfair decisions.
We introduce several remediation techniques and benchmark their fairness and performance trade-offs (a minimal group-fairness check of this kind is sketched after this list).
arXiv Detail & Related papers (2024-06-24T15:45:20Z) - Fairness in Large Language Models: A Taxonomic Survey [2.669847575321326]
Large Language Models (LLMs) have demonstrated remarkable success across various domains.
Despite their promising performance in numerous real-world applications, most of these algorithms lack fairness considerations.
arXiv Detail & Related papers (2024-03-31T22:22:53Z) - Exploring Value Biases: How LLMs Deviate Towards the Ideal [57.99044181599786]
Large Language Models (LLMs) are deployed in a wide range of applications, and their responses have an increasing social impact.
We show that value bias is strong in LLMs across different categories, similar to the results found in human studies.
arXiv Detail & Related papers (2024-02-16T18:28:43Z) - Rethinking Interpretability in the Era of Large Language Models [76.1947554386879]
Large language models (LLMs) have demonstrated remarkable capabilities across a wide array of tasks.
The capability to explain in natural language allows LLMs to expand the scale and complexity of patterns that can be given to a human.
These new capabilities raise new challenges, such as hallucinated explanations and immense computational costs.
arXiv Detail & Related papers (2024-01-30T17:38:54Z) - Are Large Language Models Really Robust to Word-Level Perturbations? [68.60618778027694]
We propose a novel rational evaluation approach that leverages pre-trained reward models as diagnostic tools.
Longer conversations reflect a more comprehensive grasp by language models of the questions they are asked.
Our results demonstrate that LLMs frequently exhibit vulnerability to word-level perturbations that are commonplace in daily language usage.
arXiv Detail & Related papers (2023-09-20T09:23:46Z) - A Survey on Fairness in Large Language Models [28.05516809190299]
Large Language Models (LLMs) have shown powerful performance and development prospects.
LLMs can capture social biases from unprocessed training data and propagate the biases to downstream tasks.
Unfair LLM systems have undesirable social impacts and potential harms.
arXiv Detail & Related papers (2023-08-20T03:30:22Z)
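As referenced in the entry on inducing group fairness in LLM-based decisions above, the sketch below shows one common group-fairness check for classifier decisions. Demographic parity (comparing positive-decision rates across protected groups) is a standard metric; the helper name and the toy data are illustrative assumptions, not taken from that paper.

```python
# Minimal sketch (illustrative): demographic parity gap, i.e. the largest
# difference in positive-decision rate between protected groups. The toy
# decisions below stand in for outputs of a prompt-based LLM classifier.
from collections import defaultdict

def demographic_parity_gap(decisions: list[int], groups: list[str]) -> float:
    """Largest difference in positive-decision rate between any two groups."""
    totals: dict[str, int] = defaultdict(int)
    positives: dict[str, int] = defaultdict(int)
    for decision, group in zip(decisions, groups):
        totals[group] += 1
        positives[group] += decision  # each decision is 0 or 1
    rates = [positives[g] / totals[g] for g in totals]
    return max(rates) - min(rates)

# Toy example: two groups with clearly different positive-decision rates.
decisions = [1, 1, 0, 1, 0, 0, 1, 0]
groups    = ["A", "A", "A", "A", "B", "B", "B", "B"]
print(demographic_parity_gap(decisions, groups))  # 0.5 -> large disparity
```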