A Survey on Fairness in Large Language Models
- URL: http://arxiv.org/abs/2308.10149v2
- Date: Wed, 21 Feb 2024 13:52:11 GMT
- Title: A Survey on Fairness in Large Language Models
- Authors: Yingji Li, Mengnan Du, Rui Song, Xin Wang, Ying Wang
- Abstract summary: Large Language Models (LLMs) have shown powerful performance and development prospects.
LLMs can capture social biases from unprocessed training data and propagate the biases to downstream tasks.
Unfair LLM systems have undesirable social impacts and potential harms.
- Score: 28.05516809190299
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Large Language Models (LLMs) have shown powerful performance and development
prospects and are widely deployed in the real world. However, LLMs can capture
social biases from unprocessed training data and propagate the biases to
downstream tasks. Unfair LLM systems have undesirable social impacts and
potential harms. In this paper, we provide a comprehensive review of related
research on fairness in LLMs. Considering the influence of parameter magnitude
and training paradigm on research strategy, we divide existing fairness
research into work oriented to medium-sized LLMs under pre-training and
fine-tuning paradigms and work oriented to large-sized LLMs under prompting
paradigms. First,
for medium-sized LLMs, we introduce evaluation metrics and debiasing methods
from the perspectives of intrinsic bias and extrinsic bias, respectively. Then,
for large-sized LLMs, we introduce recent fairness research, including fairness
evaluation, reasons for bias, and debiasing methods. Finally, we discuss and
provide insight on the challenges and future directions for the development of
fairness in LLMs.
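As a concrete illustration of the intrinsic-bias evaluation the survey reviews for medium-sized LLMs, the Python sketch below scores a counterfactual sentence pair with a masked language model using a pseudo-log-likelihood; the model name, the sentence pair, and the scoring choice are illustrative assumptions, not a metric defined in the paper.

# Minimal sketch of a counterfactual intrinsic-bias probe. Assumptions: any
# HuggingFace masked LM works here, and the sentence pair is illustrative.
# A large score gap between the two sentences is read as evidence of
# representation-level (intrinsic) bias.
import torch
from transformers import AutoModelForMaskedLM, AutoTokenizer

MODEL_NAME = "bert-base-uncased"  # illustrative choice
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForMaskedLM.from_pretrained(MODEL_NAME).eval()

def pseudo_log_likelihood(sentence: str) -> float:
    """Mask each token in turn and sum the log-probability the model
    assigns to the original token at the masked position."""
    ids = tokenizer(sentence, return_tensors="pt")["input_ids"][0]
    total = 0.0
    for i in range(1, len(ids) - 1):          # skip [CLS] and [SEP]
        masked = ids.clone()
        masked[i] = tokenizer.mask_token_id
        with torch.no_grad():
            logits = model(masked.unsqueeze(0)).logits[0, i]
        total += torch.log_softmax(logits, dim=-1)[ids[i]].item()
    return total

# Counterfactual pair differing only in the gendered pronoun.
pair = ("The doctor said he would be late.",
        "The doctor said she would be late.")
print({s: round(pseudo_log_likelihood(s), 2) for s in pair})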
Related papers
- Justice or Prejudice? Quantifying Biases in LLM-as-a-Judge [84.34545223897578]
Despite their excellence in many domains, the potential issues of LLM judges remain under-explored, undermining their reliability and the scope of their utility.
We identify 12 key potential biases and propose a new automated bias quantification framework, CALM, which quantifies and analyzes each type of bias in LLM-as-a-Judge.
Our work highlights the need for stakeholders to address these issues and reminds users to exercise caution in LLM-as-a-Judge applications (a toy sketch of one such judge bias, position bias, appears after this list).
arXiv Detail & Related papers (2024-10-03T17:53:30Z) - A Multi-LLM Debiasing Framework [85.17156744155915]
Large Language Models (LLMs) are powerful tools with the potential to benefit society immensely, yet, they have demonstrated biases that perpetuate societal inequalities.
Recent research has shown a growing interest in multi-LLM approaches, which have been demonstrated to be effective in improving the quality of reasoning.
We propose a novel multi-LLM debiasing framework aimed at reducing bias in LLMs.
arXiv Detail & Related papers (2024-09-20T20:24:50Z) - Social Debiasing for Fair Multi-modal LLMs [55.8071045346024]
Multi-modal Large Language Models (MLLMs) have advanced significantly, offering powerful vision-language understanding capabilities.
However, these models often inherit severe social biases from their training datasets, leading to unfair predictions based on attributes like race and gender.
This paper addresses the issue of social biases in MLLMs by i) Introducing a comprehensive Counterfactual dataset with Multiple Social Concepts (CMSC) and ii) Proposing an Anti-Stereotype Debiasing strategy (ASD).
arXiv Detail & Related papers (2024-08-13T02:08:32Z) - Fairness in Large Language Models in Three Hours [2.443957114877221]
This tutorial provides a systematic overview of recent advances in the literature on fairness in large language models.
The concept of fairness in LLMs is then explored, summarizing the strategies for evaluating bias and the algorithms designed to promote fairness.
arXiv Detail & Related papers (2024-08-02T03:44:14Z) - Fairness in Large Language Models: A Taxonomic Survey [2.669847575321326]
Large Language Models (LLMs) have demonstrated remarkable success across various domains.
Despite their promising performance in numerous real-world applications, most of these algorithms lack fairness considerations.
arXiv Detail & Related papers (2024-03-31T22:22:53Z) - Exploring Value Biases: How LLMs Deviate Towards the Ideal [57.99044181599786]
Large Language Models (LLMs) are deployed in a wide range of applications, and their responses have an increasing social impact.
We show that value bias is strong in LLMs across different categories, similar to the results found in human studies.
arXiv Detail & Related papers (2024-02-16T18:28:43Z) - A Group Fairness Lens for Large Language Models [34.0579082699443]
Large language models can perpetuate biases and unfairness when deployed in social media contexts.
We propose evaluating LLM biases from a group fairness lens using a novel hierarchical schema characterizing diverse social groups.
We pioneer a novel chain-of-thought method GF-Think to mitigate biases of LLMs from a group fairness perspective.
arXiv Detail & Related papers (2023-12-24T13:25:15Z) - Selecting Shots for Demographic Fairness in Few-Shot Learning with Large Language Models [14.772568847965408]
We explore how shots, which directly affect model performance, influence the fairness of large language models (LLMs) used as NLP classification systems.
We consider how different shot selection strategies, both existing and new demographically sensitive methods, affect model fairness across three standard fairness datasets (a balanced-sampling sketch of one such strategy appears after this list).
arXiv Detail & Related papers (2023-11-14T19:02:03Z) - Bias and Fairness in Large Language Models: A Survey [73.87651986156006]
We present a comprehensive survey of bias evaluation and mitigation techniques for large language models (LLMs).
We first consolidate, formalize, and expand notions of social bias and fairness in natural language processing.
We then unify the literature by proposing three intuitive taxonomies, two for bias evaluation and one for mitigation.
arXiv Detail & Related papers (2023-09-02T00:32:55Z) - Aligning Large Language Models with Human: A Survey [53.6014921995006]
Large Language Models (LLMs) trained on extensive textual corpora have emerged as leading solutions for a broad array of Natural Language Processing (NLP) tasks.
Despite their notable performance, these models are prone to certain limitations such as misunderstanding human instructions, generating potentially biased content, or producing factually incorrect information.
This survey presents a comprehensive overview of these alignment technologies.
arXiv Detail & Related papers (2023-07-24T17:44:58Z)
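To make concrete the kind of judge bias that quantification frameworks such as CALM target, the Python sketch below estimates position bias in an LLM-as-a-Judge setting by swapping the order of two candidate answers and counting verdict flips; the judge callable and the flip-rate metric are assumptions for illustration, not the CALM framework itself.

# Hedged sketch: estimate position bias of an LLM judge. `judge` is a
# hypothetical wrapper around an LLM API that returns "first" or "second";
# it is not part of CALM.
from typing import Callable, Iterable, Tuple

def position_bias_rate(
    pairs: Iterable[Tuple[str, str, str]],
    judge: Callable[[str, str, str], str],
) -> float:
    """pairs: (question, answer_a, answer_b) triples.
    Returns the fraction of triples whose winner depends on answer order."""
    flips, total = 0, 0
    for question, a, b in pairs:
        verdict_ab = judge(question, a, b)    # answer A shown first
        verdict_ba = judge(question, b, a)    # answer B shown first
        consistent = (verdict_ab == "first") == (verdict_ba == "second")
        flips += 0 if consistent else 1
        total += 1
    return flips / max(total, 1)

A rate near zero suggests order-invariant judging; values well above zero flag a judge whose verdicts depend on where an answer is placed.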
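For the shot-selection study above, one simple demographically sensitive strategy is to spread the few-shot demonstrations evenly across demographic groups; the sketch below is an illustrative assumption and not one of the selection methods evaluated in that paper.

# Hedged sketch: round-robin selection of k few-shot demonstrations so that
# demographic groups are represented as evenly as possible. The 'group' field
# and the pool structure are illustrative assumptions.
import random
from collections import defaultdict

def balanced_shots(pool, k, group_key="group", seed=0):
    """pool: list of dicts with 'text', 'label', and a demographic group
    field. Returns up to k examples spread evenly across groups."""
    rng = random.Random(seed)
    by_group = defaultdict(list)
    for example in pool:
        by_group[example[group_key]].append(example)
    for examples in by_group.values():
        rng.shuffle(examples)
    shots = []
    while len(shots) < k and any(by_group.values()):
        for group in list(by_group):
            if by_group[group] and len(shots) < k:
                shots.append(by_group[group].pop())
    return shots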