Aligning with Whom? Large Language Models Have Gender and Racial Biases
in Subjective NLP Tasks
- URL: http://arxiv.org/abs/2311.09730v1
- Date: Thu, 16 Nov 2023 10:02:24 GMT
- Title: Aligning with Whom? Large Language Models Have Gender and Racial Biases
in Subjective NLP Tasks
- Authors: Huaman Sun, Jiaxin Pei, Minje Choi, David Jurgens
- Abstract summary: We conduct experiments on four popular large language models (LLMs) to investigate their capability to understand group differences and potential biases in their predictions for politeness and offensiveness.
We find that for both tasks, model predictions are closer to the labels from White and female participants.
More specifically, when prompted to respond from the perspective of "Black" and "Asian" individuals, models perform worse at predicting both the overall scores and the scores from the corresponding groups.
- Score: 15.015148115215315
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Human perception of language depends on personal backgrounds like gender and
ethnicity. While existing studies have shown that large language models (LLMs)
hold values that are closer to certain societal groups, it is unclear whether
their prediction behaviors on subjective NLP tasks also exhibit a similar bias.
In this study, leveraging the POPQUORN dataset which contains annotations of
diverse demographic backgrounds, we conduct a series of experiments on four
popular LLMs to investigate their capability to understand group differences
and potential biases in their predictions for politeness and offensiveness. We
find that for both tasks, model predictions are closer to the labels from White
and female participants. We further explore prompting with the target
demographic labels and show that including the target demographic in the prompt
actually worsens the model's performance. More specifically, when prompted to
respond from the perspective of "Black" and "Asian" individuals, models perform
worse at predicting both the overall scores and the scores from the
corresponding groups. Our results suggest that LLMs hold gender and
racial biases for subjective NLP tasks and that demographic-infused prompts
alone may be insufficient to mitigate such effects. Code and data are available
at https://github.com/Jiaxin-Pei/LLM-Group-Bias.
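The authors' actual code and data are in the repository linked above; the snippet below is only a minimal sketch of the setup the abstract describes: building a demographic-infused rating prompt and measuring how close model predictions are to each demographic group's labels on POPQUORN-style annotations. The prompt template, the column names ("text", "offensiveness", and the group column), and the `query_llm` stub are illustrative assumptions, not the paper's implementation.

```python
# Minimal sketch (not the authors' released code): demographic-infused
# prompting and per-group evaluation against POPQUORN-style annotations.
from typing import Optional

import pandas as pd


def build_prompt(text: str, demographic: Optional[str] = None) -> str:
    """Compose a 1-5 offensiveness rating prompt, optionally asking the model
    to answer from the perspective of a target demographic group."""
    persona = f" Respond as a {demographic} person would." if demographic else ""
    return (
        "Rate how offensive the following comment is on a scale from 1 "
        f"(not offensive at all) to 5 (very offensive).{persona} "
        f"Answer with a single number.\n\nComment: {text}"
    )


def query_llm(prompt: str) -> float:
    """Placeholder for an actual LLM call (e.g., a chat-completion request);
    it should parse the model's reply into a numeric 1-5 score."""
    raise NotImplementedError("plug in a model client here")


def per_group_mae(annotations: pd.DataFrame, group_col: str,
                  prompt_demo: Optional[str] = None) -> pd.Series:
    """Mean absolute error between the model's score for each item and each
    demographic group's mean label for that item (lower = closer to the group)."""
    items = annotations["text"].drop_duplicates()
    preds = {t: query_llm(build_prompt(t, prompt_demo)) for t in items}
    group_means = (annotations
                   .groupby([group_col, "text"])["offensiveness"]
                   .mean()
                   .reset_index())
    group_means["abs_err"] = (group_means["text"].map(preds)
                              - group_means["offensiveness"]).abs()
    return group_means.groupby(group_col)["abs_err"].mean()
```

Lower error for one group than another (e.g., for White versus Black annotators) is the kind of gap the paper reports, and passing a value for `prompt_demo` mirrors the demographic-infused prompting comparison.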
Related papers
- Are Large Language Models Ready for Travel Planning? [6.307444995285539]
While large language models (LLMs) show promise in hospitality and tourism, their ability to provide unbiased service across demographic groups remains unclear.
This paper explores gender and ethnic biases when LLMs are utilized as travel planning assistants.
arXiv Detail & Related papers (2024-10-22T18:08:25Z) - Spoken Stereoset: On Evaluating Social Bias Toward Speaker in Speech Large Language Models [50.40276881893513]
This study introduces Spoken Stereoset, a dataset specifically designed to evaluate social biases in Speech Large Language Models (SLLMs).
By examining how different models respond to speech from diverse demographic groups, we aim to identify these biases.
The findings indicate that while most models show minimal bias, some still exhibit slightly stereotypical or anti-stereotypical tendencies.
arXiv Detail & Related papers (2024-08-14T16:55:06Z) - GenderBias-VL: Benchmarking Gender Bias in Vision Language Models via Counterfactual Probing [72.0343083866144]
This paper introduces the GenderBias-VL benchmark to evaluate occupation-related gender bias in Large Vision-Language Models.
Using our benchmark, we extensively evaluate 15 commonly used open-source LVLMs and state-of-the-art commercial APIs.
Our findings reveal widespread gender biases in existing LVLMs.
arXiv Detail & Related papers (2024-06-30T05:55:15Z) - The Devil is in the Neurons: Interpreting and Mitigating Social Biases in Pre-trained Language Models [78.69526166193236]
Pre-trained language models (PLMs) have been shown to contain harmful information, such as social biases.
We propose Social Bias Neurons to accurately pinpoint the units (i.e., neurons) in a language model that can be attributed to undesirable behavior, such as social bias.
As measured by prior metrics from StereoSet, our model achieves a higher degree of fairness while maintaining language modeling ability at low cost.
arXiv Detail & Related papers (2024-06-14T15:41:06Z) - Evaluating Gender Bias in Large Language Models via Chain-of-Thought
Prompting [87.30837365008931]
Large language models (LLMs) equipped with Chain-of-Thought (CoT) prompting are able to make accurate incremental predictions even on unscalable tasks.
This study examines the impact of LLMs' step-by-step predictions on gender bias in unscalable tasks.
arXiv Detail & Related papers (2024-01-28T06:50:10Z) - On the steerability of large language models toward data-driven personas [98.9138902560793]
Large language models (LLMs) are known to generate biased responses where the opinions of certain groups and populations are underrepresented.
Here, we present a novel approach to achieve controllable generation of specific viewpoints using LLMs.
arXiv Detail & Related papers (2023-11-08T19:01:13Z) - Investigating Subtler Biases in LLMs: Ageism, Beauty, Institutional, and Nationality Bias in Generative Models [0.0]
This paper investigates bias along less-studied but still consequential dimensions, such as age and beauty.
We ask whether LLMs hold wide-reaching biases of positive or negative sentiment toward specific social groups, similar to the "what is beautiful is good" bias that experimental psychology has documented in people.
arXiv Detail & Related papers (2023-09-16T07:07:04Z) - MultiModal Bias: Introducing a Framework for Stereotypical Bias
Assessment beyond Gender and Race in Vision Language Models [40.12132844347926]
We provide a visual and textual bias benchmark called MMBias, consisting of around 3,800 images and phrases covering 14 population subgroups.
We utilize this dataset to assess bias in several prominent self-supervised multimodal models, including CLIP, ALBEF, and ViLT.
We introduce a debiasing method designed specifically for such large pre-trained models that can be applied as a post-processing step to mitigate bias.
arXiv Detail & Related papers (2023-03-16T17:36:37Z) - Perturbation Augmentation for Fairer NLP [33.442601687940204]
Language models pre-trained on demographically perturbed corpora are fairer, at least according to our best metrics for measuring model fairness.
Although our findings appear promising, there are still some limitations, as well as outstanding questions about how best to evaluate the (un)fairness of large language models.
arXiv Detail & Related papers (2022-05-25T09:00:29Z) - A Survey on Bias and Fairness in Natural Language Processing [1.713291434132985]
We analyze the origins of biases, the definitions of fairness, and how bias can be mitigated in different subfields of NLP.
We discuss how future studies can work towards eradicating pernicious biases from NLP algorithms.
arXiv Detail & Related papers (2022-03-06T18:12:30Z) - Balancing Biases and Preserving Privacy on Balanced Faces in the Wild [50.915684171879036]
There are demographic biases present in current facial recognition (FR) models.
We introduce our Balanced Faces in the Wild dataset to measure these biases across different ethnic and gender subgroups.
We find that relying on a single score threshold to differentiate between genuine and imposter sample pairs leads to suboptimal results.
We propose a novel domain adaptation learning scheme that uses facial features extracted from state-of-the-art neural networks.
arXiv Detail & Related papers (2021-03-16T15:05:49Z)
This list is automatically generated from the titles and abstracts of the papers on this site.