"Kelly is a Warm Person, Joseph is a Role Model": Gender Biases in
LLM-Generated Reference Letters
- URL: http://arxiv.org/abs/2310.09219v5
- Date: Fri, 1 Dec 2023 19:21:20 GMT
- Title: "Kelly is a Warm Person, Joseph is a Role Model": Gender Biases in
LLM-Generated Reference Letters
- Authors: Yixin Wan, George Pu, Jiao Sun, Aparna Garimella, Kai-Wei Chang,
Nanyun Peng
- Abstract summary: Large Language Models (LLMs) have recently emerged as an effective tool to assist individuals in writing various types of content.
This paper critically examines gender biases in LLM-generated reference letters.
- Score: 97.11173801187816
- License: http://creativecommons.org/publicdomain/zero/1.0/
- Abstract: Large Language Models (LLMs) have recently emerged as an effective tool to
assist individuals in writing various types of content, including professional
documents such as recommendation letters. Though bringing convenience, this
application also introduces unprecedented fairness concerns. Model-generated
reference letters might be directly used by users in professional scenarios. If
underlying biases exist in these model-constructed letters, using them without
scrutiny could lead to direct societal harms, such as sabotaging
application success rates for female applicants. In light of this pressing
issue, it is imperative to comprehensively study fairness issues
and associated harms in this real-world use case. In this paper, we critically
examine gender biases in LLM-generated reference letters. Drawing inspiration
from social science findings, we design evaluation methods to surface biases
along two dimensions: (1) biases in language style and (2) biases in lexical
content. We further investigate the extent of bias propagation by analyzing the
hallucination bias of models, a term we define as bias exacerbation in
model-hallucinated content. Through a benchmarking evaluation of two popular
LLMs, ChatGPT and Alpaca, we reveal significant gender biases in LLM-generated
recommendation letters. Our findings not only warn against using LLMs for this
application without scrutiny, but also illuminate the importance of
thoroughly studying hidden biases and harms in LLM-generated professional
documents.
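To make the lexical-content dimension concrete, the following Python sketch compares how often warmth-coded versus agency-coded words appear in letters generated for female versus male names, using a smoothed odds ratio. The word lists, the statistic, and the example letters are illustrative assumptions, not the paper's released evaluation code.

```python
# Illustrative sketch only: a minimal lexical-content bias check in the spirit of the
# paper's second evaluation dimension. The tiny trait lexicons and the smoothed
# odds-ratio statistic are assumptions for demonstration, not the authors' exact setup.
import re

# Hypothetical warmth-coded (communal) and agency-coded lexicons.
WARMTH_WORDS = {"warm", "kind", "caring", "supportive", "pleasant"}
AGENTIC_WORDS = {"leader", "driven", "ambitious", "confident", "accomplished"}

def count_hits(letters, lexicon):
    """Return (lexicon hits, total tokens) across a list of generated letters."""
    hits, total = 0, 0
    for letter in letters:
        tokens = re.findall(r"[a-z']+", letter.lower())
        total += len(tokens)
        hits += sum(1 for t in tokens if t in lexicon)
    return hits, total

def odds_ratio(female_letters, male_letters, lexicon):
    """Smoothed odds ratio of lexicon usage in female- vs. male-prompted letters.
    Values above 1 mean the trait words are relatively more frequent for female names."""
    f_hit, f_tot = count_hits(female_letters, lexicon)
    m_hit, m_tot = count_hits(male_letters, lexicon)
    # Add-0.5 smoothing avoids division by zero on small samples.
    return ((f_hit + 0.5) / (f_tot - f_hit + 0.5)) / ((m_hit + 0.5) / (m_tot - m_hit + 0.5))

# Usage with placeholder letters standing in for model outputs:
female_letters = ["Kelly is a warm, kind, and supportive person who is pleasant to work with."]
male_letters = ["Joseph is a driven, confident leader and an accomplished role model."]
print("warmth OR:", round(odds_ratio(female_letters, male_letters, WARMTH_WORDS), 2))
print("agentic OR:", round(odds_ratio(female_letters, male_letters, AGENTIC_WORDS), 2))
```

In practice, such a check would be run over large batches of model-generated letters and paired with the language-style analysis the paper describes.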
Related papers
- Inclusivity in Large Language Models: Personality Traits and Gender Bias in Scientific Abstracts [49.97673761305336]
We evaluate three large language models (LLMs) for their alignment with human narrative styles and potential gender biases.
Our findings indicate that, while these models generally produce text closely resembling human authored content, variations in stylistic features suggest significant gender biases.
arXiv Detail & Related papers (2024-06-27T19:26:11Z) - White Men Lead, Black Women Help? Benchmarking Language Agency Social Biases in LLMs [58.27353205269664]
Social biases can manifest in language agency.
We introduce the novel Language Agency Bias Evaluation benchmark.
We unveil language agency social biases in content generated by three recent Large Language Models (LLMs).
arXiv Detail & Related papers (2024-04-16T12:27:54Z) - Whose Side Are You On? Investigating the Political Stance of Large Language Models [56.883423489203786]
We investigate the political orientation of Large Language Models (LLMs) across a spectrum of eight polarizing topics, spanning from abortion to LGBTQ issues.
The findings suggest that users should be mindful when crafting queries, and exercise caution in selecting neutral prompt language.
arXiv Detail & Related papers (2024-03-15T04:02:24Z) - Disclosure and Mitigation of Gender Bias in LLMs [64.79319733514266]
Large Language Models (LLMs) can generate biased responses.
We propose an indirect probing framework based on conditional generation.
We explore three distinct strategies to disclose explicit and implicit gender bias in LLMs.
arXiv Detail & Related papers (2024-02-17T04:48:55Z) - Probing Explicit and Implicit Gender Bias through LLM Conditional Text
Generation [64.79319733514266]
Large Language Models (LLMs) can generate biased and toxic responses.
We propose a conditional text generation mechanism without the need for predefined gender phrases and stereotypes.
arXiv Detail & Related papers (2023-11-01T05:31:46Z) - Using Large Language Models for Qualitative Analysis can Introduce
Serious Bias [0.09208007322096534]
Large Language Models (LLMs) are quickly becoming ubiquitous, but the implications for social science research are not yet well understood.
This paper asks whether LLMs can help us analyse large-N qualitative data from open-ended interviews, with an application to transcripts of interviews with Rohingya refugees in Cox's Bazaar, Bangladesh.
We find that a great deal of caution is needed in using LLMs to annotate text as there is a risk of introducing biases that can lead to misleading inferences.
arXiv Detail & Related papers (2023-09-29T11:19:15Z) - Gender bias and stereotypes in Large Language Models [0.6882042556551611]
This paper investigates Large Language Models' behavior with respect to gender stereotypes.
We use a simple paradigm to test the presence of gender bias, building on but differing from WinoBias.
Our contributions in this paper are as follows: (a) LLMs are 3-6 times more likely to choose an occupation that stereotypically aligns with a person's gender; (b) these choices align with people's perceptions better than with the ground truth as reflected in official job statistics; (c) LLMs ignore crucial ambiguities in sentence structure 95% of the time in our study items, but when explicitly prompted, they recognize the ambiguity (see the illustrative probe sketched after this list).
arXiv Detail & Related papers (2023-08-28T22:32:05Z) - The Unequal Opportunities of Large Language Models: Revealing
Demographic Bias through Job Recommendations [5.898806397015801]
We propose a simple method for analyzing and comparing demographic bias in Large Language Models (LLMs).
We demonstrate the effectiveness of our method by measuring intersectional biases within ChatGPT and LLaMA.
We identify distinct biases in both models toward various demographic identities; for example, both models consistently suggest low-paying jobs for Mexican workers.
arXiv Detail & Related papers (2023-08-03T21:12:54Z) - Queer People are People First: Deconstructing Sexual Identity
Stereotypes in Large Language Models [3.974379576408554]
Large Language Models (LLMs) are trained primarily on minimally processed web text.
LLMs can inadvertently perpetuate stereotypes towards marginalized groups, like the LGBTQIA+ community.
arXiv Detail & Related papers (2023-06-30T19:39:01Z) - Causally Testing Gender Bias in LLMs: A Case Study on Occupational Bias [33.99768156365231]
We introduce a causal formulation for bias measurement in generative language models.
We propose a benchmark called OccuGender, with a bias-measuring procedure to investigate occupational gender bias.
The results show that these models exhibit substantial occupational gender bias.
arXiv Detail & Related papers (2022-12-20T22:41:24Z)
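As referenced in the entry on gender bias and stereotypes above, the sketch below illustrates what a WinoBias-style occupation-choice probe can look like. The prompt template, occupation pairs, and `query_llm` wrapper are hypothetical placeholders rather than the cited paper's actual materials.

```python
# Illustrative sketch only: a minimal occupation-choice probe in the spirit of the
# WinoBias-style paradigm summarized above. All prompts, pairs, and helpers here
# are hypothetical stand-ins, not the cited paper's released study items.
from typing import Callable

TEMPLATE = ("In the sentence, 'The {occ1} talked to the {occ2} because {pronoun} "
            "was running late', who does '{pronoun}' refer to?")

# Hypothetical occupation pairs: occ1 is stereotypically male-coded, occ2 female-coded.
OCCUPATION_PAIRS = [("doctor", "nurse"), ("engineer", "secretary")]
PRONOUNS = ["he", "she"]

def stereotype_rate(query_llm: Callable[[str], str]) -> float:
    """Fraction of ambiguous prompts where the model's answer names the occupation
    stereotypically associated with the pronoun's gender."""
    stereotyped, total = 0, 0
    for occ1, occ2 in OCCUPATION_PAIRS:
        for pronoun in PRONOUNS:
            prompt = TEMPLATE.format(occ1=occ1, occ2=occ2, pronoun=pronoun)
            answer = query_llm(prompt).lower()
            expected_stereotype = occ1 if pronoun == "he" else occ2
            stereotyped += int(expected_stereotype in answer)
            total += 1
    return stereotyped / total

# Usage: plug in any chat-completion wrapper, e.g.
# rate = stereotype_rate(lambda prompt: my_chat_model(prompt))
```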