Related papers: Marked Personas: Using Natural Language Prompts to Measure Stereotypes in Language Models

Marked Personas: Using Natural Language Prompts to Measure Stereotypes in Language Models

URL: http://arxiv.org/abs/2305.18189v1
Date: Mon, 29 May 2023 16:29:22 GMT
Title: Marked Personas: Using Natural Language Prompts to Measure Stereotypes in Language Models
Authors: Myra Cheng, Esin Durmus, Dan Jurafsky
Abstract summary: We present Marked Personas, a prompt-based method to measure stereotypes in large language models (LLMs) We find that portrayals generated by GPT-3.5 and GPT-4 contain higher rates of racial stereotypes than human-written portrayals using the same prompts. An intersectional lens reveals tropes that dominate portrayals of marginalized groups, such as tropicalism and the hypersexualization of minoritized women.
Score: 33.157279170602784
License: http://creativecommons.org/licenses/by/4.0/
Abstract: To recognize and mitigate harms from large language models (LLMs), we need to understand the prevalence and nuances of stereotypes in LLM outputs. Toward this end, we present Marked Personas, a prompt-based method to measure stereotypes in LLMs for intersectional demographic groups without any lexicon or data labeling. Grounded in the sociolinguistic concept of markedness (which characterizes explicitly linguistically marked categories versus unmarked defaults), our proposed method is twofold: 1) prompting an LLM to generate personas, i.e., natural language descriptions, of the target demographic group alongside personas of unmarked, default groups; 2) identifying the words that significantly distinguish personas of the target group from corresponding unmarked ones. We find that the portrayals generated by GPT-3.5 and GPT-4 contain higher rates of racial stereotypes than human-written portrayals using the same prompts. The words distinguishing personas of marked (non-white, non-male) groups reflect patterns of othering and exoticizing these demographics. An intersectional lens further reveals tropes that dominate portrayals of marginalized groups, such as tropicalism and the hypersexualization of minoritized women. These representational harms have concerning implications for downstream applications like story generation.

Related papers

Large Language Models Reflect the Ideology of their Creators [73.25935570218375]
Large language models (LLMs) are trained on vast amounts of data to generate natural language. We uncover notable diversity in the ideological stance exhibited across different LLMs and languages.
arXiv Detail & Related papers (2024-10-24T04:02:30Z)
Are Large Language Models Ready for Travel Planning? [6.307444995285539]
Large language models (LLMs) show promise in hospitality and tourism, their ability to provide unbiased service across demographic groups remains unclear. This paper explores gender and ethnic biases when LLMs are utilized as travel planning assistants.
arXiv Detail & Related papers (2024-10-22T18:08:25Z)
Which Demographics do LLMs Default to During Annotation? [9.190535758368567]
Two research directions developed in the context of using large language models (LLM) for data annotations. We evaluate which attributes of human annotators LLMs inherently mimic. We observe notable influences related to gender, race, and age in demographic prompting.
arXiv Detail & Related papers (2024-10-11T14:02:42Z)
Hate Personified: Investigating the role of LLMs in content moderation [64.26243779985393]
For subjective tasks such as hate detection, where people perceive hate differently, the Large Language Model's (LLM) ability to represent diverse groups is unclear. By including additional context in prompts, we analyze LLM's sensitivity to geographical priming, persona attributes, and numerical information to assess how well the needs of various groups are reflected.
arXiv Detail & Related papers (2024-10-03T16:43:17Z)
White Men Lead, Black Women Help? Benchmarking Language Agency Social Biases in LLMs [58.27353205269664]
Social biases can manifest in language agency. We introduce the novel Language Agency Bias Evaluation benchmark. We unveil language agency social biases in 3 recent Large Language Model (LLM)-generated content.
arXiv Detail & Related papers (2024-04-16T12:27:54Z)
Laissez-Faire Harms: Algorithmic Biases in Generative Language Models [0.0]
We show that synthetically generated texts from five of the most pervasive LMs perpetuate harms of omission, subordination, and stereotyping for minoritized individuals. We find widespread evidence of bias to an extent that such individuals are hundreds to thousands of times more likely to encounter LM-generated outputs. Our findings highlight the urgent need to protect consumers from discriminatory harms caused by language models.
arXiv Detail & Related papers (2024-04-11T05:09:03Z)
Aligning with Whom? Large Language Models Have Gender and Racial Biases in Subjective NLP Tasks [15.015148115215315]
We conduct experiments on four popular large language models (LLMs) to investigate their capability to understand group differences and potential biases in their predictions for politeness and offensiveness. We find that for both tasks, model predictions are closer to the labels from White and female participants. More specifically, when being prompted to respond from the perspective of "Black" and "Asian" individuals, models show lower performance in predicting both overall scores as well as the scores from corresponding groups.
arXiv Detail & Related papers (2023-11-16T10:02:24Z)
Probing Explicit and Implicit Gender Bias through LLM Conditional Text Generation [64.79319733514266]
Large Language Models (LLMs) can generate biased and toxic responses. We propose a conditional text generation mechanism without the need for predefined gender phrases and stereotypes.
arXiv Detail & Related papers (2023-11-01T05:31:46Z)
Queer People are People First: Deconstructing Sexual Identity Stereotypes in Large Language Models [3.974379576408554]
Large Language Models (LLMs) are trained primarily on minimally processed web text. LLMs can inadvertently perpetuate stereotypes towards marginalized groups, like the LGBTQIA+ community.
arXiv Detail & Related papers (2023-06-30T19:39:01Z)
Easily Accessible Text-to-Image Generation Amplifies Demographic Stereotypes at Large Scale [61.555788332182395]
We investigate the potential for machine learning models to amplify dangerous and complex stereotypes. We find a broad range of ordinary prompts produce stereotypes, including prompts simply mentioning traits, descriptors, occupations, or objects.
arXiv Detail & Related papers (2022-11-07T18:31:07Z)

This list is automatically generated from the titles and abstracts of the papers in this site.

This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.