Theory-Grounded Measurement of U.S. Social Stereotypes in English Language Models
- URL: http://arxiv.org/abs/2206.11684v1
- Date: Thu, 23 Jun 2022 13:22:24 GMT
- Title: Theory-Grounded Measurement of U.S. Social Stereotypes in English Language Models
- Authors: Yang Trista Cao, Anna Sotnikova, Hal Daumé III, Rachel Rudinger, Linda Zou
- Abstract summary: We adapt the Agency-Belief-Communion stereotype model as a framework for the systematic study and discovery of stereotypic group-trait associations in language models (LMs).
We introduce the sensitivity test (SeT) for measuring stereotypical associations from language models.
We collect group-trait judgments from U.S.-based subjects to compare with English LM stereotypes.
- Score: 12.475204687181067
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: NLP models trained on text have been shown to reproduce human stereotypes,
which can magnify harms to marginalized groups when systems are deployed at
scale. We adapt the Agency-Belief-Communion (ABC) stereotype model of Koch et
al. (2016) from social psychology as a framework for the systematic study and
discovery of stereotypic group-trait associations in language models (LMs). We
introduce the sensitivity test (SeT) for measuring stereotypical associations
from language models. To evaluate SeT and other measures using the ABC model,
we collect group-trait judgments from U.S.-based subjects to compare with
English LM stereotypes. Finally, we extend this framework to measure LM
stereotyping of intersectional identities.
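As a rough illustration, here is a minimal sketch of template-based group-trait probing with a masked LM, in the spirit of (but not identical to) the paper's sensitivity test; the model, template, group terms, and ABC-style trait words are illustrative assumptions:

```python
# Minimal sketch: score group-trait associations with a masked LM.
# NOT the paper's exact SeT procedure; the template, groups, and
# traits below are illustrative assumptions.
import torch
from transformers import AutoModelForMaskedLM, AutoTokenizer

MODEL = "bert-base-uncased"
tokenizer = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForMaskedLM.from_pretrained(MODEL)
model.eval()

GROUPS = ["doctors", "immigrants"]             # hypothetical group terms
TRAITS = ["confident", "traditional", "warm"]  # ABC-style traits (single wordpieces)

def trait_logprob(group: str, trait: str) -> float:
    """Log-probability the MLM assigns to `trait` at the masked slot."""
    text = f"Most {group} are {tokenizer.mask_token}."
    inputs = tokenizer(text, return_tensors="pt")
    mask_pos = (inputs.input_ids == tokenizer.mask_token_id).nonzero()[0, 1]
    with torch.no_grad():
        logits = model(**inputs).logits[0, mask_pos]
    log_probs = torch.log_softmax(logits, dim=-1)
    return log_probs[tokenizer.convert_tokens_to_ids(trait)].item()

for g in GROUPS:
    print(g, {t: round(trait_logprob(g, t), 3) for t in TRAITS})
```

Comparing such scores across groups, rather than reading any single score in isolation, is what makes the association relative; that is the general idea a sensitivity-style measure builds on.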
Related papers
- Who is better at math, Jenny or Jingzhen? Uncovering Stereotypes in Large Language Models [9.734705470760511]
We use GlobalBias to study a broad set of stereotypes from around the world.
We generate character profiles based on given names and evaluate the prevalence of stereotypes in model outputs.
arXiv Detail & Related papers (2024-07-09T14:52:52Z)
- The Devil is in the Neurons: Interpreting and Mitigating Social Biases in Pre-trained Language Models [78.69526166193236]
Pre-trained language models (PLMs) have been acknowledged to contain harmful information, such as social biases.
We propose Social Bias Neurons to accurately pinpoint the units (i.e., neurons) in a language model that can be attributed to undesirable behavior, such as social bias.
As measured by prior metrics from StereoSet, our model achieves a higher degree of fairness while maintaining language modeling ability with low cost.
arXiv Detail & Related papers (2024-06-14T15:41:06Z)
- White Men Lead, Black Women Help? Benchmarking Language Agency Social Biases in LLMs [58.27353205269664]
Language agency is an important aspect of evaluating social biases in texts.
Previous research often relies on string-matching techniques to identify agentic and communal words.
We introduce the novel Language Agency Bias Evaluation benchmark.
arXiv Detail & Related papers (2024-04-16T12:27:54Z)
- Social Bias Probing: Fairness Benchmarking for Language Models [38.180696489079985]
This paper proposes a novel framework for probing language models for social biases by assessing disparate treatment.
We curate SOFA, a large-scale benchmark designed to address the limitations of existing fairness collections.
Comparing our methodology with existing benchmarks, we reveal that biases within language models are more nuanced than acknowledged.
arXiv Detail & Related papers (2023-11-15T16:35:59Z)
- StereoMap: Quantifying the Awareness of Human-like Stereotypes in Large Language Models [11.218531873222398]
Large Language Models (LLMs) have been observed to encode and perpetuate harmful associations present in the training data.
We propose a theoretically grounded framework called StereoMap to gain insights into LLMs' perceptions of how demographic groups have been viewed by society.
arXiv Detail & Related papers (2023-10-20T17:22:30Z)
- Easily Accessible Text-to-Image Generation Amplifies Demographic Stereotypes at Large Scale [61.555788332182395]
We investigate the potential for machine learning models to amplify dangerous and complex stereotypes.
We find a broad range of ordinary prompts produce stereotypes, including prompts simply mentioning traits, descriptors, occupations, or objects.
arXiv Detail & Related papers (2022-11-07T18:31:07Z)
- Towards Understanding and Mitigating Social Biases in Language Models [107.82654101403264]
Large-scale pretrained language models (LMs) risk manifesting undesirable representational biases.
We propose steps towards mitigating social biases during text generation.
Our empirical results and human evaluation demonstrate effectiveness in mitigating bias while retaining crucial contextual information.
arXiv Detail & Related papers (2021-06-24T17:52:43Z)
- Understanding and Countering Stereotypes: A Computational Approach to the Stereotype Content Model [4.916009028580767]
We present a computational approach to interpreting stereotypes in text through the Stereotype Content Model (SCM).
The SCM proposes that stereotypes can be understood along two primary dimensions: warmth and competence.
It is known that countering stereotypes with anti-stereotypical examples is one of the most effective ways to reduce biased thinking.
arXiv Detail & Related papers (2021-06-04T16:53:37Z)
- How True is GPT-2? An Empirical Analysis of Intersectional Occupational Biases [50.591267188664666]
Downstream applications are at risk of inheriting biases contained in natural language models.
We analyze the occupational biases of a popular generative language model, GPT-2.
For a given job, GPT-2 reflects the societal skew of gender and ethnicity in the US and, in some cases, pulls the distribution towards gender parity (a generative probing sketch in this spirit appears after this list).
arXiv Detail & Related papers (2021-02-08T11:10:27Z)
- CrowS-Pairs: A Challenge Dataset for Measuring Social Biases in Masked Language Models [30.582132471411263]
We introduce the Crowdsourced Stereotype Pairs benchmark (CrowS-Pairs).
CrowS-Pairs has 1,508 examples that cover stereotypes dealing with nine types of bias, such as race, religion, and age.
We find that all three of the widely used masked language models we evaluate substantially favor stereotypical sentences in every category in CrowS-Pairs (see the pseudo-log-likelihood sketch after this list).
arXiv Detail & Related papers (2020-09-30T22:38:40Z)
- Gender Stereotype Reinforcement: Measuring the Gender Bias Conveyed by Ranking Algorithms [68.85295025020942]
We propose the Gender Stereotype Reinforcement (GSR) measure, which quantifies the tendency of a search engine to support gender stereotypes.
GSR is the first measure specifically tailored to Information Retrieval that is capable of quantifying representational harms.
arXiv Detail & Related papers (2020-09-02T20:45:04Z)
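For the GPT-2 occupational-bias entry above, here is a rough sketch of generative probing: prompt the model with a job template and tally gendered words in sampled continuations. The template, word lists, and sample size are illustrative assumptions, not the paper's exact protocol.

```python
# Sketch: sample continuations of a job prompt and tally gendered words.
# Template and word lists are illustrative, not the paper's protocol.
from collections import Counter
from transformers import pipeline, set_seed

set_seed(0)
generator = pipeline("text-generation", model="gpt2")

FEMALE = {"she", "her", "woman", "female"}
MALE = {"he", "his", "him", "man", "male"}

def gender_counts(job: str, n: int = 50) -> Counter:
    """Count continuations containing female/male words for a job prompt."""
    outs = generator(f"The {job} was a", max_new_tokens=10,
                     num_return_sequences=n, do_sample=True,
                     pad_token_id=generator.tokenizer.eos_token_id)
    counts = Counter()
    for o in outs:
        tokens = {t.strip(".,") for t in o["generated_text"].lower().split()}
        counts["female"] += bool(tokens & FEMALE)
        counts["male"] += bool(tokens & MALE)
    return counts

print(gender_counts("nurse"))     # a skew toward "female" words would mirror
print(gender_counts("engineer"))  # the societal skew the paper reports
```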
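For the CrowS-Pairs entry above, here is a minimal sketch of pseudo-log-likelihood (PLL) scoring: mask each token of a sentence in turn, sum the MLM's log-probabilities, and compare a stereotypical sentence against its anti-stereotypical counterpart. The sentence pair is invented for illustration, and the real CrowS-Pairs metric conditions only on the tokens the two sentences share.

```python
# Sketch: pseudo-log-likelihood comparison of a minimally different
# sentence pair under an MLM. Illustrative pair; the actual CrowS-Pairs
# metric scores only the tokens shared by both sentences.
import torch
from transformers import AutoModelForMaskedLM, AutoTokenizer

MODEL = "bert-base-uncased"
tokenizer = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForMaskedLM.from_pretrained(MODEL)
model.eval()

def pseudo_log_likelihood(sentence: str) -> float:
    """Sum of log P(token | rest), masking one position at a time."""
    ids = tokenizer(sentence, return_tensors="pt").input_ids[0]
    total = 0.0
    for i in range(1, len(ids) - 1):  # skip [CLS] and [SEP]
        masked = ids.clone()
        masked[i] = tokenizer.mask_token_id
        with torch.no_grad():
            logits = model(masked.unsqueeze(0)).logits[0, i]
        total += torch.log_softmax(logits, dim=-1)[ids[i]].item()
    return total

stereo = "Old people cannot use computers."    # invented example pair
anti = "Young people cannot use computers."
print("stereo PLL:", pseudo_log_likelihood(stereo))
print("anti   PLL:", pseudo_log_likelihood(anti))
# The model "favors" the stereotype when the first score is higher.
```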