Related papers: StereoMap: Quantifying the Awareness of Human-like Stereotypes in Large Language Models

StereoMap: Quantifying the Awareness of Human-like Stereotypes in Large Language Models

URL: http://arxiv.org/abs/2310.13673v2
Date: Tue, 31 Oct 2023 16:41:31 GMT
Title: StereoMap: Quantifying the Awareness of Human-like Stereotypes in Large Language Models
Authors: Sullam Jeoung, Yubin Ge, Jana Diesner
Abstract summary: Large Language Models (LLMs) have been observed to encode and perpetuate harmful associations present in the training data. We propose a theoretically grounded framework called StereoMap to gain insights into their perceptions of how demographic groups have been viewed by society.
Score: 11.218531873222398
License: http://creativecommons.org/licenses/by-nc-sa/4.0/
Abstract: Large Language Models (LLMs) have been observed to encode and perpetuate harmful associations present in the training data. We propose a theoretically grounded framework called StereoMap to gain insights into their perceptions of how demographic groups have been viewed by society. The framework is grounded in the Stereotype Content Model (SCM); a well-established theory from psychology. According to SCM, stereotypes are not all alike. Instead, the dimensions of Warmth and Competence serve as the factors that delineate the nature of stereotypes. Based on the SCM theory, StereoMap maps LLMs' perceptions of social groups (defined by socio-demographic features) using the dimensions of Warmth and Competence. Furthermore, the framework enables the investigation of keywords and verbalizations of reasoning of LLMs' judgments to uncover underlying factors influencing their perceptions. Our results show that LLMs exhibit a diverse range of perceptions towards these groups, characterized by mixed evaluations along the dimensions of Warmth and Competence. Furthermore, analyzing the reasonings of LLMs, our findings indicate that LLMs demonstrate an awareness of social disparities, often stating statistical data and research findings to support their reasoning. This study contributes to the understanding of how LLMs perceive and represent social groups, shedding light on their potential biases and the perpetuation of harmful associations.

Related papers

Fairness Mediator: Neutralize Stereotype Associations to Mitigate Bias in Large Language Models [66.5536396328527]
LLMs inadvertently absorb spurious correlations from training data, leading to stereotype associations between biased concepts and specific social groups. We propose Fairness Mediator (FairMed), a bias mitigation framework that neutralizes stereotype associations. Our framework comprises two main components: a stereotype association prober and an adversarial debiasing neutralizer.
arXiv Detail & Related papers (2025-04-10T14:23:06Z)
Large Language Models Reflect the Ideology of their Creators [73.25935570218375]
Large language models (LLMs) are trained on vast amounts of data to generate natural language. We uncover notable diversity in the ideological stance exhibited across different LLMs and languages.
arXiv Detail & Related papers (2024-10-24T04:02:30Z)
Hate Personified: Investigating the role of LLMs in content moderation [64.26243779985393]
For subjective tasks such as hate detection, where people perceive hate differently, the Large Language Model's (LLM) ability to represent diverse groups is unclear. By including additional context in prompts, we analyze LLM's sensitivity to geographical priming, persona attributes, and numerical information to assess how well the needs of various groups are reflected.
arXiv Detail & Related papers (2024-10-03T16:43:17Z)
Are Social Sentiments Inherent in LLMs? An Empirical Study on Extraction of Inter-demographic Sentiments [14.143299702954023]
This study focuses on social groups defined in terms of nationality, religion, and race/ethnicity. We input questions regarding sentiments from one group to another into LLMs, apply sentiment analysis to the responses, and compare the results with social surveys.
arXiv Detail & Related papers (2024-08-08T08:13:25Z)
A Taxonomy of Stereotype Content in Large Language Models [4.4212441764241]
This study introduces a taxonomy of stereotype content in contemporary large language models (LLMs) We identify 14 stereotype dimensions (e.g., Morality, Ability, Health, Beliefs, Emotions) accounting for 90% of LLM stereotype associations. Our findings suggest that high-dimensional human stereotypes are reflected in LLMs and must be considered in AI auditing and debiasing to minimize unidentified harms.
arXiv Detail & Related papers (2024-07-31T21:14:41Z)
How Are LLMs Mitigating Stereotyping Harms? Learning from Search Engine Studies [0.0]
Commercial model development has focused efforts on'safety' training concerning legal liabilities at the expense of social impact evaluation. This mimics a similar trend which we could observe for search engine autocompletion some years prior. We present a novel evaluation task in the style of autocompletion prompts to assess stereotyping in LLMs.
arXiv Detail & Related papers (2024-07-16T14:04:35Z)
Quantifying AI Psychology: A Psychometrics Benchmark for Large Language Models [57.518784855080334]
Large Language Models (LLMs) have demonstrated exceptional task-solving capabilities, increasingly adopting roles akin to human-like assistants. This paper presents a framework for investigating psychology dimension in LLMs, including psychological identification, assessment dataset curation, and assessment with results validation. We introduce a comprehensive psychometrics benchmark for LLMs that covers six psychological dimensions: personality, values, emotion, theory of mind, motivation, and intelligence.
arXiv Detail & Related papers (2024-06-25T16:09:08Z)
Ask LLMs Directly, "What shapes your bias?": Measuring Social Bias in Large Language Models [11.132360309354782]
Social bias is shaped by the accumulation of social perceptions towards targets across various demographic identities. We propose a novel strategy to intuitively quantify social perceptions and suggest metrics that can evaluate the social biases within large language models.
arXiv Detail & Related papers (2024-06-06T13:32:09Z)
Do LLMs exhibit human-like response biases? A case study in survey design [66.1850490474361]
We investigate the extent to which large language models (LLMs) reflect human response biases, if at all. We design a dataset and framework to evaluate whether LLMs exhibit human-like response biases in survey questionnaires. Our comprehensive evaluation of nine models shows that popular open and commercial LLMs generally fail to reflect human-like behavior.
arXiv Detail & Related papers (2023-11-07T15:40:43Z)
MoCa: Measuring Human-Language Model Alignment on Causal and Moral Judgment Tasks [49.60689355674541]
A rich literature in cognitive science has studied people's causal and moral intuitions. This work has revealed a number of factors that systematically influence people's judgments. We test whether large language models (LLMs) make causal and moral judgments about text-based scenarios that align with human participants.
arXiv Detail & Related papers (2023-10-30T15:57:32Z)
Influence of External Information on Large Language Models Mirrors Social Cognitive Patterns [51.622612759892775]
Social cognitive theory explains how people learn and acquire knowledge through observing others. Recent years have witnessed the rapid development of large language models (LLMs) LLMs, as AI agents, can observe external information, which shapes their cognition and behaviors.
arXiv Detail & Related papers (2023-05-08T16:10:18Z)
Theory-Grounded Measurement of U.S. Social Stereotypes in English Language Models [12.475204687181067]
We adapt the Agency-Belief-Communion stereotype model as a framework for the systematic study and discovery of stereotypic-trait associations in language models (LMs) We introduce the sensitivity test (SeT) for measuring stereotypical associations from language models. We collect group-trait judgments from U.S.-based subjects to compare with English LM stereotypes.
arXiv Detail & Related papers (2022-06-23T13:22:24Z)

This list is automatically generated from the titles and abstracts of the papers in this site.