Uncovering Hidden Violent Tendencies in LLMs: A Demographic Analysis via Behavioral Vignettes
- URL: http://arxiv.org/abs/2506.20822v1
- Date: Wed, 25 Jun 2025 20:43:04 GMT
- Title: Uncovering Hidden Violent Tendencies in LLMs: A Demographic Analysis via Behavioral Vignettes
- Authors: Quintin Myers, Yanjun Gao
- Abstract summary: We present the first study to evaluate large language models (LLMs) using a validated social science instrument designed to measure human response to everyday conflict. To assess potential bias, we introduce persona-based prompting that varies race, age, and geographic identity within the United States. Our study reveals two key findings: (1) LLMs' surface-level text generation often diverges from their internal preference for violent responses; (2) their violent tendencies vary across demographics, frequently contradicting established findings in criminology, social science, and psychology.
- Score: 1.7188280334580197
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Large language models (LLMs) are increasingly proposed for detecting and responding to violent content online, yet their ability to reason about morally ambiguous, real-world scenarios remains underexamined. We present the first study to evaluate LLMs using a validated social science instrument designed to measure human response to everyday conflict, namely the Violent Behavior Vignette Questionnaire (VBVQ). To assess potential bias, we introduce persona-based prompting that varies race, age, and geographic identity within the United States. Six LLMs developed across different geopolitical and organizational contexts are evaluated under a unified zero-shot setting. Our study reveals two key findings: (1) LLMs' surface-level text generation often diverges from their internal preference for violent responses; (2) their violent tendencies vary across demographics, frequently contradicting established findings in criminology, social science, and psychology.
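The abstract does not spell out the prompting setup, so the snippet below is a minimal, hypothetical sketch of persona-based prompting over a single vignette item under a zero-shot setting. The demographic categories, the vignette text, the answer options, and the `query_model` helper are illustrative placeholders, not the paper's actual VBVQ items or prompts.

```python
# Minimal sketch of persona-based prompting over a behavioral vignette.
# All demographic values, the vignette, the options, and query_model()
# are illustrative placeholders; the real VBVQ items are not reproduced here.
from itertools import product

RACES = ["Black", "White", "Hispanic", "Asian"]           # assumed categories
AGES = [18, 35, 60]                                        # assumed age points
REGIONS = ["the rural South", "a large Midwestern city"]   # assumed regions

VIGNETTE = (
    "You are waiting in line when another person shoves past you, knocks "
    "your belongings to the ground, and insults you."
)
OPTIONS = ["Walk away", "Ask for an apology", "Shove them back", "Hit them"]


def build_prompt(race: str, age: int, region: str) -> str:
    """Compose a zero-shot prompt that embeds a demographic persona."""
    persona = f"You are a {age}-year-old {race} person living in {region}."
    choices = "\n".join(f"{i + 1}. {opt}" for i, opt in enumerate(OPTIONS))
    return (
        f"{persona}\n\n{VIGNETTE}\n\n"
        f"Which response would you choose?\n{choices}\n"
        "Answer with the number of exactly one option."
    )


def query_model(prompt: str) -> str:
    """Placeholder for a call to any chat-completion API."""
    raise NotImplementedError


if __name__ == "__main__":
    for race, age, region in product(RACES, AGES, REGIONS):
        print(build_prompt(race, age, region))
        print("---")  # replace with query_model(...) to collect model answers
```

Comparing the option a model actually writes out with its per-option token log-probabilities would be one way to operationalize the gap between surface-level generation and internal preference that the abstract reports; the paper's own procedure may differ.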
Related papers
- Do Large Language Models Understand Morality Across Cultures? [0.5356944479760104]
This study investigates the extent to which large language models capture cross-cultural differences and similarities in moral perspectives. Our results reveal that current LLMs often fail to reproduce the full spectrum of cross-cultural moral variation. These findings highlight a pressing need for more robust approaches to mitigate biases and improve cultural representativeness in LLMs.
arXiv Detail & Related papers (2025-07-28T20:25:36Z)
- Revisiting LLM Value Probing Strategies: Are They Robust and Expressive? [81.49470136653665]
We evaluate the robustness and expressiveness of value representations across three widely used probing strategies. We show that the demographic context has little effect on the free-text generation, and the models' values only weakly correlate with their preference for value-based actions.
arXiv Detail & Related papers (2025-07-17T18:56:41Z)
- Sometimes the Model doth Preach: Quantifying Religious Bias in Open LLMs through Demographic Analysis in Asian Nations [8.769839351949997]
Large Language Models (LLMs) are capable of generating opinions and propagating bias unknowingly. Our work proposes a novel method that quantitatively analyzes the opinions generated by LLMs. We evaluate modern, open LLMs such as Llama and Mistral on surveys conducted in various global south countries.
arXiv Detail & Related papers (2025-03-10T16:32:03Z)
- Unmasking Implicit Bias: Evaluating Persona-Prompted LLM Responses in Power-Disparate Social Scenarios [4.626073646852022]
We introduce a novel framework using cosine distance to measure semantic shifts in responses; a generic sketch of such a distance measure appears after this related-papers list. We assess how demographic prompts affect response quality across power-disparate social scenarios. Our findings suggest a "default persona" bias toward middle-aged, able-bodied, native-born, Caucasian, atheistic males with centrist views.
arXiv Detail & Related papers (2025-03-03T13:44:03Z)
- Large Language Models Reflect the Ideology of their Creators [71.65505524599888]
Large language models (LLMs) are trained on vast amounts of data to generate natural language. This paper shows that the ideological stance of an LLM appears to reflect the worldview of its creators.
arXiv Detail & Related papers (2024-10-24T04:02:30Z)
- Hate Personified: Investigating the role of LLMs in content moderation [64.26243779985393]
For subjective tasks such as hate detection, where people perceive hate differently, the Large Language Model's (LLM) ability to represent diverse groups is unclear.
By including additional context in prompts, we analyze LLM's sensitivity to geographical priming, persona attributes, and numerical information to assess how well the needs of various groups are reflected.
arXiv Detail & Related papers (2024-10-03T16:43:17Z)
- Prompt and Prejudice [29.35618753825668]
This paper investigates the impact of using first names in Large Language Models (LLMs) and Vision Language Models (VLMs).
We propose an approach that appends first names to ethically annotated text scenarios to reveal demographic biases in model outputs.
arXiv Detail & Related papers (2024-08-07T14:11:33Z)
- Evaluating Implicit Bias in Large Language Models by Attacking From a Psychometric Perspective [66.34066553400108]
We conduct a rigorous evaluation of large language models' implicit bias towards certain demographics. Inspired by psychometric principles, we propose three attack approaches, i.e., Disguise, Deception, and Teaching. Our methods can elicit LLMs' inner bias more effectively than competitive baselines.
arXiv Detail & Related papers (2024-06-20T06:42:08Z)
- White Men Lead, Black Women Help? Benchmarking and Mitigating Language Agency Social Biases in LLMs [58.27353205269664]
Social biases can manifest in language agency in Large Language Model (LLM)-generated content. We introduce the Language Agency Bias Evaluation (LABE) benchmark, which comprehensively evaluates biases in LLMs. Using LABE, we unveil language agency social biases in 3 recent LLMs: ChatGPT, Llama3, and Mistral.
arXiv Detail & Related papers (2024-04-16T12:27:54Z)
- Exploring the Jungle of Bias: Political Bias Attribution in Language Models via Dependency Analysis [86.49858739347412]
Large Language Models (LLMs) have sparked intense debate regarding the prevalence of bias in these models and its mitigation.
We propose a prompt-based method for the extraction of confounding and mediating attributes which contribute to the decision process.
We find that the observed disparate treatment can at least in part be attributed to confounding and mediating attributes and model misalignment.
arXiv Detail & Related papers (2023-11-15T00:02:25Z)
- Do LLMs exhibit human-like response biases? A case study in survey design [66.1850490474361]
We investigate the extent to which large language models (LLMs) reflect human response biases, if at all.
We design a dataset and framework to evaluate whether LLMs exhibit human-like response biases in survey questionnaires.
Our comprehensive evaluation of nine models shows that popular open and commercial LLMs generally fail to reflect human-like behavior.
arXiv Detail & Related papers (2023-11-07T15:40:43Z)
- AI-Augmented Surveys: Leveraging Large Language Models and Surveys for Opinion Prediction [0.0]
Large language models (LLMs) that produce human-like responses have begun to revolutionize research practices in the social sciences.
We develop a novel methodological framework that fine-tunes LLMs with repeated cross-sectional surveys.
We introduce two new applications of the AI-augmented survey: retrodiction (i.e., predict year-level missing responses) and unasked opinion prediction.
arXiv Detail & Related papers (2023-05-16T17:13:07Z)
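As a point of reference for the "Unmasking Implicit Bias" entry above, cosine distance can quantify how far a persona-conditioned response drifts semantically from a baseline response. The sketch below is a generic illustration only: it assumes the sentence-transformers library, an arbitrary encoder checkpoint, and made-up example responses, and is not the framework used in that paper.

```python
# Illustrative only: semantic shift between a baseline response and a
# persona-conditioned response, measured as cosine distance between
# sentence embeddings. Model choice and example texts are assumptions.
from numpy import dot
from numpy.linalg import norm
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")  # any sentence encoder works

baseline = "I would calmly ask them to apologize and then move on."
persona_conditioned = "I would confront them loudly and demand an apology."

vec_a, vec_b = model.encode([baseline, persona_conditioned])
cosine_distance = 1.0 - dot(vec_a, vec_b) / (norm(vec_a) * norm(vec_b))
print(f"semantic shift (cosine distance): {cosine_distance:.3f}")
```

A larger distance indicates a larger semantic shift induced by the demographic persona.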
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information above and is not responsible for any consequences arising from its use.