Related papers: The Personality Trap: How LLMs Embed Bias When Generating Human-Like Personas

The Personality Trap: How LLMs Embed Bias When Generating Human-Like Personas

URL: http://arxiv.org/abs/2602.03334v1
Date: Tue, 03 Feb 2026 10:00:18 GMT
Title: The Personality Trap: How LLMs Embed Bias When Generating Human-Like Personas
Authors: Jacopo Amidei, Gregorio Ferreira, Mario Muñoz Serrano, Rubén Nieto, Andreas Kaltenbrunner,
Abstract summary: We first assess the representativeness and potential biases in the sociodemographic attributes of the generated personas.<n>All models exhibit pronounced WEIRD (western, educated, industrialized, rich and democratic) biases, favoring young, educated, white, heterosexual, Western individuals with centrist or progressive political views and secular or Christian beliefs.
Score: 1.2641141743223376
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: This paper examines biases in large language models (LLMs) when generating synthetic populations from responses to personality questionnaires. Using five LLMs, we first assess the representativeness and potential biases in the sociodemographic attributes of the generated personas, as well as their alignment with the intended personality traits. While LLMs successfully reproduce known correlations between personality and sociodemographic variables, all models exhibit pronounced WEIRD (western, educated, industrialized, rich and democratic) biases, favoring young, educated, white, heterosexual, Western individuals with centrist or progressive political views and secular or Christian beliefs. In a second analysis, we manipulate input traits to maximize Neuroticism and Psychoticism scores. Notably, when Psychoticism is maximized, several models produce an overrepresentation of non-binary and LGBTQ+ identities, raising concerns about stereotyping and the potential pathologization of marginalized groups. Our findings highlight both the potential and the risks of using LLMs to generate psychologically grounded synthetic populations.

Related papers

Interpretable Debiasing of Vision-Language Models for Social Fairness [55.85977929985967]
We introduce an interpretable, model-agnostic bias mitigation framework, DeBiasLens, that localizes social attribute neurons in Vision-Language models.<n>We train SAEs on facial image or caption datasets without corresponding social attribute labels to uncover neurons highly responsive to specific demographics.<n>Our research lays the groundwork for future auditing tools, prioritizing social fairness in emerging real-world AI systems.
arXiv Detail & Related papers (2026-02-27T13:37:11Z)
Social Simulations with Large Language Model Risk Utopian Illusion [61.358959720048354]
We introduce a systematic framework for analyzing large language models' behavior in social simulation.<n>Our approach simulates multi-agent interactions through chatroom-style conversations and analyzes them across five linguistic dimensions.<n>Our findings reveal that LLMs do not faithfully reproduce genuine human behavior but instead reflect overly idealized versions of it.
arXiv Detail & Related papers (2025-10-24T06:08:41Z)
Who are you, ChatGPT? Personality and Demographic Style in LLM-Generated Content [5.515596385935823]
Generative large language models (LLMs) have become central to everyday life, producing human-like text across diverse domains.<n>A growing body of research investigates whether these models also exhibit personality- and demographic-like characteristics in their language.<n>We introduce a novel, data-driven methodology for assessing LLM personality without relying on self-report questionnaires.<n>Applying instead automatic personality and gender classifiers to model replies on open-ended questions collected from Reddit.
arXiv Detail & Related papers (2025-10-13T14:06:17Z)
Are LLMs Empathetic to All? Investigating the Influence of Multi-Demographic Personas on a Model's Empathy [1.6489674562395387]
We investigate how Large Language Models' cognitive and affective empathy vary across user personas defined by intersecting demographic attributes.<n>Our study introduces a novel intersectional analysis spanning 315 unique personas, constructed from combinations of age, culture, and gender.<n>We show that they broadly reflect real-world empathetic trends, with notable misalignments for certain groups, such as those from Confucian culture.
arXiv Detail & Related papers (2025-10-11T20:04:57Z)
Population-Aligned Persona Generation for LLM-based Social Simulation [58.84363795421489]
We propose a systematic framework for synthesizing high-quality, population-aligned persona sets for social simulation.<n>Our approach begins by leveraging large language models to generate narrative personas from long-term social media data.<n>To address the needs of specific simulation contexts, we introduce a task-specific module that adapts the globally aligned persona set to targeted subpopulations.
arXiv Detail & Related papers (2025-09-12T10:43:47Z)
Unmasking Implicit Bias: Evaluating Persona-Prompted LLM Responses in Power-Disparate Social Scenarios [4.626073646852022]
We introduce a novel framework using cosine distance to measure semantic shifts in responses.<n>We assess how demographic prompts affect response quality across power-disparate social scenarios.<n>Our findings suggest a "default persona" bias toward middle-aged, able-bodied, native-born, Caucasian, atheistic males with centrist views.
arXiv Detail & Related papers (2025-03-03T13:44:03Z)
Large Language Models Reflect the Ideology of their Creators [71.65505524599888]
Large language models (LLMs) are trained on vast amounts of data to generate natural language.<n>This paper shows that the ideological stance of an LLM appears to reflect the worldview of its creators.
arXiv Detail & Related papers (2024-10-24T04:02:30Z)
Evaluating Large Language Models with Psychometrics [59.821829073478376]
This paper offers a comprehensive benchmark for quantifying psychological constructs of Large Language Models (LLMs)<n>Our work identifies five key psychological constructs -- personality, values, emotional intelligence, theory of mind, and self-efficacy -- assessed through a suite of 13 datasets.<n>We uncover significant discrepancies between LLMs' self-reported traits and their response patterns in real-world scenarios, revealing complexities in their behaviors.
arXiv Detail & Related papers (2024-06-25T16:09:08Z)
Bias Runs Deep: Implicit Reasoning Biases in Persona-Assigned LLMs [67.51906565969227]
We study the unintended side-effects of persona assignment on the ability of LLMs to perform basic reasoning tasks. Our study covers 24 reasoning datasets, 4 LLMs, and 19 diverse personas (e.g. an Asian person) spanning 5 socio-demographic groups.
arXiv Detail & Related papers (2023-11-08T18:52:17Z)
Do LLMs exhibit human-like response biases? A case study in survey design [66.1850490474361]
We investigate the extent to which large language models (LLMs) reflect human response biases, if at all. We design a dataset and framework to evaluate whether LLMs exhibit human-like response biases in survey questionnaires. Our comprehensive evaluation of nine models shows that popular open and commercial LLMs generally fail to reflect human-like behavior.
arXiv Detail & Related papers (2023-11-07T15:40:43Z)
Investigating Subtler Biases in LLMs: Ageism, Beauty, Institutional, and Nationality Bias in Generative Models [0.0]
This paper investigates bias along less-studied but still consequential, dimensions, such as age and beauty. We ask whether LLMs hold wide-reaching biases of positive or negative sentiment for specific social groups similar to the "what is beautiful is good" bias found in people in experimental psychology.
arXiv Detail & Related papers (2023-09-16T07:07:04Z)
Large Language Models Can Infer Psychological Dispositions of Social Media Users [1.0923877073891446]
We test whether GPT-3.5 and GPT-4 can derive the Big Five personality traits from users' Facebook status updates in a zero-shot learning scenario. Our results show an average correlation of r =.29 (range = [.22,.33]) between LLM-inferred and self-reported trait scores. predictions were found to be more accurate for women and younger individuals on several traits, suggesting a potential bias stemming from the underlying training data or differences in online self-expression.
arXiv Detail & Related papers (2023-09-13T01:27:48Z)

This list is automatically generated from the titles and abstracts of the papers in this site.