Related papers: Assessing Social Alignment: Do Personality-Prompted Large Language Models Behave Like Humans?

Assessing Social Alignment: Do Personality-Prompted Large Language Models Behave Like Humans?

URL: http://arxiv.org/abs/2412.16772v3
Date: Mon, 04 Aug 2025 01:03:19 GMT
Title: Assessing Social Alignment: Do Personality-Prompted Large Language Models Behave Like Humans?
Authors: Ivan Zakazov, Mikolaj Boronski, Lorenzo Drudi, Robert West,
Abstract summary: State-of-the-art approaches exploit a vast variety of training data, and prompt the model to adopt a particular personality.<n>We use classic psychological experiments, the Milgram experiment and the Ultimatum Game, as social interaction testbeds.<n>Our experiments reveal failure modes of the prompt-based modulation of the models' behavior that are shared across all models tested and persist under prompt perturbations.
Score: 9.771036970279765
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: The ongoing revolution in language modeling has led to various novel applications, some of which rely on the emerging social abilities of large language models (LLMs). Already, many turn to the new cyber friends for advice during the pivotal moments of their lives and trust them with the deepest secrets, implying that accurate shaping of the LLM's personality is paramount. To this end, state-of-the-art approaches exploit a vast variety of training data, and prompt the model to adopt a particular personality. We ask (i) if personality-prompted models behave (i.e., make decisions when presented with a social situation) in line with the ascribed personality (ii) if their behavior can be finely controlled. We use classic psychological experiments, the Milgram experiment and the Ultimatum Game, as social interaction testbeds and apply personality prompting to open- and closed-source LLMs from 4 different vendors. Our experiments reveal failure modes of the prompt-based modulation of the models' behavior that are shared across all models tested and persist under prompt perturbations. These findings challenge the optimistic sentiment toward personality prompting generally held in the community.

Related papers

Surface Fairness, Deep Bias: A Comparative Study of Bias in Language Models [49.41113560646115]
We investigate various proxy measures of bias in large language models (LLMs)<n>We find that evaluating models with pre-prompted personae on a multi-subject benchmark (MMLU) leads to negligible and mostly random differences in scores.<n>With the recent trend for LLM assistant memory and personalization, these problems open up from a different angle.
arXiv Detail & Related papers (2025-06-12T08:47:40Z)
Personality Editing for Language Models through Relevant Knowledge Editing [9.55737598023151]
Large Language Models (LLMs) play a vital role in applications like conversational agents and content creation.<n>We introduce a novel method PALETTE that enhances personality control through knowledge editing.
arXiv Detail & Related papers (2025-02-17T13:28:14Z)
Humanity in AI: Detecting the Personality of Large Language Models [0.0]
Questionnaires are a common method for detecting the personality of Large Language Models (LLMs) We propose combining text mining with questionnaires method. We find that the personalities of LLMs are derived from their pre-trained data.
arXiv Detail & Related papers (2024-10-11T05:53:11Z)
Capturing Minds, Not Just Words: Enhancing Role-Playing Language Models with Personality-Indicative Data [58.92110996840019]
We propose to enhance role-playing language models (RPLMs) via personality-indicative data. Specifically, we leverage questions from psychological scales and distill advanced RPAs to generate dialogues that grasp the minds of characters. Experimental results validate that RPLMs trained with our dataset exhibit advanced role-playing capabilities for both general and personality-related evaluations.
arXiv Detail & Related papers (2024-06-27T06:24:00Z)
Do LLMs Have Distinct and Consistent Personality? TRAIT: Personality Testset designed for LLMs with Psychometrics [29.325576963215163]
Large Language Models (LLMs) have led to their adaptation in various domains as conversational agents. We introduce TRAIT, a new benchmark consisting of 8K multi-choice questions designed to assess the personality of LLMs. LLMs exhibit distinct and consistent personality, which is highly influenced by their training data.
arXiv Detail & Related papers (2024-06-20T19:50:56Z)
P-React: Synthesizing Topic-Adaptive Reactions of Personality Traits via Mixture of Specialized LoRA Experts [34.374681921626205]
We propose P-React, a mixture of experts (MoE)-based personalized large language models.<n> Particularly, we integrate a Personality Loss (PSL) to better capture individual trait expressions.<n>To facilitate research in this field, we curate OCEAN-Chat, a high-quality, human-verified dataset.
arXiv Detail & Related papers (2024-06-18T12:25:13Z)
LLMvsSmall Model? Large Language Model Based Text Augmentation Enhanced Personality Detection Model [58.887561071010985]
Personality detection aims to detect one's personality traits underlying in social media posts. Most existing methods learn post features directly by fine-tuning the pre-trained language models. We propose a large language model (LLM) based text augmentation enhanced personality detection model.
arXiv Detail & Related papers (2024-03-12T12:10:18Z)
Identifying Multiple Personalities in Large Language Models with External Evaluation [6.657168333238573]
Large Language Models (LLMs) are integrated with human daily applications rapidly. Many recent studies quantify LLMs' personalities using self-assessment tests that are created for humans. Yet many critiques question the applicability and reliability of these self-assessment tests when applied to LLMs.
arXiv Detail & Related papers (2024-02-22T18:57:20Z)
ControlLM: Crafting Diverse Personalities for Language Models [32.411304295746746]
We introduce ControlLM, which leverages differential activation patterns, derived from contrasting behavioral prompts in the model's latent space, to influence the model's personality traits at inference. First, we demonstrate ControlLM's capacity to elicit diverse persona behaviors without any training, while precision control allows personality traits to closely match average human values. We showcase improved reasoning and question answering through selective amplification of beneficial attributes like conscientiousness and friendliness.
arXiv Detail & Related papers (2024-02-15T17:58:29Z)
Eliciting Personality Traits in Large Language Models [0.0]
Large Language Models (LLMs) are increasingly being utilized by both candidates and employers in the recruitment context. This study seeks to obtain a better understanding of such models by examining their output variations based on different input prompts.
arXiv Detail & Related papers (2024-02-13T10:09:00Z)
Illuminating the Black Box: A Psychometric Investigation into the Multifaceted Nature of Large Language Models [3.692410936160711]
This study explores the idea of AI Personality or AInality suggesting that Large Language Models (LLMs) exhibit patterns similar to human personalities. Using projective tests, we uncover hidden aspects of LLM personalities that are not easily accessible through direct questioning. Our machine learning analysis revealed that LLMs exhibit distinct AInality traits and manifest diverse personality types, demonstrating dynamic shifts in response to external instructions.
arXiv Detail & Related papers (2023-12-21T04:57:21Z)
Editing Personality for Large Language Models [73.59001811199823]
This paper introduces an innovative task focused on editing the personality traits of Large Language Models (LLMs) We construct PersonalityEdit, a new benchmark dataset to address this task.
arXiv Detail & Related papers (2023-10-03T16:02:36Z)
Who is ChatGPT? Benchmarking LLMs' Psychological Portrayal Using PsychoBench [83.41621219298489]
We propose a framework, PsychoBench, for evaluating diverse psychological aspects of Large Language Models (LLMs) PsychoBench classifies these scales into four distinct categories: personality traits, interpersonal relationships, motivational tests, and emotional abilities. We employ a jailbreak approach to bypass the safety alignment protocols and test the intrinsic natures of LLMs.
arXiv Detail & Related papers (2023-10-02T17:46:09Z)
Do LLMs Possess a Personality? Making the MBTI Test an Amazing Evaluation for Large Language Models [2.918940961856197]
We aim to investigate the feasibility of using the Myers-Briggs Type Indicator (MBTI), a widespread human personality assessment tool, as an evaluation metric for large language models (LLMs) Specifically, experiments will be conducted to explore: 1) the personality types of different LLMs, 2) the possibility of changing the personality types by prompt engineering, and 3) How does the training dataset affect the model's personality.
arXiv Detail & Related papers (2023-07-30T09:34:35Z)
The Neuro-Symbolic Inverse Planning Engine (NIPE): Modeling Probabilistic Social Inferences from Linguistic Inputs [50.32802502923367]
We study the process of language driving and influencing social reasoning in a probabilistic goal inference domain. We propose a neuro-symbolic model that carries out goal inference from linguistic inputs of agent scenarios. Our model closely matches human response patterns and better predicts human judgements than using an LLM alone.
arXiv Detail & Related papers (2023-06-25T19:38:01Z)
Revisiting the Reliability of Psychological Scales on Large Language Models [62.57981196992073]
This study aims to determine the reliability of applying personality assessments to Large Language Models. Analysis of 2,500 settings per model, including GPT-3.5, GPT-4, Gemini-Pro, and LLaMA-3.1, reveals that various LLMs show consistency in responses to the Big Five Inventory.
arXiv Detail & Related papers (2023-05-31T15:03:28Z)
Evaluating and Inducing Personality in Pre-trained Language Models [78.19379997967191]
We draw inspiration from psychometric studies by leveraging human personality theory as a tool for studying machine behaviors. To answer these questions, we introduce the Machine Personality Inventory (MPI) tool for studying machine behaviors. MPI follows standardized personality tests, built upon the Big Five Personality Factors (Big Five) theory and personality assessment inventories. We devise a Personality Prompting (P2) method to induce LLMs with specific personalities in a controllable way.
arXiv Detail & Related papers (2022-05-20T07:32:57Z)

This list is automatically generated from the titles and abstracts of the papers in this site.