Personality Editing for Language Models through Adjusting Self-Referential Queries
- URL: http://arxiv.org/abs/2502.11789v3
- Date: Mon, 13 Oct 2025 06:53:27 GMT
- Title: Personality Editing for Language Models through Adjusting Self-Referential Queries
- Authors: Seojin Hwang, Yumin Kim, Byeongjeong Kim, Donghoon Shin, Hwanhee Lee
- Abstract summary: We present PALETTE (Personality Adjustment by LLM SElf-TargeTed quEries), a novel method for personality editing in Large Language Models (LLMs). Our approach introduces adjustment queries, where self-referential statements grounded in psychological constructs are treated analogously to factual knowledge, enabling direct editing of personality-related responses. Unlike fine-tuning, PALETTE requires only 12 editing samples to achieve substantial improvements in personality alignment across personality dimensions.
- Score: 17.051166122108857
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Large Language Models (LLMs) are integral to applications such as conversational agents and content creation, where precise control over a model's personality is essential for maintaining tone, consistency, and user engagement. However, prevailing prompt-based or fine-tuning approaches either lack robustness or demand large-scale training data, making them costly and impractical. In this paper, we present PALETTE (Personality Adjustment by LLM SElf-TargeTed quEries), a novel method for personality editing in LLMs. Our approach introduces adjustment queries, where self-referential statements grounded in psychological constructs are treated analogously to factual knowledge, enabling direct editing of personality-related responses. Unlike fine-tuning, PALETTE requires only 12 editing samples to achieve substantial improvements in personality alignment across personality dimensions. Experimental results from both automatic and human evaluations demonstrate that our method enables more stable and well-balanced personality control in LLMs.
Related papers
- PTCBENCH: Benchmarking Contextual Stability of Personality Traits in LLM Systems [30.449659477704543]
We introduce PTCBENCH, a benchmark designed to quantify the consistency of large language model (LLM) personalities under controlled situational contexts. PTCBENCH subjects models to 12 distinct external conditions spanning diverse location contexts and life events, and rigorously assesses personality using the NEO Five-Factor Inventory. Our study on 39,240 personality trait records reveals that certain external scenarios can trigger significant personality changes in LLMs, and even alter their reasoning capabilities.
arXiv Detail & Related papers (2026-01-12T18:15:50Z)
- Profile-LLM: Dynamic Profile Optimization for Realistic Personality Expression in LLMs [11.672385046863655]
PersonaPulse is a framework that iteratively enhances role-play prompts while integrating a situational response benchmark as a scoring tool. Quantitative evaluations demonstrate that the prompts generated by PersonaPulse outperform those of prior work. For certain personality traits, the extent of personality evocation can be partially controlled by pausing the optimization process.
arXiv Detail & Related papers (2025-11-25T02:31:40Z)
- A Comparative Study of Large Language Models and Human Personality Traits [6.354326674890978]
Large Language Models (LLMs) have demonstrated human-like capabilities in language comprehension and generation. This study investigates whether LLMs exhibit personality-like traits and how these traits compare with human personality.
arXiv Detail & Related papers (2025-05-01T15:10:15Z)
- Probing then Editing Response Personality of Large Language Models [40.99117085818623]
Large Language Models (LLMs) have demonstrated promising capabilities to generate responses that exhibit consistent personality traits.
We introduce a layer-wise probing framework to investigate the layer-wise capability of LLMs in encoding personality for responding.
We propose a layer-wise editing method to edit the personality expressed by LLMs during inference.
arXiv Detail & Related papers (2025-04-14T13:46:35Z)
- Evaluating Personality Traits in Large Language Models: Insights from Psychological Questionnaires [3.6001840369062386]
This work applies psychological tools to Large Language Models in diverse scenarios to generate personality profiles. Our findings reveal that LLMs exhibit unique traits, varying characteristics, and distinct personality profiles even within the same family of models.
arXiv Detail & Related papers (2025-02-07T16:12:52Z)
- Assessing Social Alignment: Do Personality-Prompted Large Language Models Behave Like Humans? [9.771036970279765]
State-of-the-art approaches exploit a vast variety of training data, and prompt the model to adopt a particular personality. We use classic psychological experiments, the Milgram experiment and the Ultimatum Game, as social interaction testbeds. Our experiments reveal failure modes of the prompt-based modulation of the models' behavior that are shared across all models tested and persist under prompt perturbations.
arXiv Detail & Related papers (2024-12-21T20:58:19Z)
- Neuron-based Personality Trait Induction in Large Language Models [115.08894603023712]
Large language models (LLMs) have become increasingly proficient at simulating various personality traits.
We present a neuron-based approach for personality trait induction in LLMs.
arXiv Detail & Related papers (2024-10-16T07:47:45Z)
- Revealing Personality Traits: A New Benchmark Dataset for Explainable Personality Recognition on Dialogues [63.936654900356004]
Personality recognition aims to identify the personality traits implied in user data such as dialogues and social media posts.
We propose a novel task named Explainable Personality Recognition, aiming to reveal the reasoning process as supporting evidence of the personality trait.
arXiv Detail & Related papers (2024-09-29T14:41:43Z)
- PersLLM: A Personified Training Approach for Large Language Models [66.16513246245401]
We propose PersLLM, integrating psychology-grounded principles of personality: social practice, consistency, and dynamic development.
We incorporate personality traits directly into the model parameters, enhancing the model's resistance to induction, promoting consistency, and supporting the dynamic evolution of personality.
arXiv Detail & Related papers (2024-07-17T08:13:22Z)
- LLM vs Small Model? Large Language Model Based Text Augmentation Enhanced Personality Detection Model [58.887561071010985]
Personality detection aims to detect one's personality traits underlying in social media posts.
Most existing methods learn post features directly by fine-tuning the pre-trained language models.
We propose a large language model (LLM) based text augmentation enhanced personality detection model.
arXiv Detail & Related papers (2024-03-12T12:10:18Z)
- ControlLM: Crafting Diverse Personalities for Language Models [32.411304295746746]
We introduce ControlLM, which leverages differential activation patterns, derived from contrasting behavioral prompts in the model's latent space, to influence the model's personality traits at inference.
First, we demonstrate ControlLM's capacity to elicit diverse persona behaviors without any training, while precision control allows personality traits to closely match average human values.
We showcase improved reasoning and question answering through selective amplification of beneficial attributes like conscientiousness and friendliness.
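The contrastive activation-steering idea described in the ControlLM abstract can be sketched in a few lines: compute the mean difference between hidden activations elicited by opposing behavioral prompts, then add a scaled copy of that vector to a hidden state at inference. This is a minimal illustrative sketch only; the toy activations, the choice of layer, and the scaling factor `alpha` are assumptions for demonstration, not details taken from the paper.

```python
import numpy as np

def steering_vector(acts_pos, acts_neg):
    """Mean activation difference between contrasting behavioral prompts
    (e.g. prompts phrased as highly extraverted vs. highly reserved).
    Each input is an array of shape (num_prompts, hidden_dim)."""
    return np.mean(acts_pos, axis=0) - np.mean(acts_neg, axis=0)

def steer(hidden, vec, alpha=1.0):
    """Shift a hidden state along the trait direction at inference time."""
    return hidden + alpha * vec

# Toy stand-ins for layer activations: 4 prompts x 8 hidden dims per side.
rng = np.random.default_rng(0)
acts_pos = rng.normal(1.0, 0.1, size=(4, 8))   # "trait present" prompts
acts_neg = rng.normal(-1.0, 0.1, size=(4, 8))  # "trait absent" prompts

vec = steering_vector(acts_pos, acts_neg)
h = steer(np.zeros(8), vec, alpha=0.5)  # hidden state nudged toward the trait
```

In practice the activations would come from a chosen transformer layer during forward passes over the two prompt sets, and `alpha` would be tuned per trait, which is how inference-time control avoids any training.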
arXiv Detail & Related papers (2024-02-15T17:58:29Z)
- Self-Alignment for Factuality: Mitigating Hallucinations in LLMs via Self-Evaluation [71.91287418249688]
Large language models (LLMs) often struggle with factual inaccuracies, even when they hold relevant knowledge.
We leverage the self-evaluation capability of an LLM to provide training signals that steer the model towards factuality.
We show that the proposed self-alignment approach substantially enhances factual accuracy over Llama family models across three key knowledge-intensive tasks.
arXiv Detail & Related papers (2024-02-14T15:52:42Z)
- PsyCoT: Psychological Questionnaire as Powerful Chain-of-Thought for Personality Detection [50.66968526809069]
We propose a novel personality detection method, called PsyCoT, which mimics the way individuals complete psychological questionnaires in a multi-turn dialogue manner.
Our experiments demonstrate that PsyCoT significantly improves the performance and robustness of GPT-3.5 in personality detection.
arXiv Detail & Related papers (2023-10-31T08:23:33Z)
- UPLex: Fine-Grained Personality Control in Large Language Models via Unsupervised Lexical Modulation [52.043831554626685]
Personality is a crucial factor shaping human communication patterns, which makes regulating the personalities of large language models (LLMs) important. We propose UPLex, a method that uses an Unsupervisedly-Built Personalized Lexicon (UPL) during the decoding phase to manipulate an LLM's personality traits. UPLex can be constructed from a newly built situational judgment test dataset in an unsupervised fashion, and used to modulate the personality expression of LLMs.
arXiv Detail & Related papers (2023-10-25T12:16:33Z)
- Editing Personality for Large Language Models [73.59001811199823]
This paper introduces an innovative task focused on editing the personality traits of Large Language Models (LLMs).
We construct PersonalityEdit, a new benchmark dataset to address this task.
arXiv Detail & Related papers (2023-10-03T16:02:36Z)
- Personality Traits in Large Language Models [42.31355340867784]
Personality is a key factor determining the effectiveness of communication. We present a novel and comprehensive psychometrically valid and reliable methodology for administering and validating personality tests on widely-used large language models. We discuss the application and ethical implications of the measurement and shaping method, in particular regarding responsible AI.
arXiv Detail & Related papers (2023-07-01T00:58:51Z)
- Evaluating and Inducing Personality in Pre-trained Language Models [78.19379997967191]
We draw inspiration from psychometric studies by leveraging human personality theory as a tool for studying machine behaviors.
To answer these questions, we introduce the Machine Personality Inventory (MPI) tool for studying machine behaviors.
MPI follows standardized personality tests, built upon the Big Five Personality Factors (Big Five) theory and personality assessment inventories.
We devise a Personality Prompting (P2) method to induce LLMs with specific personalities in a controllable way.
arXiv Detail & Related papers (2022-05-20T07:32:57Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences of its use.