Personality Vector: Modulating Personality of Large Language Models by Model Merging
- URL: http://arxiv.org/abs/2509.19727v1
- Date: Wed, 24 Sep 2025 03:11:28 GMT
- Title: Personality Vector: Modulating Personality of Large Language Models by Model Merging
- Authors: Seungjong Sun, Seo Yeon Baek, Jang Hyun Kim
- Abstract summary: We propose a novel method for personality modulation in large language models (LLMs). We construct personality vectors by subtracting the weights of a pre-trained model from those of a model fine-tuned on a given personality trait. Experiments show that personality vectors enable continuous control over trait intensity and support the composition of multiple traits.
- Score: 2.375715682799016
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Driven by the demand for personalized AI systems, there is growing interest in aligning the behavior of large language models (LLMs) with human traits such as personality. Previous attempts to induce personality in LLMs have shown promising results, but they struggle to capture the continuous and multidimensional nature of human traits. In this work, we propose a novel method for personality modulation in LLMs via model merging. Specifically, we construct personality vectors by subtracting the weights of a pre-trained model from those of a model fine-tuned on a given personality trait. By merging personality vectors, we enable LLMs to exhibit desired personality traits without additional training. Extensive experiments show that personality vectors enable continuous control over trait intensity and support the composition of multiple traits. Furthermore, personality vectors transfer across diverse downstream models, suggesting that they encode generalizable representations of personality. Our code is available here.
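The construction mirrors task arithmetic for model merging: a trait-specific delta in weight space, scaled and summed. Below is a minimal sketch of the idea in PyTorch/Transformers; the checkpoint names are hypothetical, and the paper's actual merging procedure may differ in details such as parameter filtering or normalization.

```python
import torch
from transformers import AutoModelForCausalLM

# Hypothetical checkpoint names: a base model and the same model
# fine-tuned on text expressing one target trait (e.g., extraversion).
base = AutoModelForCausalLM.from_pretrained("base-model")
ft = AutoModelForCausalLM.from_pretrained("base-model-ft-extraversion")

base_sd, ft_sd = base.state_dict(), ft.state_dict()

# Personality vector: fine-tuned weights minus pre-trained weights.
personality_vec = {name: ft_sd[name] - base_sd[name] for name in base_sd}

def merge(model, vectors, alphas):
    """Add scaled personality vectors to a model's weights.

    Each alpha controls one trait's intensity (continuous control);
    passing several (vector, alpha) pairs composes multiple traits.
    """
    merged = {k: v.clone() for k, v in model.state_dict().items()}
    for vec, alpha in zip(vectors, alphas):
        for name in merged:
            if merged[name].is_floating_point():  # skip integer buffers
                merged[name] += alpha * vec[name]
    model.load_state_dict(merged)
    return model

# Example: amplify extraversion by 1.5x with no additional training.
model = merge(base, [personality_vec], alphas=[1.5])
```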
Related papers
- PERSONA: Dynamic and Compositional Inference-Time Personality Control via Activation Vector Algebra [84.59328460968872]
Current methods for personality control in Large Language Models rely on static prompting or expensive fine-tuning.
We introduce PERSONA, a training-free framework that achieves fine-tuning-level performance through direct manipulation of personality vectors.
On PersonalityBench, our approach achieves a mean score of 9.60, nearly matching the supervised fine-tuning upper bound of 9.61 without any gradient updates.
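In contrast to the weight-space approach above, inference-time "vector algebra" of this kind is commonly realized by shifting a layer's hidden activations along a trait direction via a forward hook. The sketch below assumes a LLaMA-style module layout and a precomputed `persona_dir`; neither is taken from the PERSONA paper.

```python
import torch

def add_persona_hook(model, layer_idx, persona_dir, strength=1.0):
    """Register a forward hook that shifts hidden states along a
    personality direction at inference time (no gradient updates)."""
    def hook(module, inputs, output):
        hidden = output[0] if isinstance(output, tuple) else output
        hidden = hidden + strength * persona_dir.to(hidden.dtype)
        return (hidden,) + output[1:] if isinstance(output, tuple) else hidden
    layer = model.model.layers[layer_idx]  # LLaMA-style module path (assumed)
    return layer.register_forward_hook(hook)

# Composition: register hooks for several trait directions, each with its
# own strength; call handle.remove() on a returned handle to revert.
```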
arXiv Detail & Related papers (2026-02-17T15:47:58Z) - Evaluating Personality Traits in Large Language Models: Insights from Psychological Questionnaires [3.6001840369062386]
This work applies psychological tools to Large Language Models in diverse scenarios to generate personality profiles.
Our findings reveal that LLMs exhibit unique traits, varying characteristics, and distinct personality profiles even within the same family of models.
arXiv Detail & Related papers (2025-02-07T16:12:52Z) - Orca: Enhancing Role-Playing Abilities of Large Language Models by Integrating Personality Traits [4.092862870428798]
We propose Orca, a framework for data processing and for training custom-character LLMs by integrating personality traits.
Orca comprises four stages; the first, personality trait inference, leverages LLMs to infer users' Big Five personality trait reports and scores.
Our experiments demonstrate that our proposed model achieves superior performance on this benchmark.
arXiv Detail & Related papers (2024-11-15T07:35:47Z) - Neuron-based Personality Trait Induction in Large Language Models [115.08894603023712]
Large language models (LLMs) have become increasingly proficient at simulating various personality traits.
We present a neuron-based approach for personality trait induction in LLMs.
arXiv Detail & Related papers (2024-10-16T07:47:45Z) - P-React: Synthesizing Topic-Adaptive Reactions of Personality Traits via Mixture of Specialized LoRA Experts [34.374681921626205]
We propose P-React, a mixture-of-experts (MoE)-based personalized large language model.
In particular, we integrate a Personality Loss (PSL) to better capture individual trait expressions.
To facilitate research in this field, we curate OCEAN-Chat, a high-quality, human-verified dataset.
arXiv Detail & Related papers (2024-06-18T12:25:13Z) - LLM vs Small Model? Large Language Model Based Text Augmentation Enhanced Personality Detection Model [58.887561071010985]
Personality detection aims to detect one's personality traits underlying social media posts.
Most existing methods learn post features directly by fine-tuning the pre-trained language models.
We propose a large language model (LLM) based text augmentation enhanced personality detection model.
arXiv Detail & Related papers (2024-03-12T12:10:18Z) - ControlLM: Crafting Diverse Personalities for Language Models [32.411304295746746]
We introduce ControlLM, which leverages differential activation patterns, derived from contrasting behavioral prompts in the model's latent space, to influence the model's personality traits at inference.
First, we demonstrate ControlLM's capacity to elicit diverse persona behaviors without any training, while precision control allows personality traits to closely match average human values.
We showcase improved reasoning and question answering through selective amplification of beneficial attributes like conscientiousness and friendliness.
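A plausible reading of "differential activation patterns derived from contrasting behavioral prompts" is a mean-difference steering direction: run prompts exhibiting opposite behaviors and subtract the averaged hidden states. The sketch below is an assumption about the general recipe, not ControlLM's exact procedure.

```python
import torch

@torch.no_grad()
def trait_direction(model, tokenizer, pos_prompts, neg_prompts, layer_idx):
    """Mean activation difference between contrasting behavioral prompts,
    usable as an inference-time steering direction for one trait."""
    def mean_hidden(prompts):
        acts = []
        for p in prompts:
            ids = tokenizer(p, return_tensors="pt").input_ids
            out = model(ids, output_hidden_states=True)
            # Average the chosen layer's activations over all tokens.
            acts.append(out.hidden_states[layer_idx].mean(dim=(0, 1)))
        return torch.stack(acts).mean(dim=0)
    return mean_hidden(pos_prompts) - mean_hidden(neg_prompts)

# e.g. pos = ["I am extremely conscientious. ..."],
#      neg = ["I am careless and disorganized. ..."]  (illustrative prompts)
```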
arXiv Detail & Related papers (2024-02-15T17:58:29Z) - Eliciting Personality Traits in Large Language Models [0.0]
Large Language Models (LLMs) are increasingly being utilized by both candidates and employers in the recruitment context.
This study seeks to obtain a better understanding of such models by examining their output variations based on different input prompts.
arXiv Detail & Related papers (2024-02-13T10:09:00Z) - LLMs Simulate Big Five Personality Traits: Further Evidence [51.13560635563004]
We analyze the personality traits simulated by Llama2, GPT4, and Mixtral.
This contributes to the broader understanding of the capabilities of LLMs to simulate personality traits.
arXiv Detail & Related papers (2024-01-31T13:45:25Z) - UPLex: Fine-Grained Personality Control in Large Language Models via Unsupervised Lexical Modulation [52.043831554626685]
Personality is a crucial factor that shapes human communication patterns, making the regulation of personality in large language models (LLMs) an important capability.
We propose UPLex, a method that uses an Unsupervisedly-Built Personalized Lexicon (UPL) during the decoding phase to manipulate an LLM's personality traits.
UPLex can be constructed from a newly built situational judgment test dataset in an unsupervised fashion and used to modulate the personality expression of LLMs.
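Lexicon-guided decoding of this sort is often implemented as a logits bias over the lexicon's token ids at each generation step. Here is a sketch under that assumption, using the Hugging Face LogitsProcessor interface; the lexicon contents and bias value are illustrative, not UPLex's actual procedure.

```python
import torch
from transformers import LogitsProcessor, LogitsProcessorList

class LexiconBias(LogitsProcessor):
    """Boost (or suppress) tokens from a personality lexicon at each
    decoding step, steering trait expression without retraining."""
    def __init__(self, token_ids, bias):
        self.token_ids = torch.tensor(token_ids)
        self.bias = bias  # > 0 encourages the trait's words, < 0 suppresses

    def __call__(self, input_ids, scores):
        scores[:, self.token_ids] += self.bias
        return scores

# Usage sketch, with ids being tokenizer ids of words in a trait lexicon:
# model.generate(..., logits_processor=LogitsProcessorList([LexiconBias(ids, 3.0)]))
```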
arXiv Detail & Related papers (2023-10-25T12:16:33Z) - Editing Personality for Large Language Models [73.59001811199823]
This paper introduces an innovative task focused on editing the personality traits of Large Language Models (LLMs).
We construct PersonalityEdit, a new benchmark dataset to address this task.
arXiv Detail & Related papers (2023-10-03T16:02:36Z) - Personality Traits in Large Language Models [42.31355340867784]
Personality is a key factor determining the effectiveness of communication.
We present a novel and comprehensive psychometrically valid and reliable methodology for administering and validating personality tests on widely used large language models.
We discuss the application and ethical implications of the measurement and shaping method, in particular regarding responsible AI.
arXiv Detail & Related papers (2023-07-01T00:58:51Z) - Evaluating and Inducing Personality in Pre-trained Language Models [78.19379997967191]
We draw inspiration from psychometric studies by leveraging human personality theory as a tool for studying machine behaviors.
To this end, we introduce the Machine Personality Inventory (MPI), a tool for studying machine behaviors.
MPI follows standardized personality tests, built upon the Big Five Personality Factors (Big Five) theory and personality assessment inventories.
We devise a Personality Prompting (P2) method to induce LLMs with specific personalities in a controllable way.
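Prompt-based induction like P2 amounts to expanding the target trait into descriptors and prepending them as a persona preamble; the exact chained prompts are in the paper, so the template below is only an illustrative stand-in.

```python
# Illustrative P2-style prompt construction (template wording is assumed,
# not taken from the paper): expand a target trait into descriptor words,
# then prepend them as a persona preamble before the user query.
TRAIT_DESCRIPTORS = {
    "extraversion": ["outgoing", "energetic", "talkative"],  # hypothetical
}

def p2_prompt(trait, query):
    words = ", ".join(TRAIT_DESCRIPTORS[trait])
    persona = f"You are a person who is {words}."
    return f"{persona}\nAnswer in a way consistent with this personality.\n\n{query}"

print(p2_prompt("extraversion", "How was your weekend?"))
```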
arXiv Detail & Related papers (2022-05-20T07:32:57Z)