Personality Editing for Language Models through Relevant Knowledge Editing
- URL: http://arxiv.org/abs/2502.11789v1
- Date: Mon, 17 Feb 2025 13:28:14 GMT
- Title: Personality Editing for Language Models through Relevant Knowledge Editing
- Authors: Seojin Hwang, Yumin Kim, Byeongjeong Kim, Hwanhee Lee
- Abstract summary: Large Language Models (LLMs) play a vital role in applications like conversational agents and content creation.
We introduce a novel method PALETTE that enhances personality control through knowledge editing.
- Score: 3.6152232645741025
- Abstract: Large Language Models (LLMs) play a vital role in applications like conversational agents and content creation, where controlling a model's personality is crucial for maintaining tone, consistency, and engagement. However, traditional prompt-based techniques for controlling personality often fall short, as they do not effectively mitigate the model's inherent biases. In this paper, we introduce a novel method PALETTE that enhances personality control through knowledge editing. By generating adjustment queries inspired by psychological assessments, our approach systematically adjusts responses to personality-related queries similar to modifying factual knowledge, thereby achieving controlled shifts in personality traits. Experimental results from both automatic and human evaluations demonstrate that our method enables more stable and well-balanced personality control in LLMs.
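The abstract does not spell out the editing procedure, but the general idea of treating a personality-related "adjustment query" like a factual edit can be illustrated with a ROME-style rank-one weight update. The sketch below is an assumption-laden toy, not PALETTE's published algorithm: the dimensions, the function name `rank_one_edit`, and the choice of a single MLP projection matrix are all illustrative.

```python
# Hypothetical sketch: treat a personality "adjustment query" like a factual edit
# and apply a rank-one update to one MLP projection matrix. This is NOT the
# published PALETTE algorithm, only an illustration of knowledge-editing-style
# weight adjustment under assumed toy dimensions.
import numpy as np

rng = np.random.default_rng(0)
d_in, d_out = 64, 64
W = rng.normal(scale=0.02, size=(d_out, d_in))   # assumed MLP projection weights

# k: hidden representation of the adjustment query (e.g. an item adapted from a
#    psychological assessment); v: representation of the desired trait-consistent answer.
k = rng.normal(size=d_in)
v = rng.normal(size=d_out)

def rank_one_edit(W, k, v):
    """Return W' such that W' @ k == v while perturbing W by only a rank-one term."""
    residual = v - W @ k
    return W + np.outer(residual, k) / (k @ k)

W_edited = rank_one_edit(W, k, v)
print(np.allclose(W_edited @ k, v))  # True: the edited layer maps the query key to the target value
```

In this framing, each adjustment query contributes one (key, value) pair, and the controlled personality shift comes from applying such minimal-perturbation edits rather than from prompting alone.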
Related papers
- Evaluating Personality Traits in Large Language Models: Insights from Psychological Questionnaires [3.6001840369062386]
This work applies psychological tools to Large Language Models in diverse scenarios to generate personality profiles.
Our findings reveal that LLMs exhibit unique traits, varying characteristics, and distinct personality profiles even within the same family of models.
arXiv Detail & Related papers (2025-02-07T16:12:52Z)
- Neuron-based Personality Trait Induction in Large Language Models [115.08894603023712]
Large language models (LLMs) have become increasingly proficient at simulating various personality traits.
We present a neuron-based approach for personality trait induction in LLMs.
arXiv Detail & Related papers (2024-10-16T07:47:45Z)
- PersLLM: A Personified Training Approach for Large Language Models [66.16513246245401]
We propose PersLLM, integrating psychology-grounded principles of personality: social practice, consistency, and dynamic development.
We incorporate personality traits directly into the model parameters, enhancing the model's resistance to induction, promoting consistency, and supporting the dynamic evolution of personality.
arXiv Detail & Related papers (2024-07-17T08:13:22Z)
- The Effects of Embodiment and Personality Expression on Learning in LLM-based Educational Agents [0.7499722271664147]
This work investigates how personality expression and embodiment affect personality perception and learning in educational conversational agents.
We extend an existing personality-driven conversational agent framework by integrating LLM-based conversation support tailored to an educational application.
For each personality style, we assess three models: (1) a dialogue-only model that conveys personality through dialogue, (2) an animated human model that expresses personality solely through dialogue, and (3) an animated human model that expresses personality through both dialogue and body and facial animations.
arXiv Detail & Related papers (2024-06-24T09:38:26Z)
- LLM vs Small Model? Large Language Model Based Text Augmentation Enhanced Personality Detection Model [58.887561071010985]
Personality detection aims to identify a person's personality traits from their social media posts.
Most existing methods learn post features directly by fine-tuning pre-trained language models.
We propose a large language model (LLM) based text augmentation enhanced personality detection model.
arXiv Detail & Related papers (2024-03-12T12:10:18Z)
- ControlLM: Crafting Diverse Personalities for Language Models [32.411304295746746]
We introduce ControlLM, which leverages differential activation patterns derived from contrasting behavioral prompts in the model's latent space to influence the model's personality traits at inference time (a minimal sketch of this style of inference-time steering appears after this list).
First, we demonstrate ControlLM's capacity to elicit diverse persona behaviors without any training, while precision control allows personality traits to closely match average human values.
We showcase improved reasoning and question answering through selective amplification of beneficial attributes like conscientiousness and friendliness.
arXiv Detail & Related papers (2024-02-15T17:58:29Z)
- Self-Alignment for Factuality: Mitigating Hallucinations in LLMs via Self-Evaluation [71.91287418249688]
Large language models (LLMs) often struggle with factual inaccuracies, even when they hold relevant knowledge.
We leverage the self-evaluation capability of an LLM to provide training signals that steer the model towards factuality.
We show that the proposed self-alignment approach substantially enhances the factual accuracy of Llama-family models across three key knowledge-intensive tasks.
arXiv Detail & Related papers (2024-02-14T15:52:42Z)
- PsyCoT: Psychological Questionnaire as Powerful Chain-of-Thought for Personality Detection [50.66968526809069]
We propose a novel personality detection method, called PsyCoT, which mimics the way individuals complete psychological questionnaires in a multi-turn dialogue manner.
Our experiments demonstrate that PsyCoT significantly improves the performance and robustness of GPT-3.5 in personality detection.
arXiv Detail & Related papers (2023-10-31T08:23:33Z)
- Editing Personality for Large Language Models [73.59001811199823]
This paper introduces an innovative task focused on editing the personality traits of Large Language Models (LLMs).
We construct PersonalityEdit, a new benchmark dataset to address this task.
arXiv Detail & Related papers (2023-10-03T16:02:36Z)
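The ControlLM entry above describes training-free, inference-time control via differential activation patterns. The following sketch shows the general activation-steering technique under explicit assumptions: GPT-2, layer 6, and a scalar `steering_strength` are illustrative choices, and the hook-based implementation is not claimed to match ControlLM's actual code.

```python
# Minimal sketch of inference-time activation steering (the general technique the
# ControlLM summary describes). GPT-2, the layer index, and `steering_strength`
# are assumptions for illustration, not the paper's setup.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"
tok = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name).eval()
layer_idx, steering_strength = 6, 4.0

def mean_hidden(prompt: str) -> torch.Tensor:
    """Mean hidden state at the output of block `layer_idx` for a behavioral prompt."""
    ids = tok(prompt, return_tensors="pt")
    with torch.no_grad():
        hs = model(**ids, output_hidden_states=True).hidden_states[layer_idx + 1]
    return hs.mean(dim=1).squeeze(0)

# Contrast two behavioral prompts to obtain a "personality" direction in latent space.
direction = (
    mean_hidden("I am extremely outgoing, cheerful, and talkative.")
    - mean_hidden("I am extremely quiet, reserved, and withdrawn.")
)

def steer_hook(module, inputs, output):
    # GPT-2 blocks return a tuple; add the steering vector to the residual stream.
    return (output[0] + steering_strength * direction,) + output[1:]

handle = model.transformer.h[layer_idx].register_forward_hook(steer_hook)
try:
    ids = tok("How do you usually spend your weekends?", return_tensors="pt")
    out = model.generate(**ids, max_new_tokens=30, do_sample=False,
                         pad_token_id=tok.eos_token_id)
    print(tok.decode(out[0], skip_special_tokens=True))
finally:
    handle.remove()
```

Because the steering vector is added only during the forward pass, removing the hook restores the base model unchanged, which is what makes this family of methods attractive for training-free personality control.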