Exploring the Personality Traits of LLMs through Latent Features Steering
- URL: http://arxiv.org/abs/2410.10863v2
- Date: Sun, 16 Feb 2025 22:19:15 GMT
- Title: Exploring the Personality Traits of LLMs through Latent Features Steering
- Authors: Shu Yang, Shenzhe Zhu, Liang Liu, Lijie Hu, Mengdi Li, Di Wang
- Abstract summary: We investigate how factors, such as cultural norms and environmental stressors, encoded within large language models (LLMs) shape their personality traits. We propose a training-free approach to modify the model's behavior by extracting and steering latent features corresponding to factors within the model.
- Score: 12.142248881876355
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Large language models (LLMs) have significantly advanced dialogue systems and role-playing agents through their ability to generate human-like text. While prior studies have shown that LLMs can exhibit distinct and consistent personalities, the mechanisms through which these models encode and express specific personality traits remain poorly understood. To address this, we investigate how various factors, such as cultural norms and environmental stressors, encoded within LLMs, shape their personality traits, guided by the theoretical framework of social determinism. Inspired by related work on LLM interpretability, we propose a training-free approach to modify the model's behavior by extracting and steering latent features corresponding to factors within the model, thereby eliminating the need for retraining. Furthermore, we analyze the implications of these factors for model safety, focusing on their impact through the lens of personality.
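The training-free steering the abstract describes can be sketched as a difference-of-means edit on a model's hidden activations. The sketch below is a minimal illustration with synthetic activations; the factor data, hidden size, layer choice, and scaling coefficient are assumptions for demonstration, not the paper's actual setup.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical hidden states collected at one layer for prompts written
# under two contrasting "factors" (e.g. high- vs. low-stress framings).
# Shapes: (n_prompts, hidden_dim).
hidden_dim = 16
acts_factor = rng.normal(0.5, 1.0, size=(32, hidden_dim))
acts_baseline = rng.normal(0.0, 1.0, size=(32, hidden_dim))

def extract_steering_vector(pos, neg):
    """Unit-norm difference-of-means direction separating two activation sets."""
    v = pos.mean(axis=0) - neg.mean(axis=0)
    return v / np.linalg.norm(v)

def steer(hidden_state, direction, alpha):
    """Shift a hidden state along the latent direction at inference time."""
    return hidden_state + alpha * direction

direction = extract_steering_vector(acts_factor, acts_baseline)

h = rng.normal(size=hidden_dim)            # a fresh activation to modify
h_steered = steer(h, direction, alpha=4.0)  # no retraining involved

# Steering moves the activation's projection onto the factor direction
# by exactly alpha, since the direction is unit-norm.
print((h_steered - h) @ direction)
```

In practice the edit would be applied inside a forward hook at a chosen transformer layer rather than to a standalone vector, but the extract-then-add structure is the same.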
Related papers
- An Overview of Large Language Models for Statisticians [109.38601458831545]
Large Language Models (LLMs) have emerged as transformative tools in artificial intelligence (AI).
This paper explores potential areas where statisticians can make important contributions to the development of LLMs.
We focus on issues such as uncertainty quantification, interpretability, fairness, privacy, watermarking and model adaptation.
arXiv Detail & Related papers (2025-02-25T03:40:36Z) - Investigating the Zone of Proximal Development of Language Models for In-Context Learning [59.91708683601029]
We introduce a learning analytics framework to analyze the in-context learning (ICL) behavior of large language models (LLMs).
We adapt the Zone of Proximal Development (ZPD) theory to ICL, measuring the ZPD of LLMs based on model performance on individual examples.
Our findings reveal a series of intricate and multifaceted behaviors of ICL, providing new insights into understanding and leveraging this technique.
arXiv Detail & Related papers (2025-02-10T19:36:21Z) - Neuron-based Personality Trait Induction in Large Language Models [115.08894603023712]
Large language models (LLMs) have become increasingly proficient at simulating various personality traits.
We present a neuron-based approach for personality trait induction in LLMs.
arXiv Detail & Related papers (2024-10-16T07:47:45Z) - Cognitive LLMs: Towards Integrating Cognitive Architectures and Large Language Models for Manufacturing Decision-making [51.737762570776006]
LLM-ACTR is a novel neuro-symbolic architecture that provides human-aligned and versatile decision-making.
Our framework extracts and embeds knowledge of ACT-R's internal decision-making process as latent neural representations.
Our experiments on novel Design for Manufacturing tasks show both improved task performance as well as improved grounded decision-making capability.
arXiv Detail & Related papers (2024-08-17T11:49:53Z) - PersLLM: A Personified Training Approach for Large Language Models [66.16513246245401]
We propose PersLLM, integrating psychology-grounded principles of personality: social practice, consistency, and dynamic development.
We incorporate personality traits directly into the model parameters, enhancing the model's resistance to induction, promoting consistency, and supporting the dynamic evolution of personality.
arXiv Detail & Related papers (2024-07-17T08:13:22Z) - An LLM Feature-based Framework for Dialogue Constructiveness Assessment [8.87747076871578]
Research on dialogue constructiveness assessment focuses on (i) analysing conversational factors that influence individuals to take specific actions, win debates, change their perspectives, or broaden their open-mindedness, and (ii) predicting constructiveness outcomes following dialogues for such use cases.
These objectives can be achieved by training either interpretable feature-based models or neural models such as pre-trained language models.
We propose an LLM feature-based framework for dialogue constructiveness assessment that combines the strengths of feature-based and neural approaches.
arXiv Detail & Related papers (2024-06-20T22:10:52Z) - Is persona enough for personality? Using ChatGPT to reconstruct an agent's latent personality from simple descriptions [2.6080756513915824]
Personality, a fundamental aspect of human cognition, encompasses a range of traits that influence behaviors, thoughts, and emotions.
This paper explores the capabilities of large language models (LLMs) in reconstructing these complex cognitive attributes based only on simple descriptions containing socio-demographic and personality type information.
arXiv Detail & Related papers (2024-06-18T02:32:57Z) - Explaining Large Language Models Decisions Using Shapley Values [1.223779595809275]
Large language models (LLMs) have opened up exciting possibilities for simulating human behavior and cognitive processes.
However, the validity of utilizing LLMs as stand-ins for human subjects remains uncertain.
This paper presents a novel approach based on Shapley values to interpret LLM behavior and quantify the relative contribution of each prompt component to the model's output.
arXiv Detail & Related papers (2024-03-29T22:49:43Z) - PHAnToM: Persona-based Prompting Has An Effect on Theory-of-Mind Reasoning in Large Language Models [25.657579792829743]
We empirically evaluate how role-playing prompting influences Theory-of-Mind (ToM) reasoning capabilities.
We propose a mechanism whereby, beyond the inherent variance in the complexity of reasoning tasks, performance differences arise from socially motivated prompting differences.
arXiv Detail & Related papers (2024-03-04T17:34:34Z) - Characterizing Truthfulness in Large Language Model Generations with
Local Intrinsic Dimension [63.330262740414646]
We study how to characterize and predict the truthfulness of texts generated from large language models (LLMs).
We suggest investigating internal activations and quantifying LLM's truthfulness using the local intrinsic dimension (LID) of model activations.
arXiv Detail & Related papers (2024-02-28T04:56:21Z) - Is Cognition and Action Consistent or Not: Investigating Large Language Model's Personality [12.162460438332152]
We investigate the reliability of Large Language Models (LLMs) in professing human-like personality traits through responses to personality questionnaires.
Our goal is to evaluate the consistency between LLMs' professed personality inclinations and their actual "behavior".
We propose hypotheses for the observed results based on psychological theories and metrics.
arXiv Detail & Related papers (2024-02-22T16:32:08Z) - Systematic Biases in LLM Simulations of Debates [12.933509143906141]
We study the limitations of Large Language Models in simulating human interactions.
Our findings indicate a tendency for LLM agents to conform to the model's inherent social biases.
These results underscore the need for further research to develop methods that help agents overcome these biases.
arXiv Detail & Related papers (2024-02-06T14:51:55Z) - LLMs Simulate Big Five Personality Traits: Further Evidence [51.13560635563004]
We analyze the personality traits simulated by Llama2, GPT4, and Mixtral.
This contributes to the broader understanding of the capabilities of LLMs to simulate personality traits.
arXiv Detail & Related papers (2024-01-31T13:45:25Z) - Personality Traits in Large Language Models [44.908741466152215]
Personality is a key factor determining the effectiveness of communication.
We present a comprehensive method for administering and validating personality tests on widely-used large language models.
We discuss application and ethical implications of the measurement and shaping method, in particular regarding responsible AI.
arXiv Detail & Related papers (2023-07-01T00:58:51Z) - Revealing the structure of language model capabilities [4.037009782513272]
We analyzed data from 29 different large language models across 27 cognitive tasks.
Results reveal a consistent structure in the capabilities of different LLMs.
We suggest that benchmarks could be streamlined by focusing on tasks that tap into each broad model ability.
arXiv Detail & Related papers (2023-06-14T15:43:25Z) - Revisiting the Reliability of Psychological Scales on Large Language Models [62.57981196992073]
This study aims to determine the reliability of applying personality assessments to Large Language Models.
Analysis of 2,500 settings per model (GPT-3.5, GPT-4, Gemini-Pro, and LLaMA-3.1) reveals that these LLMs respond consistently to the Big Five Inventory.
arXiv Detail & Related papers (2023-05-31T15:03:28Z) - Influence of External Information on Large Language Models Mirrors Social Cognitive Patterns [51.622612759892775]
Social cognitive theory explains how people learn and acquire knowledge through observing others.
Recent years have witnessed the rapid development of large language models (LLMs).
LLMs, as AI agents, can observe external information, which shapes their cognition and behaviors.
arXiv Detail & Related papers (2023-05-08T16:10:18Z) - Evaluating and Inducing Personality in Pre-trained Language Models [78.19379997967191]
We draw inspiration from psychometric studies by leveraging human personality theory as a tool for studying machine behaviors.
We introduce the Machine Personality Inventory (MPI), a tool for studying machine behaviors.
MPI follows standardized personality tests, built upon the Big Five Personality Factors (Big Five) theory and personality assessment inventories.
We devise a Personality Prompting (P2) method to induce LLMs with specific personalities in a controllable way.
arXiv Detail & Related papers (2022-05-20T07:32:57Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.