IROTE: Human-like Traits Elicitation of Large Language Model via In-Context Self-Reflective Optimization
- URL: http://arxiv.org/abs/2508.08719v1
- Date: Tue, 12 Aug 2025 08:04:28 GMT
- Title: IROTE: Human-like Traits Elicitation of Large Language Model via In-Context Self-Reflective Optimization
- Authors: Yuzhuo Bai, Shitong Duan, Muhua Huang, Jing Yao, Zhenghao Liu, Peng Zhang, Tun Lu, Xiaoyuan Yi, Maosong Sun, Xing Xie
- Abstract summary: We propose IROTE, a novel in-context method for stable and transferable trait elicitation. We show that one single IROTE-generated self-reflection can induce LLMs' stable impersonation of the target trait across diverse downstream tasks.
- Score: 66.6349183886101
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Trained on various human-authored corpora, Large Language Models (LLMs) have demonstrated a certain capability of reflecting specific human-like traits (e.g., personality or values) by prompting, benefiting applications like personalized LLMs and social simulations. However, existing methods suffer from the superficial elicitation problem: LLMs can only be steered to mimic shallow and unstable stylistic patterns, failing to embody the desired traits precisely and consistently across diverse tasks like humans. To address this challenge, we propose IROTE, a novel in-context method for stable and transferable trait elicitation. Drawing on psychological theories suggesting that traits are formed through identity-related reflection, our method automatically generates and optimizes a textual self-reflection within prompts, which comprises self-perceived experience, to stimulate LLMs' trait-driven behavior. The optimization is performed by iteratively maximizing an information-theoretic objective that enhances the connections between LLMs' behavior and the target trait, while reducing noisy redundancy in reflection without any fine-tuning, leading to evocative and compact trait reflection. Extensive experiments across three human trait systems manifest that one single IROTE-generated self-reflection can induce LLMs' stable impersonation of the target trait across diverse downstream tasks beyond simple questionnaire answering, consistently outperforming existing strong baselines.
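The abstract describes an iterative loop: candidate self-reflections are scored by an information-theoretic objective that rewards trait-aligned behavior and penalizes redundancy, and the best candidate is kept without any fine-tuning. A minimal sketch of that loop, under heavy assumptions: the paper does not specify its objective or candidate-generation procedure, so `trait_alignment` and `redundancy` below are hypothetical keyword-based proxies, and candidates are drawn from a fixed pool rather than generated by an LLM.

```python
import random

def trait_alignment(reflection: str, trait_keywords: set) -> float:
    """Proxy for how strongly the reflection evokes the target trait
    (a stand-in for the paper's behavior-trait connection term)."""
    words = reflection.lower().split()
    return sum(w in trait_keywords for w in words) / max(len(words), 1)

def redundancy(reflection: str) -> float:
    """Proxy penalty for noisy repetition within the reflection."""
    words = reflection.lower().split()
    return 1.0 - len(set(words)) / max(len(words), 1)

def objective(reflection: str, trait_keywords: set, lam: float = 0.5) -> float:
    # Reward trait alignment, penalize redundancy (weights are illustrative).
    return trait_alignment(reflection, trait_keywords) - lam * redundancy(reflection)

def optimize_reflection(candidates, trait_keywords, steps=5, seed=0):
    """Iteratively keep the best-scoring candidate reflection. In the paper,
    new candidates would come from the LLM's own self-reflection; here we
    simply sample from a fixed pool to illustrate the selection loop."""
    rng = random.Random(seed)
    best = max(candidates, key=lambda c: objective(c, trait_keywords))
    for _ in range(steps):
        cand = rng.choice(candidates)
        if objective(cand, trait_keywords) > objective(best, trait_keywords):
            best = cand
    return best
```

The key property the sketch preserves is that optimization happens purely over the prompt text, selecting a reflection that is both evocative (high trait alignment) and compact (low redundancy), with no gradient updates to the model.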
Related papers
- Individual Turing Test: A Case Study of LLM-based Simulation Using Longitudinal Personal Data [54.145424717168794]
Large Language Models (LLMs) have demonstrated remarkable human-like capabilities, yet their ability to replicate a specific individual remains under-explored. This paper presents a case study to investigate LLM-based individual simulation with a volunteer-contributed archive of private messaging history spanning over ten years. We propose the "Individual Turing Test" to evaluate whether acquaintances of the volunteer can correctly identify which response in a multi-candidate pool most plausibly comes from the volunteer.
arXiv Detail & Related papers (2026-03-01T21:46:27Z) - Profile-LLM: Dynamic Profile Optimization for Realistic Personality Expression in LLMs [11.672385046863655]
PersonaPulse is a framework that iteratively enhances role-play prompts while integrating a situational response benchmark as a scoring tool. Quantitative evaluations demonstrate that the prompts generated by PersonaPulse outperform those of prior work. For certain personality traits, the extent of personality evocation can be partially controlled by pausing the optimization process.
arXiv Detail & Related papers (2025-11-25T02:31:40Z) - Activation-Space Personality Steering: Hybrid Layer Selection for Stable Trait Control in LLMs [10.99947795031516]
Large Language Models exhibit implicit personalities in their generation, but reliably controlling or aligning these traits to meet specific needs remains an open challenge. We propose a novel pipeline that extracts hidden state activations from transformer layers using the Big Five Personality Traits. Our findings reveal that personality traits occupy a low-rank shared subspace, and that these latent structures can be transformed into actionable mechanisms for effective steering.
arXiv Detail & Related papers (2025-10-29T05:56:39Z) - Mind the Gap: The Divergence Between Human and LLM-Generated Tasks [12.96670500625407]
We compare human task generation with that of an agent powered by large language models (LLMs). We find that human task generation is consistently influenced by psychological drivers, including personal values and cognitive style. We conclude that there is a core gap between the value-driven, embodied nature of human cognition and the statistical patterns of LLMs.
arXiv Detail & Related papers (2025-08-01T03:00:41Z) - Evaluating the Simulation of Human Personality-Driven Susceptibility to Misinformation with LLMs [0.18416014644193066]
Large language models (LLMs) make it possible to generate synthetic behavioural data at scale. We evaluate the capacity of LLM agents, conditioned on Big-Five profiles, to reproduce personality-based variation in susceptibility to misinformation.
arXiv Detail & Related papers (2025-06-30T08:16:07Z) - Self-Exploring Language Models: Active Preference Elicitation for Online Alignment [88.56809269990625]
We propose a bilevel objective optimistically biased towards potentially high-reward responses to actively explore out-of-distribution regions.
Our experimental results demonstrate that when fine-tuned on Zephyr-7B-SFT and Llama-3-8B-Instruct models, Self-Exploring Language Models (SELM) significantly boosts the performance on instruction-following benchmarks.
arXiv Detail & Related papers (2024-05-29T17:59:07Z) - Rethinking ChatGPT's Success: Usability and Cognitive Behaviors Enabled by Auto-regressive LLMs' Prompting [5.344199202349884]
We analyze the structure of modalities within both types of Large Language Models and six task-specific channels during deployment.
We examine the stimulation of diverse cognitive behaviors in LLMs through the adoption of free-form text and verbal contexts.
arXiv Detail & Related papers (2024-05-17T00:19:41Z) - Tuning-Free Accountable Intervention for LLM Deployment -- A Metacognitive Approach [55.613461060997004]
Large Language Models (LLMs) have catalyzed transformative advances across a spectrum of natural language processing tasks.
We propose an innovative metacognitive approach, dubbed CLEAR, to equip LLMs with capabilities for self-aware error identification and correction.
arXiv Detail & Related papers (2024-03-08T19:18:53Z) - Tailoring Personality Traits in Large Language Models via Unsupervisedly-Built Personalized Lexicons [42.66142331217763]
Personality plays a pivotal role in shaping human expression patterns.
Previous methods relied on fine-tuning large language models (LLMs) on specific corpora.
We employ a novel Unsupervisedly-Built Personalized Lexicon (UBPL) in a pluggable manner to manipulate personality traits.
arXiv Detail & Related papers (2023-10-25T12:16:33Z) - Self-RAG: Learning to Retrieve, Generate, and Critique through Self-Reflection [74.51523859064802]
We introduce a new framework called Self-Reflective Retrieval-Augmented Generation (Self-RAG).
Self-RAG enhances an LM's quality and factuality through retrieval and self-reflection.
It significantly outperforms state-of-the-art LLMs and retrieval-augmented models on a diverse set of tasks.
arXiv Detail & Related papers (2023-10-17T18:18:32Z) - SELF: Self-Evolution with Language Feedback [68.6673019284853]
SELF (Self-Evolution with Language Feedback) is a novel approach to advancing large language models.
It enables LLMs to self-improve through self-reflection, akin to human learning processes.
Our experiments in mathematics and general tasks demonstrate that SELF can enhance the capabilities of LLMs without human intervention.
arXiv Detail & Related papers (2023-10-01T00:52:24Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.