Related papers: Test-Time-Matching: Decouple Personality, Memory, and Linguistic Style in LLM-based Role-Playing Language Agent

URL: http://arxiv.org/abs/2507.16799v2
Date: Wed, 23 Jul 2025 06:06:43 GMT
Title: Test-Time-Matching: Decouple Personality, Memory, and Linguistic Style in LLM-based Role-Playing Language Agent
Authors: Xiaoyu Zhan, Xinyu Fu, Hao Sun, Yuanqi Li, Jie Guo, Yanwen Guo
Abstract summary: Test-Time-Matching (TTM) is a training-free role-playing framework through test-time scaling and context engineering. Our framework involves a structured, three-stage generation pipeline that utilizes a character's decoupled personality, memory, and linguistic style for controlled role-playing. It achieves high-fidelity role-playing performance and also enables seamless combinations across diverse linguistic styles and even variations in personality and memory.
Score: 18.67432557362308
License: http://creativecommons.org/licenses/by/4.0/
Abstract: The rapid advancement of large language models (LLMs) has enabled role-playing language agents to demonstrate significant potential in various applications. However, relying solely on prompts and contextual inputs often proves insufficient for achieving deep immersion in specific roles, particularly well-known fictional or public figures. On the other hand, fine-tuning-based approaches face limitations due to the challenges associated with data collection and the computational resources required for training, thereby restricting their broader applicability. To address these issues, we propose Test-Time-Matching (TTM), a training-free role-playing framework through test-time scaling and context engineering. TTM uses LLM agents to automatically decouple a character's features into personality, memory, and linguistic style. Our framework involves a structured, three-stage generation pipeline that utilizes these features for controlled role-playing. It achieves high-fidelity role-playing performance and also enables seamless combinations across diverse linguistic styles and even variations in personality and memory. We evaluate our framework through human assessment, and the results demonstrate that our method achieves outstanding performance in generating expressive and stylistically consistent character dialogues.

Related papers

SpeechRole: A Large-Scale Dataset and Benchmark for Evaluating Speech Role-Playing Agents [52.29009595100625]
Role-playing agents have emerged as a promising paradigm for achieving personalized interaction and emotional resonance. Existing research primarily focuses on the textual modality, neglecting the critical dimension of speech in realistic interactive scenarios. We construct SpeechRole-Data, a large-scale, high-quality dataset that comprises 98 diverse roles and 112k speech-based single-turn and multi-turn conversations.
arXiv Detail & Related papers (2025-08-04T03:18:36Z)
RMTBench: Benchmarking LLMs Through Multi-Turn User-Centric Role-Playing [111.06936588273868]
RMTBench is a comprehensive user-centric bilingual role-playing benchmark featuring 80 diverse characters and over 8,000 dialogue rounds. Our benchmark constructs dialogues based on explicit user motivations rather than character descriptions, ensuring alignment with practical user applications. By shifting focus from character background to user intention fulfillment, RMTBench bridges the gap between academic evaluation and practical deployment requirements.
arXiv Detail & Related papers (2025-07-27T16:49:47Z)
A Personalized Conversational Benchmark: Towards Simulating Personalized Conversations [112.81207927088117]
PersonaConvBench is a benchmark for evaluating personalized reasoning and generation in multi-turn conversations with large language models (LLMs). We benchmark several commercial and open-source LLMs under a unified prompting setup and observe that incorporating personalized history yields substantial performance improvements.
arXiv Detail & Related papers (2025-05-20T09:13:22Z)
Enhancing Persona Consistency for LLMs' Role-Playing using Persona-Aware Contrastive Learning [7.836439251883518]
We propose a novel framework named Persona-Aware Contrastive Learning (PCL) to align model role-playing behavior. We show that PCL significantly outperforms vanilla LLMs under automatic evaluation methods and human expert evaluation.
arXiv Detail & Related papers (2025-03-22T06:12:34Z)
A Multi-Task Role-Playing Agent Capable of Imitating Character Linguistic Styles [28.237927070779925]
Current Role-Playing Agents (RPAs) predominantly focus on mimicking a character's fundamental attributes while neglecting the replication of linguistic style. We develop StyleRPA, a Multi-Task Role-Playing Agent (MRPA) that significantly outperforms recent open-source LLM and RPA baselines on 7 tasks including Dialogue, Dictionary, Composition, Story Generation, Product Description, Music Commentary, and Open Question Answering.
arXiv Detail & Related papers (2024-11-04T02:26:27Z)
Capturing Minds, Not Just Words: Enhancing Role-Playing Language Models with Personality-Indicative Data [58.92110996840019]
We propose to enhance role-playing language models (RPLMs) via personality-indicative data. Specifically, we leverage questions from psychological scales and distill advanced RPAs to generate dialogues that grasp the minds of characters. Experimental results validate that RPLMs trained with our dataset exhibit advanced role-playing capabilities for both general and personality-related evaluations.
arXiv Detail & Related papers (2024-06-27T06:24:00Z)
Role-playing Prompt Framework: Generation and Evaluation [3.2845546753303867]
Large language models (LLMs) exhibit impressive proficiency in natural language generation, understanding user instructions, and emulating human-like language use. This paper introduces a prompt-based framework designed to leverage GPT's capabilities for the generation of role-playing dialogue datasets.
arXiv Detail & Related papers (2024-06-02T06:09:56Z)
Large Language Models are Superpositions of All Characters: Attaining Arbitrary Role-play via Self-Alignment [62.898963074989766]
We introduce Ditto, a self-alignment method for role-play. This method creates a role-play training set comprising 4,000 characters, surpassing the scale of currently available datasets by tenfold. We present the first comprehensive cross-supervision alignment experiment in the role-play domain.
arXiv Detail & Related papers (2024-01-23T03:56:22Z)
LMRL Gym: Benchmarks for Multi-Turn Reinforcement Learning with Language Models [56.25156596019168]
This paper introduces the LMRL-Gym benchmark for evaluating multi-turn RL for large language models (LLMs). Our benchmark consists of 8 different language tasks, which require multiple rounds of language interaction and cover a range of tasks in open-ended dialogue and text games.
arXiv Detail & Related papers (2023-11-30T03:59:31Z)

This list is automatically generated from the titles and abstracts of the papers on this site.