Related papers: Character is Destiny: Can Role-Playing Language Agents Make Persona-Driven Decisions?

Character is Destiny: Can Role-Playing Language Agents Make Persona-Driven Decisions?

URL: http://arxiv.org/abs/2404.12138v2
Date: Mon, 18 Nov 2024 11:29:47 GMT
Title: Character is Destiny: Can Role-Playing Language Agents Make Persona-Driven Decisions?
Authors: Rui Xu, Xintao Wang, Jiangjie Chen, Siyu Yuan, Xinfeng Yuan, Jiaqing Liang, Zulong Chen, Xiaoqing Dong, Yanghua Xiao,
Abstract summary: We benchmark the ability of Large Language Models (LLMs) in persona-driven decision-making. We investigate whether LLMs can predict characters' decisions provided by the preceding stories in high-quality novels. The results demonstrate that state-of-the-art LLMs exhibit promising capabilities in this task, yet substantial room for improvement remains.
Score: 59.0123596591807
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Can Large Language Models (LLMs) simulate humans in making important decisions? Recent research has unveiled the potential of using LLMs to develop role-playing language agents (RPLAs), mimicking mainly the knowledge and tones of various characters. However, imitative decision-making necessitates a more nuanced understanding of personas. In this paper, we benchmark the ability of LLMs in persona-driven decision-making. Specifically, we investigate whether LLMs can predict characters' decisions provided by the preceding stories in high-quality novels. Leveraging character analyses written by literary experts, we construct a dataset LIFECHOICE comprising 1,462 characters' decision points from 388 books. Then, we conduct comprehensive experiments on LIFECHOICE, with various LLMs and RPLA methodologies. The results demonstrate that state-of-the-art LLMs exhibit promising capabilities in this task, yet substantial room for improvement remains. Hence, we further propose the CHARMAP method, which adopts persona-based memory retrieval and significantly advances RPLAs on this task, achieving 5.03% increase in accuracy.

Related papers

Beyond Profile: From Surface-Level Facts to Deep Persona Simulation in LLMs [50.0874045899661]
We introduce CharacterBot, a model designed to replicate both the linguistic patterns and distinctive thought processes of a character. Using Lu Xun as a case study, we propose four training tasks derived from his 17 essay collections. These include a pre-training task focused on mastering external linguistic structures and knowledge, as well as three fine-tuning tasks. We evaluate CharacterBot on three tasks for linguistic accuracy and opinion comprehension, demonstrating that it significantly outperforms the baselines on our adapted metrics.
arXiv Detail & Related papers (2025-02-18T16:11:54Z)
Potential and Perils of Large Language Models as Judges of Unstructured Textual Data [0.631976908971572]
This research investigates the effectiveness of LLM-as-judge models to evaluate the thematic alignment of summaries generated by other LLMs. Our findings reveal that while LLM-as-judge offer a scalable solution comparable to human raters, humans may still excel at detecting subtle, context-specific nuances.
arXiv Detail & Related papers (2025-01-14T14:49:14Z)
Evaluating Character Understanding of Large Language Models via Character Profiling from Fictional Works [33.817319226631426]
Large language models (LLMs) have demonstrated impressive performance and spurred numerous AI applications. The prerequisite for these RPAs lies in the capability of LLMs to understand characters from fictional works. Previous efforts have evaluated this capability via basic classification tasks or characteristic imitation.
arXiv Detail & Related papers (2024-04-19T09:10:29Z)
Can Language Models Recognize Convincing Arguments? [12.458437450959416]
Large language models (LLMs) have raised concerns about their potential to create and propagate convincing narratives. We study their performance in detecting convincing arguments to gain insights into their persuasive capabilities.
arXiv Detail & Related papers (2024-03-31T17:38:33Z)
FAC$^2$E: Better Understanding Large Language Model Capabilities by Dissociating Language and Cognition [56.76951887823882]
Large language models (LLMs) are primarily evaluated by overall performance on various text understanding and generation tasks. We present FAC$2$E, a framework for Fine-grAined and Cognition-grounded LLMs' Capability Evaluation.
arXiv Detail & Related papers (2024-02-29T21:05:37Z)
On the Decision-Making Abilities in Role-Playing using Large Language Models [6.550638804145713]
Large language models (LLMs) are increasingly utilized for role-playing tasks. This paper focuses on evaluating the decision-making abilities of LLMs post role-playing.
arXiv Detail & Related papers (2024-02-29T02:22:23Z)
Large Language Models: A Survey [69.72787936480394]
Large Language Models (LLMs) have drawn a lot of attention due to their strong performance on a wide range of natural language tasks. LLMs' ability of general-purpose language understanding and generation is acquired by training billions of model's parameters on massive amounts of text data.
arXiv Detail & Related papers (2024-02-09T05:37:09Z)
Exploring the Sensitivity of LLMs' Decision-Making Capabilities: Insights from Prompt Variation and Hyperparameters [6.00842499449049]
We study how Large Language Models respond to variations in prompts and hyper parameters. By experimenting on three OpenAI language models possessing different capabilities, we observe that the decision making abilities fluctuate based on the input prompts and temperature settings. Contrary to previous findings language models display a human-like exploration exploitation tradeoff after simple adjustments to the prompt.
arXiv Detail & Related papers (2023-12-29T05:19:11Z)
Aligning Large Language Models with Human: A Survey [53.6014921995006]
Large Language Models (LLMs) trained on extensive textual corpora have emerged as leading solutions for a broad array of Natural Language Processing (NLP) tasks. Despite their notable performance, these models are prone to certain limitations such as misunderstanding human instructions, generating potentially biased content, or factually incorrect information. This survey presents a comprehensive overview of these alignment technologies, including the following aspects.
arXiv Detail & Related papers (2023-07-24T17:44:58Z)
Introspective Tips: Large Language Model for In-Context Decision Making [48.96711664648164]
We employ Introspective Tips" to facilitate large language models (LLMs) in self-optimizing their decision-making. Our method enhances the agent's performance in both few-shot and zero-shot learning situations. Experiments involving over 100 games in TextWorld illustrate the superior performance of our approach.
arXiv Detail & Related papers (2023-05-19T11:20:37Z)
Statistical Knowledge Assessment for Large Language Models [79.07989821512128]
Given varying prompts regarding a factoid question, can a large language model (LLM) reliably generate factually correct answers? We propose KaRR, a statistical approach to assess factual knowledge for LLMs. Our results reveal that the knowledge in LLMs with the same backbone architecture adheres to the scaling law, while tuning on instruction-following data sometimes compromises the model's capability to generate factually correct text reliably.
arXiv Detail & Related papers (2023-05-17T18:54:37Z)

This list is automatically generated from the titles and abstracts of the papers in this site.