Character is Destiny: Can Large Language Models Simulate Persona-Driven Decisions in Role-Playing?
- URL: http://arxiv.org/abs/2404.12138v1
- Date: Thu, 18 Apr 2024 12:40:59 GMT
- Title: Character is Destiny: Can Large Language Models Simulate Persona-Driven Decisions in Role-Playing?
- Authors: Rui Xu, Xintao Wang, Jiangjie Chen, Siyu Yuan, Xinfeng Yuan, Jiaqing Liang, Zulong Chen, Xiaoqing Dong, Yanghua Xiao
- Abstract summary: We benchmark the ability of Large Language Models (LLMs) in persona-driven decision-making.
We investigate whether LLMs can predict characters' decisions when provided with the preceding stories in high-quality novels.
The results demonstrate that state-of-the-art LLMs exhibit promising capabilities in this task, yet there is substantial room for improvement.
- Score: 59.0123596591807
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Can Large Language Models substitute humans in making important decisions? Recent research has unveiled the potential of LLMs to role-play assigned personas, mimicking their knowledge and linguistic habits. However, imitative decision-making requires a more nuanced understanding of personas. In this paper, we benchmark the ability of LLMs in persona-driven decision-making. Specifically, we investigate whether LLMs can predict characters' decisions provided with the preceding stories in high-quality novels. Leveraging character analyses written by literary experts, we construct a dataset LIFECHOICE comprising 1,401 character decision points from 395 books. Then, we conduct comprehensive experiments on LIFECHOICE, with various LLMs and methods for LLM role-playing. The results demonstrate that state-of-the-art LLMs exhibit promising capabilities in this task, yet there is substantial room for improvement. Hence, we further propose the CHARMAP method, which achieves a 6.01% increase in accuracy via persona-based memory retrieval. We will make our datasets and code publicly available.
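Note: the abstract describes CHARMAP only at a high level (persona-based memory retrieval that improves decision prediction), so the sketch below is an illustrative assumption of that general recipe, not the authors' implementation. It ranks preceding story chunks against a character-plus-decision query using a simple bag-of-words cosine similarity and packs the top chunks into a role-play prompt; the chunking, the scorer, all function names, and the toy Pride and Prejudice example are placeholders, and the final LLM call is left as a stub.

```python
# Illustrative sketch (not the paper's CHARMAP code): persona-based memory
# retrieval for a character decision point. Story chunks are scored against
# a query built from the character name and the decision question, the top-k
# chunks become the "memories" in a role-play prompt, and that prompt would
# then be sent to whichever LLM is being benchmarked.

from collections import Counter
import math
import re


def tokenize(text: str) -> list[str]:
    return re.findall(r"[a-z']+", text.lower())


def cosine_similarity(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in set(a) & set(b))
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve_memories(chunks: list[str], query: str, k: int = 3) -> list[str]:
    """Return the k story chunks most similar to the persona/decision query."""
    q_vec = Counter(tokenize(query))
    ranked = sorted(chunks, key=lambda c: cosine_similarity(Counter(tokenize(c)), q_vec), reverse=True)
    return ranked[:k]


def build_decision_prompt(character: str, question: str, options: list[str], memories: list[str]) -> str:
    """Assemble a role-play prompt: retrieved persona memories plus the decision to make."""
    memory_block = "\n".join(f"- {m}" for m in memories)
    option_block = "\n".join(f"{chr(65 + i)}. {o}" for i, o in enumerate(options))
    return (
        f"You are {character}. Relevant events from your story so far:\n"
        f"{memory_block}\n\n"
        f"Decision point: {question}\n{option_block}\n"
        "Answer with the letter of the choice this character would make."
    )


if __name__ == "__main__":
    # Toy example with hand-written chunks standing in for a novel's preceding text.
    story_chunks = [
        "Elizabeth refuses Mr. Collins despite her mother's insistence.",
        "Elizabeth reads Darcy's letter and begins to question her first impressions.",
        "Lydia elopes with Wickham, throwing the family into disgrace.",
    ]
    memories = retrieve_memories(story_chunks, "Elizabeth Bennet decision about Darcy's second proposal")
    prompt = build_decision_prompt(
        "Elizabeth Bennet",
        "Darcy proposes a second time. What do you do?",
        ["Accept him", "Refuse him again"],
        memories,
    )
    print(prompt)  # stub: feed this prompt to an LLM to predict the character's decision
```

In practice the bag-of-words scorer would be replaced by dense embeddings over the book's chunks, and accuracy would be measured by comparing the model's chosen option against the character's actual decision in the novel.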
Related papers
- Two Tales of Persona in LLMs: A Survey of Role-Playing and Personalization [33.513689684998035]
The concept of persona, originally adopted in dialogue literature, has resurged as a promising framework for tailoring large language models to specific contexts.
To close the gap, we present a comprehensive survey to categorize the current state of the field.
arXiv Detail & Related papers (2024-06-03T10:08:23Z)
- Evaluating Character Understanding of Large Language Models via Character Profiling from Fictional Works [33.817319226631426]
Large language models (LLMs) have demonstrated impressive performance and spurred numerous AI applications.
The prerequisite for these role-playing agents (RPAs) lies in the capability of LLMs to understand characters from fictional works.
Previous efforts have evaluated this capability via basic classification tasks or characteristic imitation.
arXiv Detail & Related papers (2024-04-19T09:10:29Z)
- Can Language Models Recognize Convincing Arguments? [12.458437450959416]
Large language models (LLMs) have raised concerns about their potential to create and propagate convincing narratives.
We study their performance in detecting convincing arguments to gain insights into their persuasive capabilities.
arXiv Detail & Related papers (2024-03-31T17:38:33Z)
- On the Decision-Making Abilities in Role-Playing using Large Language Models [6.550638804145713]
Large language models (LLMs) are increasingly utilized for role-playing tasks.
This paper focuses on evaluating the decision-making abilities of LLMs post role-playing.
arXiv Detail & Related papers (2024-02-29T02:22:23Z)
- Large Language Models: A Survey [69.72787936480394]
Large Language Models (LLMs) have drawn a lot of attention due to their strong performance on a wide range of natural language tasks.
LLMs acquire their general-purpose language understanding and generation abilities by training billions of model parameters on massive amounts of text data.
arXiv Detail & Related papers (2024-02-09T05:37:09Z)
- Are Large Language Models Really Robust to Word-Level Perturbations? [68.60618778027694]
We propose a novel rational evaluation approach that leverages pre-trained reward models as diagnostic tools.
Longer conversations reveal a language model's comprehensive grasp of language, particularly its proficiency in understanding questions.
Our results demonstrate that LLMs frequently exhibit vulnerability to word-level perturbations that are commonplace in daily language usage.
arXiv Detail & Related papers (2023-09-20T09:23:46Z)
- Introspective Tips: Large Language Model for In-Context Decision Making [48.96711664648164]
We employ "Introspective Tips" to facilitate large language models (LLMs) in self-optimizing their decision-making.
Our method enhances the agent's performance in both few-shot and zero-shot learning situations.
Experiments involving over 100 games in TextWorld illustrate the superior performance of our approach.
arXiv Detail & Related papers (2023-05-19T11:20:37Z)
- Statistical Knowledge Assessment for Large Language Models [79.07989821512128]
Given varying prompts regarding a factoid question, can a large language model (LLM) reliably generate factually correct answers?
We propose KaRR, a statistical approach to assess factual knowledge for LLMs.
Our results reveal that the knowledge in LLMs with the same backbone architecture adheres to the scaling law, while tuning on instruction-following data sometimes compromises the model's capability to generate factually correct text reliably.
arXiv Detail & Related papers (2023-05-17T18:54:37Z)
- ElitePLM: An Empirical Study on General Language Ability Evaluation of Pretrained Language Models [78.08792285698853]
We present a large-scale empirical study on general language ability evaluation of pretrained language models (ElitePLM).
Our empirical results demonstrate that: (1) PLMs with varying training objectives and strategies are good at different ability tests; (2) fine-tuning PLMs in downstream tasks is usually sensitive to the data size and distribution; and (3) PLMs have excellent transferability between similar tasks.
arXiv Detail & Related papers (2022-05-03T14:18:10Z)
This list is automatically generated from the titles and abstracts of the papers on this site.