On the Decision-Making Abilities in Role-Playing using Large Language Models
- URL: http://arxiv.org/abs/2402.18807v1
- Date: Thu, 29 Feb 2024 02:22:23 GMT
- Title: On the Decision-Making Abilities in Role-Playing using Large Language Models
- Authors: Chenglei Shen and Guofu Xie and Xiao Zhang and Jun Xu
- Abstract summary: Large language models (LLMs) are increasingly utilized for role-playing tasks.
This paper focuses on evaluating the decision-making abilities of LLMs post role-playing.
- Score: 6.550638804145713
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Large language models (LLMs) are now increasingly utilized for role-playing
tasks, especially in impersonating domain-specific experts, primarily through
role-playing prompts. When interacting in real-world scenarios, the
decision-making abilities of a role significantly shape its behavioral
patterns. In this paper, we concentrate on evaluating the decision-making
abilities of LLMs post role-playing, thereby validating the efficacy of
role-playing. Our goal is to provide metrics and guidance for enhancing the
decision-making abilities of LLMs in role-playing tasks. Specifically, we first
use LLMs to generate virtual role descriptions corresponding to the 16
personality types of Myers-Briggs Type Indicator (abbreviated as MBTI)
representing a segmentation of the population. Then we design specific
quantitative operations to evaluate the decision-making abilities of LLMs post
role-playing from four aspects: adaptability, exploration$\&$exploitation
trade-off ability, reasoning ability, and safety. Finally, we analyze the
association between the performance of decision-making and the corresponding
MBTI types through GPT-4. Extensive experiments demonstrate stable differences
in the four aspects of decision-making abilities across distinct roles,
signifying a robust correlation between decision-making abilities and the roles
emulated by LLMs. These results underscore that LLMs can effectively
impersonate varied roles while embodying their genuine sociological
characteristics.
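The abstract names four probe dimensions but does not reproduce the quantitative operations here. As a hedged illustration only (not the paper's actual protocol), the exploration & exploitation trade-off of an agent is commonly scored by letting it play a multi-armed bandit and measuring cumulative regret; in the sketch below, `epsilon_greedy` is a simple stand-in for a role-played LLM policy, and all names are illustrative:

```python
import random

def bandit_regret(choose_arm, true_means, horizon=1000, seed=0):
    """Run one Bernoulli-bandit episode and return cumulative regret.

    choose_arm(history) -> arm index; history is a list of (arm, reward).
    """
    rng = random.Random(seed)
    best = max(true_means)
    history, regret = [], 0.0
    for _ in range(horizon):
        arm = choose_arm(history)
        reward = 1.0 if rng.random() < true_means[arm] else 0.0
        history.append((arm, reward))
        regret += best - true_means[arm]  # expected loss vs. the best arm
    return regret

def epsilon_greedy(epsilon, n_arms):
    """Stand-in for a role-played agent; epsilon sets how much it explores."""
    def choose(history):
        if not history or random.random() < epsilon:
            return random.randrange(n_arms)  # explore: pick a random arm
        # Exploit: pick the arm with the highest empirical mean so far.
        totals = [[0.0, 0] for _ in range(n_arms)]
        for arm, r in history:
            totals[arm][0] += r
            totals[arm][1] += 1
        means = [s / n if n else 0.0 for s, n in totals]
        return max(range(n_arms), key=lambda a: means[a])
    return choose

if __name__ == "__main__":
    means = [0.2, 0.5, 0.8]
    for eps in (0.05, 0.5):
        r = bandit_regret(epsilon_greedy(eps, len(means)), means)
        print(f"epsilon={eps}: cumulative regret = {r:.1f}")
```

A role that explores too much (high effective epsilon) accumulates more regret; comparing such curves across MBTI-conditioned roles would expose the behavioral differences the paper reports.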
Related papers
- Understanding the Role of LLMs in Multimodal Evaluation Benchmarks [77.59035801244278]
This paper investigates the role of the Large Language Model (LLM) backbone in Multimodal Large Language Models (MLLMs) evaluation.
Our study encompasses four diverse MLLM benchmarks and eight state-of-the-art MLLMs.
Key findings reveal that some benchmarks allow high performance even without visual inputs and up to 50% of error rates can be attributed to insufficient world knowledge in the LLM backbone.
arXiv Detail & Related papers (2024-10-16T07:49:13Z)
- Bias and Toxicity in Role-Play Reasoning [6.868242720276291]
Role-play in the Large Language Model (LLM) is a crucial technique that enables models to adopt specific perspectives.
We demonstrate that role-play also carries potential risks.
arXiv Detail & Related papers (2024-09-21T02:09:13Z)
- Thinking Before Speaking: A Role-playing Model with Mindset [0.6428333375712125]
Large Language Models (LLMs) are skilled at simulating human behaviors.
These models tend to perform poorly when confronted with knowledge that the assumed role does not possess.
We propose a Thinking Before Speaking (TBS) model in this paper.
arXiv Detail & Related papers (2024-09-14T02:41:48Z)
- Meta Reasoning for Large Language Models [58.87183757029041]
We introduce Meta-Reasoning Prompting (MRP), a novel and efficient system prompting method for large language models (LLMs).
MRP guides LLMs to dynamically select and apply different reasoning methods based on the specific requirements of each task.
We evaluate the effectiveness of MRP through comprehensive benchmarks.
arXiv Detail & Related papers (2024-06-17T16:14:11Z)
- Character is Destiny: Can Large Language Models Simulate Persona-Driven Decisions in Role-Playing? [59.0123596591807]
We benchmark the ability of Large Language Models in persona-driven decision-making.
We investigate whether LLMs can predict characters' decisions provided with the preceding stories in high-quality novels.
The results demonstrate that state-of-the-art LLMs exhibit promising capabilities in this task, yet there is substantial room for improvement.
arXiv Detail & Related papers (2024-04-18T12:40:59Z)
- Evaluating Interventional Reasoning Capabilities of Large Language Models [58.52919374786108]
Large language models (LLMs) can estimate causal effects under interventions on different parts of a system.
We conduct empirical analyses to evaluate whether LLMs can accurately update their knowledge of a data-generating process in response to an intervention.
We create benchmarks that span diverse causal graphs (e.g., confounding, mediation) and variable types, and enable a study of intervention-based reasoning.
arXiv Detail & Related papers (2024-04-08T14:15:56Z)
- Determinants of LLM-assisted Decision-Making [0.0]
Large Language Models (LLMs) provide multifaceted support in enhancing human decision-making processes.
This study provides a structural overview and detailed analysis of determinants impacting decision-making with LLM support.
Our findings can be seen as crucial for improving decision quality in human-AI collaboration.
arXiv Detail & Related papers (2024-02-27T10:24:50Z)
- Large Language Models are Superpositions of All Characters: Attaining Arbitrary Role-play via Self-Alignment [62.898963074989766]
We introduce Ditto, a self-alignment method for role-play.
This method creates a role-play training set comprising 4,000 characters, surpassing the scale of currently available datasets by tenfold.
We present the first comprehensive cross-supervision alignment experiment in the role-play domain.
arXiv Detail & Related papers (2024-01-23T03:56:22Z)
- RoleLLM: Benchmarking, Eliciting, and Enhancing Role-Playing Abilities of Large Language Models [107.00832724504752]
We introduce RoleLLM, a framework to benchmark, elicit, and enhance role-playing abilities in Large Language Models (LLMs).
Using Context-Instruct and RoleGPT, we create RoleBench, the first systematic and fine-grained character-level benchmark dataset for role-playing, with 168,093 samples.
arXiv Detail & Related papers (2023-10-01T17:52:59Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences.