Prompt Framework for Role-playing: Generation and Evaluation
- URL: http://arxiv.org/abs/2406.00627v2
- Date: Fri, 22 Nov 2024 06:19:35 GMT
- Title: Prompt Framework for Role-playing: Generation and Evaluation
- Authors: Xun Liu, Zhengwei Ni
- Abstract summary: Large language models (LLMs) exhibit impressive proficiency in natural language generation, understanding user instructions, and emulating human-like language use.
This project introduces a prompt-based framework designed to leverage GPT's capabilities for the generation of role-playing dialogue datasets.
- Score: 3.2845546753303867
- Abstract: Large language models (LLMs) exhibit impressive proficiency in natural language generation, understanding user instructions, and emulating human-like language use, which has led to significant interest in their application to role-playing scenarios. However, the manual collection of role-specific script data and the evaluation of model performance are resource-intensive processes. This project introduces a prompt-based framework designed to leverage GPT's capabilities for the generation of role-playing dialogue datasets and the evaluation of role-playing performance. To validate the effectiveness of the GPT-based generation and evaluation, we further incorporate the recall-oriented Rouge-L metric, providing an additional quantitative measure of performance.
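As a minimal illustration of the recall-oriented Rouge-L metric the abstract mentions, the sketch below computes the score from the longest common subsequence between a reference script line and a generated reply. The function names and whitespace tokenization are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch of recall-oriented ROUGE-L: LCS length over reference length.

def lcs_length(ref_tokens: list[str], hyp_tokens: list[str]) -> int:
    """Length of the longest common subsequence, via dynamic programming."""
    m, n = len(ref_tokens), len(hyp_tokens)
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if ref_tokens[i - 1] == hyp_tokens[j - 1]:
                dp[i][j] = dp[i - 1][j - 1] + 1
            else:
                dp[i][j] = max(dp[i - 1][j], dp[i][j - 1])
    return dp[m][n]

def rouge_l_recall(reference: str, hypothesis: str) -> float:
    """ROUGE-L recall: fraction of the reference recovered as a subsequence."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        return 0.0
    return lcs_length(ref, hyp) / len(ref)

# Example: compare a model's role-play reply against a script line.
print(rouge_l_recall("I shall avenge my father", "I will avenge my father"))  # 0.8
```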
Related papers
- Towards More Effective Table-to-Text Generation: Assessing In-Context Learning and Self-Evaluation with Open-Source Models [0.0]
This study explores the effectiveness of various in-context learning strategies in language models (LMs) across benchmark datasets.
We employ a large language model (LLM) self-evaluation approach using chain-of-thought reasoning and assess its correlation with human-aligned metrics like BERTScore.
Our findings highlight the significant impact of examples in improving table-to-text generation and suggest that, while LLM self-evaluation has potential, its current alignment with human judgment could be enhanced.
arXiv Detail & Related papers (2024-10-15T09:19:42Z)
- ERABAL: Enhancing Role-Playing Agents through Boundary-Aware Learning [17.5855800570993]
Role-playing is an emerging application in the field of Human-Computer Interaction (HCI).
Despite significant progress, role-playing agents (RPLAs) still struggle with maintaining role-consistency across conversations.
We present ERABAL, a framework aimed at enhancing RPLAs' role-playing capabilities through boundary-aware learning.
arXiv Detail & Related papers (2024-09-23T05:12:13Z)
- Systematic Task Exploration with LLMs: A Study in Citation Text Generation [63.50597360948099]
Large language models (LLMs) bring unprecedented flexibility in defining and executing complex, creative natural language generation (NLG) tasks.
We propose a three-component research framework that consists of systematic input manipulation, reference data, and output measurement.
We use this framework to explore citation text generation -- a popular scholarly NLP task that lacks consensus on the task definition and evaluation metric.
arXiv Detail & Related papers (2024-07-04T16:41:08Z)
- Evaluation of Instruction-Following Ability for Large Language Models on Story-Ending Generation [2.4889060833127665]
In this paper, we focus on evaluating the instruction-following ability of Large Language Models (LLMs) in the context of story-ending generation.
We propose an automatic evaluation pipeline that utilizes a machine reading comprehension (MRC) model to determine whether the generated story-ending reflects the instruction.
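As a hedged sketch of how such an MRC-based check might look, the snippet below queries an extractive question-answering model against the generated ending; the question wording, confidence threshold, and use of the default model are assumptions, not the paper's setup.

```python
# Sketch: use an extractive MRC (question-answering) model to check whether
# an instructed element is present in a generated story ending.
from transformers import pipeline

qa = pipeline("question-answering")  # default extractive MRC model (assumed)

generated_ending = "She finally forgave her brother and they rebuilt the farm together."
instruction_question = "Who did she forgive?"  # derived from the instruction (assumed)

result = qa(question=instruction_question, context=generated_ending)
follows_instruction = result["score"] > 0.5  # confidence threshold (assumed)
print(result["answer"], follows_instruction)
```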
arXiv Detail & Related papers (2024-06-24T06:53:36Z)
- Exploring Precision and Recall to assess the quality and diversity of LLMs [82.21278402856079]
We introduce a novel evaluation framework for Large Language Models (LLMs) such as Llama-2 and Mistral.
This approach allows for a nuanced assessment of the quality and diversity of generated text without the need for aligned corpora.
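One common way to instantiate distribution-level precision (quality) and recall (diversity) for generated text is via nearest-neighbor support estimates over embeddings. The sketch below illustrates that general idea and is not necessarily the authors' exact estimator; the embeddings are assumed to come from any sentence encoder.

```python
# Sketch: precision/recall over text embeddings via k-NN support estimation.
import numpy as np

def knn_radius(points: np.ndarray, k: int = 3) -> np.ndarray:
    """Distance from each point to its k-th nearest neighbor in the same set."""
    d = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
    d.sort(axis=1)
    return d[:, k]  # column 0 is the zero distance to itself

def coverage(queries: np.ndarray, support: np.ndarray, radii: np.ndarray) -> float:
    """Fraction of queries that fall inside some support point's k-NN ball."""
    d = np.linalg.norm(queries[:, None, :] - support[None, :, :], axis=-1)
    return float(np.mean((d <= radii[None, :]).any(axis=1)))

rng = np.random.default_rng(0)
real_emb = rng.normal(size=(200, 16))  # reference-text embeddings (placeholder)
gen_emb = rng.normal(size=(200, 16))   # generated-text embeddings (placeholder)

precision = coverage(gen_emb, real_emb, knn_radius(real_emb))  # quality
recall = coverage(real_emb, gen_emb, knn_radius(gen_emb))      # diversity
print(f"precision={precision:.2f} recall={recall:.2f}")
```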
arXiv Detail & Related papers (2024-02-16T13:53:26Z)
- Large Language Models are Superpositions of All Characters: Attaining Arbitrary Role-play via Self-Alignment [62.898963074989766]
We introduce Ditto, a self-alignment method for role-play.
This method creates a role-play training set comprising 4,000 characters, surpassing the scale of currently available datasets by tenfold.
We present the first comprehensive cross-supervision alignment experiment in the role-play domain.
arXiv Detail & Related papers (2024-01-23T03:56:22Z)
- RoleLLM: Benchmarking, Eliciting, and Enhancing Role-Playing Abilities of Large Language Models [107.00832724504752]
We introduce RoleLLM, a framework to benchmark, elicit, and enhance role-playing abilities in Large Language Models (LLMs).
Using Context-Instruct and RoleGPT, we create RoleBench, the first systematic and fine-grained character-level benchmark dataset for role-playing, comprising 168,093 samples.
arXiv Detail & Related papers (2023-10-01T17:52:59Z)
- Multi-Dimensional Evaluation of Text Summarization with In-Context Learning [79.02280189976562]
In this paper, we study the efficacy of large language models as multi-dimensional evaluators using in-context learning.
Our experiments show that in-context learning-based evaluators are competitive with learned evaluation frameworks for the task of text summarization.
We then analyze the effects of factors such as the selection and number of in-context examples on performance.
arXiv Detail & Related papers (2023-06-01T23:27:49Z)
- Large Language Models are Diverse Role-Players for Summarization Evaluation [82.31575622685902]
A document summary's quality can be assessed by human annotators on various criteria, both objective ones like grammar and correctness, and subjective ones like informativeness, succinctness, and appeal.
Most automatic evaluation methods, such as BLEU/ROUGE, may not adequately capture these dimensions.
We propose a new LLM-based framework that provides a comprehensive evaluation by comparing generated and reference text on both objective and subjective aspects.
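A minimal sketch of such LLM-based multi-aspect judging is given below, assuming the openai Python client with an API key in the environment; the model name, aspect list, and prompt wording are illustrative assumptions, not the authors' exact setup.

```python
# Sketch: prompt an LLM to rate a candidate summary on objective and
# subjective aspects against a reference.
from openai import OpenAI

client = OpenAI()

EVAL_PROMPT = """Rate the candidate summary against the reference on a 1-5 scale
for each aspect: grammar, correctness, informativeness, succinctness, appeal.
Reference: {reference}
Candidate: {candidate}
Answer as `aspect: score` lines."""

def llm_judge(reference: str, candidate: str, model: str = "gpt-4o-mini") -> str:
    resp = client.chat.completions.create(
        model=model,  # hypothetical choice of judge model
        messages=[{"role": "user", "content": EVAL_PROMPT.format(
            reference=reference, candidate=candidate)}],
    )
    return resp.choices[0].message.content

print(llm_judge("The cat sat on the mat.", "A cat was sitting on a mat."))
```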
arXiv Detail & Related papers (2023-03-27T10:40:59Z)
This list is automatically generated from the titles and abstracts of the papers on this site.