Related papers: MCPDial: A Minecraft Persona-driven Dialogue Dataset

MCPDial: A Minecraft Persona-driven Dialogue Dataset

URL: http://arxiv.org/abs/2410.21627v1
Date: Tue, 29 Oct 2024 00:19:55 GMT
Title: MCPDial: A Minecraft Persona-driven Dialogue Dataset
Authors: Seyed Hossein Alavi, Sudha Rao, Ashutosh Adhikari, Gabriel A DesGarennes, Akanksha Malhotra, Chris Brockett, Mahmoud Adada, Raymond T. Ng, Vered Shwartz, Bill Dolan,
Abstract summary: We introduce the Minecraft Persona-driven Dialogue dataset (MCPDial) Starting with a small seed of expert-written conversations, we employ our method to generate hundreds of additional conversations. The conversations are long, allowing for in-depth and extensive interactions between the player and NPC.
Score: 22.420926356322568
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: We propose a novel approach that uses large language models (LLMs) to generate persona-driven conversations between Players and Non-Player Characters (NPC) in games. Showcasing the application of our methodology, we introduce the Minecraft Persona-driven Dialogue dataset (MCPDial). Starting with a small seed of expert-written conversations, we employ our method to generate hundreds of additional conversations. Each conversation in the dataset includes rich character descriptions of the player and NPC. The conversations are long, allowing for in-depth and extensive interactions between the player and NPC. MCPDial extends beyond basic conversations by incorporating canonical function calls (e.g. "Call find a resource on iron ore") between the utterances. Finally, we conduct a qualitative analysis of the dataset to assess its quality and characteristics.

Related papers

OmniCharacter: Towards Immersive Role-Playing Agents with Seamless Speech-Language Personality Interaction [123.89581506075461]
We propose OmniCharacter, a first seamless speech-language personality interaction model to achieve immersive RPAs with low latency.<n> Specifically, OmniCharacter enables agents to consistently exhibit role-specific personality traits and vocal traits throughout the interaction.<n>Our method yields better responses in terms of both content and style compared to existing RPAs and mainstream speech-language models, with a response latency as low as 289ms.
arXiv Detail & Related papers (2025-05-26T17:55:06Z)
Collaborative Storytelling and LLM: A Linguistic Analysis of Automatically-Generated Role-Playing Game Sessions [55.2480439325792]
Role-playing games (RPG) are games in which players interact with one another to create narratives. This emerging form of shared narrative, primarily oral, is receiving increasing attention. In this paper, we aim to discover to what extent the language of Large Language Models (LLMs) exhibit oral or written features when asked to generate an RPG session.
arXiv Detail & Related papers (2025-03-26T15:10:47Z)
What if Red Can Talk? Dynamic Dialogue Generation Using Large Language Models [0.0]
We introduce a dialogue filler framework that utilizes large language models (LLMs) to generate dynamic and contextually appropriate character interactions. We test this framework within the environments of Final Fantasy VII Remake and Pokemon. This study aims to assist developers in crafting more nuanced filler dialogues, thereby enriching player immersion and enhancing the overall RPG experience.
arXiv Detail & Related papers (2024-07-29T19:12:18Z)
LLM Roleplay: Simulating Human-Chatbot Interaction [52.03241266241294]
We propose a goal-oriented, persona-based method to automatically generate diverse multi-turn dialogues simulating human-chatbot interaction. Our method can simulate human-chatbot dialogues with a high indistinguishability rate.
arXiv Detail & Related papers (2024-07-04T14:49:46Z)
Capturing Minds, Not Just Words: Enhancing Role-Playing Language Models with Personality-Indicative Data [58.92110996840019]
We propose to enhance role-playing language models (RPLMs) via personality-indicative data. Specifically, we leverage questions from psychological scales and distill advanced RPAs to generate dialogues that grasp the minds of characters. Experimental results validate that RPLMs trained with our dataset exhibit advanced role-playing capabilities for both general and personality-related evaluations.
arXiv Detail & Related papers (2024-06-27T06:24:00Z)
Learning Retrieval Augmentation for Personalized Dialogue Generation [29.467644429517325]
This paper studies the potential of leveraging external knowledge for persona dialogue generation. Experiments conducted on the CONVAI2 dataset with ROCStory as a supplementary data source show that the proposed LAPDOG method substantially outperforms the baselines.
arXiv Detail & Related papers (2024-06-27T02:38:13Z)
PersonalityChat: Conversation Distillation for Personalized Dialog Modeling with Facts and Traits [5.447308344436046]
PersonalityChat is a synthetic conversational dataset based upon the popular PersonaChat dataset. We show that the personality trait labels can be used for trait-based personalization of generative dialogue models.
arXiv Detail & Related papers (2024-01-14T20:35:33Z)
Using Natural Language Inference to Improve Persona Extraction from Dialogue in a New Domain [44.05974724495336]
We introduce a natural language inference method for adapting a trained persona extraction model to a new setting. Our method returns higher-quality extracted persona and requires less human annotation.
arXiv Detail & Related papers (2024-01-12T18:25:03Z)
Faithful Persona-based Conversational Dataset Generation with Large Language Models [10.506653172302222]
High-quality conversational datasets are essential for developing AI models that can communicate with users. We propose a Generator-Critic architecture framework to expand the initial dataset, while improving the quality of its conversations. We release Synthetic-Persona-Chat, consisting of 20k conversations seeded from Persona-Chat.
arXiv Detail & Related papers (2023-12-15T18:23:50Z)
MPCHAT: Towards Multimodal Persona-Grounded Conversation [54.800425322314105]
We extend persona-based dialogue to the multimodal domain and make two main contributions. First, we present the first multimodal persona-based dialogue dataset named MPCHAT. Second, we empirically show that incorporating multimodal persona, as measured by three proposed multimodal persona-grounded dialogue tasks, leads to statistically significant performance improvements.
arXiv Detail & Related papers (2023-05-27T06:46:42Z)
Large Language Models Meet Harry Potter: A Bilingual Dataset for Aligning Dialogue Agents with Characters [70.84938803753062]
We introduce the Harry Potter Dialogue dataset, designed to advance the study of dialogue agents and character alignment. The dataset encompasses all dialogue sessions (in both English and Chinese) from the Harry Potter series. It is annotated with vital background information, including dialogue scenes, speakers, character relationships, and attributes.
arXiv Detail & Related papers (2022-11-13T10:16:39Z)
A Benchmark for Understanding and Generating Dialogue between Characters in Stories [75.29466820496913]
We present the first study to explore whether machines can understand and generate dialogue in stories. We propose two new tasks including Masked Dialogue Generation and Dialogue Speaker Recognition. We show the difficulty of the proposed tasks by testing existing models with automatic and manual evaluation on DialStory.
arXiv Detail & Related papers (2022-09-18T10:19:04Z)
CPED: A Large-Scale Chinese Personalized and Emotional Dialogue Dataset for Conversational AI [48.67259855309959]
Most existing datasets for conversational AI ignore human personalities and emotions. We propose CPED, a large-scale Chinese personalized and emotional dialogue dataset. CPED contains more than 12K dialogues of 392 speakers from 40 TV shows.
arXiv Detail & Related papers (2022-05-29T17:45:12Z)

This list is automatically generated from the titles and abstracts of the papers in this site.