PLACES: Prompting Language Models for Social Conversation Synthesis
- URL: http://arxiv.org/abs/2302.03269v2
- Date: Wed, 8 Feb 2023 02:33:46 GMT
- Title: PLACES: Prompting Language Models for Social Conversation Synthesis
- Authors: Maximillian Chen, Alexandros Papangelis, Chenyang Tao, Seokhwan Kim,
Andy Rosenbaum, Yang Liu, Zhou Yu, Dilek Hakkani-Tur
- Abstract summary: We use a small set of expert-written conversations as in-context examples to synthesize a social conversation dataset using prompting.
We perform several thorough evaluations of our synthetic conversations compared to human-collected conversations.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Collecting high-quality conversational data can be very expensive for most
applications and infeasible for others due to privacy, ethical, or similar
concerns. A promising direction to tackle this problem is to generate synthetic
dialogues by prompting large language models. In this work, we use a small set
of expert-written conversations as in-context examples to synthesize a social
conversation dataset using prompting. We perform several thorough evaluations
of our synthetic conversations compared to human-collected conversations. This
includes various dimensions of conversation quality with human evaluation
directly on the synthesized conversations, and interactive human evaluation of
chatbots fine-tuned on the synthetically generated dataset. We additionally
demonstrate that this prompting approach is generalizable to multi-party
conversations, providing potential to create new synthetic data for multi-party
tasks. Our synthetic multi-party conversations were rated more favorably across
all measured dimensions compared to conversation excerpts sampled from a
human-collected multi-party dataset.
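The core recipe the abstract describes, few-shot prompting with a small set of expert-written conversations as in-context examples, can be sketched as follows. This is an illustrative reconstruction, not the authors' released code: the speaker tags, topic header, and prompt layout are assumptions, and the LLM call itself is left abstract.

```python
# Hypothetical sketch of the PLACES-style prompting setup: concatenate a few
# expert-written example dialogues, then open an unfinished dialogue on a new
# topic; the language model's completion becomes the synthetic conversation.
# All formatting details here (speaker names, header text) are assumptions.

def build_prompt(example_conversations, topic, speakers=("Alice", "Bob")):
    """Build a few-shot prompt from expert-written dialogues.

    example_conversations: list of {"topic": str, "turns": [(speaker, utterance)]}
    topic: subject for the new conversation to be synthesized.
    """
    blocks = []
    for ex in example_conversations:
        lines = "\n".join(f"{spk}: {utt}" for spk, utt in ex["turns"])
        blocks.append(f"The following is a conversation about {ex['topic']}.\n{lines}")
    # Unfinished final block: the model continues from the first speaker tag.
    blocks.append(f"The following is a conversation about {topic}.\n{speakers[0]}:")
    return "\n\n".join(blocks)

examples = [
    {
        "topic": "weekend plans",
        "turns": [
            ("Alice", "Any plans for the weekend?"),
            ("Bob", "I'm thinking of going hiking if the weather holds."),
        ],
    }
]

prompt = build_prompt(examples, topic="favorite books")
# `prompt` would then be sent to a large language model; the completion is
# parsed back into speaker turns to form one synthetic conversation.
```

Extending the same template with a third speaker tag is what makes the approach generalize to multi-party conversation synthesis, as the abstract notes.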
Related papers
- Faithful Persona-based Conversational Dataset Generation with Large Language Models
High-quality conversational datasets are essential for developing AI models that can communicate with users.
We propose a Generator-Critic architecture framework to expand the initial dataset, while improving the quality of its conversations.
We release Synthetic-Persona-Chat, consisting of 20k conversations seeded from Persona-Chat.
arXiv Detail & Related papers (2023-12-15T18:23:50Z)
- AutoConv: Automatically Generating Information-seeking Conversations with Large Language Models
We propose AutoConv for synthetic conversation generation.
Specifically, we formulate the conversation generation problem as a language modeling task.
We finetune an LLM with a few human conversations to capture the characteristics of the information-seeking process.
arXiv Detail & Related papers (2023-08-12T08:52:40Z)
- Does Collaborative Human-LM Dialogue Generation Help Information Extraction from Human Dialogues?
Problem-solving human dialogues in real applications can be much more complex than existing Wizard-of-Oz collections.
We introduce a human-in-the-loop dialogue generation framework capable of synthesizing realistic dialogues.
arXiv Detail & Related papers (2023-07-13T20:02:50Z)
- NatCS: Eliciting Natural Customer Support Dialogues
Existing task-oriented dialogue datasets are not representative of real customer support conversations.
We introduce NatCS, a multi-domain collection of spoken customer service conversations.
arXiv Detail & Related papers (2023-05-04T17:25:24Z)
- Talk the Walk: Synthetic Data Generation for Conversational Music Recommendation
We present TalkWalk, which generates realistic high-quality conversational data by leveraging encoded expertise in widely available item collections.
We generate over one million diverse conversations, exceeding the scale of human-collected datasets.
arXiv Detail & Related papers (2023-01-27T01:54:16Z)
- Knowledge-Grounded Conversational Data Augmentation with Generative Conversational Networks
We take a step towards automatically generating conversational data using Generative Conversational Networks.
We evaluate our approach on conversations with and without knowledge on the Topical Chat dataset.
arXiv Detail & Related papers (2022-07-22T22:37:14Z)
- Training Conversational Agents with Generative Conversational Networks
We use Generative Conversational Networks to automatically generate data and train social conversational agents.
We evaluate our approach on TopicalChat with automatic metrics and human evaluators, showing that with 10% of seed data it performs close to the baseline that uses 100% of the data.
arXiv Detail & Related papers (2021-10-15T21:46:39Z)
- Summary Grounded Conversation Generation
We show how pre-trained language models can be used to generate entire conversations, given only a summary of a conversation as the input.
We also show that the accuracy of conversation summarization can be improved by augmenting a conversation summarization dataset with generated conversations.
arXiv Detail & Related papers (2021-06-07T04:46:31Z)
- A Taxonomy of Empathetic Response Intents in Human Social Conversations
Open-domain conversational agents are becoming increasingly popular in the natural language processing community.
One of the challenges is enabling them to converse in an empathetic manner.
Current neural response generation methods rely solely on end-to-end learning from large scale conversation data to generate dialogues.
Recent work has shown the promise of combining dialogue act/intent modelling and neural response generation.
arXiv Detail & Related papers (2020-12-07T21:56:45Z)
This list is automatically generated from the titles and abstracts of the papers on this site.