Knowledge-Grounded Conversational Data Augmentation with Generative
Conversational Networks
- URL: http://arxiv.org/abs/2207.11363v1
- Date: Fri, 22 Jul 2022 22:37:14 GMT
- Title: Knowledge-Grounded Conversational Data Augmentation with Generative
Conversational Networks
- Authors: Yen-Ting Lin, Alexandros Papangelis, Seokhwan Kim, Dilek Hakkani-Tur
- Abstract summary: We take a step towards automatically generating conversational data using Generative Conversational Networks.
We evaluate our approach on conversations with and without knowledge on the Topical Chat dataset.
- Score: 76.11480953550013
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: While rich, open-domain textual data are generally available and may include
interesting phenomena (humor, sarcasm, empathy, etc.), most are designed for
language processing tasks, and are usually in a non-conversational format. In
this work, we take a step towards automatically generating conversational data
using Generative Conversational Networks, aiming to benefit from the breadth of
available language and knowledge data, and train open domain social
conversational agents. We evaluate our approach on conversations with and
without knowledge on the Topical Chat dataset using automatic metrics and human
evaluators. Our results show that for conversations without knowledge
grounding, GCN can generalize from the seed data, producing novel conversations
that are less relevant but more engaging, and for knowledge-grounded
conversations, it can produce more knowledge-focused, fluent, and engaging
conversations. Specifically, we show that for open-domain conversations with
10% of seed data, our approach performs close to the baseline that uses 100%
of the data, while for knowledge-grounded conversations, it achieves the same
using only 1% of the data, on human ratings of engagingness, fluency, and
relevance.
Related papers
- Topical-Chat: Towards Knowledge-Grounded Open-Domain Conversations [8.03111197961603]
Building socialbots that can have deep, engaging open-domain conversations with humans is one of the grand challenges of artificial intelligence (AI).
We introduce Topical-Chat, a knowledge-grounded human-human conversation dataset where the underlying knowledge spans 8 broad topics and conversation partners don't have explicitly defined roles.
We also train several state-of-the-art encoder-decoder conversational models on Topical-Chat and perform automated and human evaluation for benchmarking.
arXiv Detail & Related papers (2023-08-23T08:33:14Z)
- AutoConv: Automatically Generating Information-seeking Conversations with Large Language Models [74.10293412011455]
We propose AutoConv for synthetic conversation generation.
Specifically, we formulate the conversation generation problem as a language modeling task.
We finetune an LLM with a few human conversations to capture the characteristics of the information-seeking process.
arXiv Detail & Related papers (2023-08-12T08:52:40Z)
- PK-Chat: Pointer Network Guided Knowledge Driven Generative Dialogue Model [79.64376762489164]
PK-Chat is a Pointer network guided generative dialogue model, incorporating a unified pretrained language model and a pointer network over knowledge graphs.
The words PK-Chat generates in the dialogue come from predictions over word lists and from direct predictions over the external knowledge graph.
Based on PK-Chat, a dialogue system is built for academic scenarios in the geosciences.
arXiv Detail & Related papers (2023-04-02T18:23:13Z)
- PLACES: Prompting Language Models for Social Conversation Synthesis [103.94325597273316]
We use a small set of expert-written conversations as in-context examples to synthesize a social conversation dataset using prompting.
We perform several thorough evaluations of our synthetic conversations compared to human-collected conversations.
arXiv Detail & Related papers (2023-02-07T05:48:16Z)
- Grounding in social media: An approach to building a chit-chat dialogue model [9.247397520986999]
Building open-domain dialogue systems capable of rich human-like conversational ability is one of the fundamental challenges in language generation.
Current work on knowledge-grounded dialogue generation primarily focuses on persona incorporation or searching a fact-based structured knowledge source such as Wikipedia.
Our method takes a broader and simpler approach, which aims to improve the raw conversation ability of the system by mimicking the human response behavior on social media.
arXiv Detail & Related papers (2022-06-12T09:01:57Z)
- KETOD: Knowledge-Enriched Task-Oriented Dialogue [77.59814785157877]
Existing studies in dialogue system research mostly treat task-oriented dialogue and chit-chat as separate domains.
We investigate how task-oriented dialogue and knowledge-grounded chit-chat can be effectively integrated into a single model.
arXiv Detail & Related papers (2022-05-11T16:01:03Z)
- Training Conversational Agents with Generative Conversational Networks [74.9941330874663]
We use Generative Conversational Networks to automatically generate data and train social conversational agents.
We evaluate our approach on TopicalChat with automatic metrics and human evaluators, showing that with 10% of seed data it performs close to the baseline that uses 100% of the data.
arXiv Detail & Related papers (2021-10-15T21:46:39Z)
- A Neural Conversation Generation Model via Equivalent Shared Memory Investigation [39.922967513749654]
We propose a novel reading and memory framework called Deep Reading Memory Network (DRMN).
DRMN is capable of remembering useful information of similar conversations for improving utterance generation.
We apply our model to two large-scale conversation datasets from the justice and e-commerce fields.
arXiv Detail & Related papers (2021-08-20T13:20:14Z)
- Summary Grounded Conversation Generation [10.470157142861174]
We show how pre-trained language models can be used to generate entire conversations, given only a summary of a conversation as the input.
We also show that the accuracy of conversation summarization can be improved by augmenting a conversation summarization dataset with generated conversations.
arXiv Detail & Related papers (2021-06-07T04:46:31Z)
This list is automatically generated from the titles and abstracts of the papers on this site.