Dialog Simulation with Realistic Variations for Training Goal-Oriented
Conversational Systems
- URL: http://arxiv.org/abs/2011.08243v1
- Date: Mon, 16 Nov 2020 19:39:15 GMT
- Title: Dialog Simulation with Realistic Variations for Training Goal-Oriented
Conversational Systems
- Authors: Chien-Wei Lin, Vincent Auvray, Daniel Elkind, Arijit Biswas, Maryam
Fazel-Zarandi, Nehal Belgamwar, Shubhra Chandra, Matt Zhao, Angeliki
Metallinou, Tagyoung Chung, Charlie Shucheng Zhu, Suranjit Adhikari, Dilek
Hakkani-Tur
- Abstract summary: Goal-oriented dialog systems enable users to complete specific goals like requesting information about a movie or booking a ticket.
We propose an approach for automatically creating a large corpus of annotated dialogs from a few thoroughly annotated sample dialogs and the dialog schema.
We achieve 18-50% relative accuracy improvements on a held-out test set compared to a baseline dialog generation approach.
- Score: 14.206866126142002
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Goal-oriented dialog systems enable users to complete specific goals like
requesting information about a movie or booking a ticket. Typically the dialog
system pipeline contains multiple ML models, including natural language
understanding, state tracking and action prediction (policy learning). These
models are trained through a combination of supervised or reinforcement
learning methods and therefore require collection of labeled domain specific
datasets. However, collecting annotated datasets with language and dialog-flow
variations is expensive, time-consuming and scales poorly due to human
involvement. In this paper, we propose an approach for automatically creating a
large corpus of annotated dialogs from a few thoroughly annotated sample
dialogs and the dialog schema. Our approach includes a novel goal-sampling
technique for sampling plausible user goals and a dialog simulation technique
that uses heuristic interplay between the user and the system (Alexa), where
the user tries to achieve the sampled goal. We validate our approach by
generating data and training three different downstream conversational ML
models. We achieve 18-50% relative accuracy improvements on a held-out test
set compared to a baseline dialog generation approach that only samples natural
language and entity value variations from existing catalogs but does not
generate any novel dialog flow variations. We also qualitatively establish that
the proposed approach is better than the baseline. Moreover, several different
conversational experiences have been built using this method, which enables
customers to have a wide variety of conversations with Alexa.
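The abstract describes two steps: sampling a plausible user goal, then simulating a heuristic exchange in which the user tries to achieve that goal. The block below is a minimal sketch of that general idea only, not the authors' implementation. The schema, slot names, catalog values, dialog acts, and the functions `sample_goal` and `simulate_dialog` are all hypothetical, chosen for illustration.

```python
import random

# Hypothetical dialog schema (an assumption, not from the paper):
# each slot maps to a small catalog of candidate values.
SCHEMA = {
    "movie": ["Inception", "Up", "Heat"],
    "date": ["today", "tomorrow"],
    "tickets": ["1", "2", "4"],
}
REQUIRED = ["movie", "date", "tickets"]


def sample_goal(rng):
    """Sample a user goal: one catalog value per required slot."""
    return {slot: rng.choice(SCHEMA[slot]) for slot in REQUIRED}


def simulate_dialog(goal, rng, max_turns=20):
    """Run a rule-based user/system exchange until the goal is fulfilled.

    Each turn is a (speaker, act, slots) tuple, so the generated dialog
    is annotated by construction.
    """
    state, turns = {}, []
    # The user opens by volunteering a random non-empty subset of slots,
    # which is what produces different dialog flows for the same goal.
    opening = rng.sample(REQUIRED, rng.randint(1, len(REQUIRED)))
    state.update({s: goal[s] for s in opening})
    turns.append(("user", "inform", {s: goal[s] for s in opening}))
    while len(turns) < max_turns:
        missing = [s for s in REQUIRED if s not in state]
        if not missing:
            # All slots known: the system completes the task.
            turns.append(("system", "book", dict(state)))
            break
        # The system requests one missing slot; the user answers from the goal.
        slot = rng.choice(missing)
        turns.append(("system", "request", {"slot": slot}))
        state[slot] = goal[slot]
        turns.append(("user", "inform", {slot: goal[slot]}))
    return turns


if __name__ == "__main__":
    rng = random.Random(0)
    goal = sample_goal(rng)
    for turn in simulate_dialog(goal, rng):
        print(turn)
```

Running the simulation repeatedly with different seeds yields dialogs that differ both in entity values and in turn order. That second kind of variation is what the abstract says the baseline lacks.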
Related papers
- In-Context Learning User Simulators for Task-Oriented Dialog Systems [1.7086737326992172]
This paper presents a novel application of large language models in user simulation for task-oriented dialog systems.
By harnessing the power of these models, the proposed approach generates diverse utterances based on user goals and limited dialog examples.
arXiv Detail & Related papers (2023-06-01T15:06:11Z)
- DIONYSUS: A Pre-trained Model for Low-Resource Dialogue Summarization [127.714919036388]
DIONYSUS is a pre-trained encoder-decoder model for summarizing dialogues in any new domain.
Our experiments show that DIONYSUS outperforms existing methods on six datasets.
arXiv Detail & Related papers (2022-12-20T06:21:21Z)
- CGoDial: A Large-Scale Benchmark for Chinese Goal-oriented Dialog
Evaluation [75.60156479374416]
CGoDial is a new challenging and comprehensive Chinese benchmark for Goal-oriented Dialog evaluation.
It contains 96,763 dialog sessions and 574,949 dialog turns in total, covering three datasets with different knowledge sources.
To bridge the gap between academic benchmarks and spoken dialog scenarios, we either collect data from real conversations or add spoken features to existing datasets via crowd-sourcing.
arXiv Detail & Related papers (2022-11-21T16:21:41Z)
- Controllable Dialogue Simulation with In-Context Learning [39.04491297557292]
Dialogic is a dialogue simulation method based on large language model in-context learning.
Our method can rapidly expand a small set of dialogue data with minimum or zero human involvement.
Our simulated dialogues have near-human fluency and annotation accuracy.
arXiv Detail & Related papers (2022-10-09T06:32:58Z)
- SPACE-3: Unified Dialog Model Pre-training for Task-Oriented Dialog
Understanding and Generation [123.37377363355363]
SPACE-3 is a novel unified semi-supervised pre-trained conversation model learning from large-scale dialog corpora.
It can be effectively fine-tuned on a wide range of downstream dialog tasks.
Results show that SPACE-3 achieves state-of-the-art performance on eight downstream dialog benchmarks.
arXiv Detail & Related papers (2022-09-14T14:17:57Z)
- SPACE-2: Tree-Structured Semi-Supervised Contrastive Pre-training for
Task-Oriented Dialog Understanding [68.94808536012371]
We propose a tree-structured pre-trained conversation model, which learns dialog representations from limited labeled dialogs and large-scale unlabeled dialog corpora.
Our method can achieve new state-of-the-art results on the DialoGLUE benchmark consisting of seven datasets and four popular dialog understanding tasks.
arXiv Detail & Related papers (2022-09-14T13:42:50Z)
- In-Context Learning for Few-Shot Dialogue State Tracking [55.91832381893181]
We propose an in-context (IC) learning framework for few-shot dialogue state tracking (DST).
A large pre-trained language model (LM) takes a test instance and a few annotated examples as input, and directly decodes the dialogue states without any parameter updates.
This makes the LM more flexible and scalable compared to prior few-shot DST work when adapting to new domains and scenarios.
arXiv Detail & Related papers (2022-03-16T11:58:24Z)
- NeuralWOZ: Learning to Collect Task-Oriented Dialogue via Model-Based
Simulation [13.943378554273377]
We propose NeuralWOZ, a novel dialogue collection framework that uses model-based dialogue simulation.
Collector generates dialogues from (1) user's goal instructions, which are the user context and task constraints in natural language, and (2) system's API call results.
Labeler annotates the generated dialogue by formulating the annotation as a multiple-choice problem, in which the candidate labels are extracted from goal instructions and API call results.
arXiv Detail & Related papers (2021-05-30T07:54:54Z)
- Alexa Conversations: An Extensible Data-driven Approach for Building
Task-oriented Dialogue Systems [21.98135285833616]
Traditional goal-oriented dialogue systems rely on various components such as natural language understanding, dialogue state tracking, policy learning and response generation.
We present a new approach for building goal-oriented dialogue systems that is scalable, as well as data efficient.
arXiv Detail & Related papers (2021-04-19T07:09:27Z)
- Variational Hierarchical Dialog Autoencoder for Dialog State Tracking
Data Augmentation [59.174903564894954]
In this work, we extend this approach to the task of dialog state tracking for goal-oriented dialogs.
We propose the Variational Hierarchical Dialog Autoencoder (VHDA) for modeling the complete aspects of goal-oriented dialogs.
Experiments on various dialog datasets show that our model improves the downstream dialog trackers' robustness via generative data augmentation.
arXiv Detail & Related papers (2020-01-23T15:34:56Z)