NeuralWOZ: Learning to Collect Task-Oriented Dialogue via Model-Based
Simulation
- URL: http://arxiv.org/abs/2105.14454v1
- Date: Sun, 30 May 2021 07:54:54 GMT
- Title: NeuralWOZ: Learning to Collect Task-Oriented Dialogue via Model-Based
Simulation
- Authors: Sungdong Kim, Minsuk Chang and Sang-Woo Lee
- Abstract summary: We propose NeuralWOZ, a novel dialogue collection framework that uses model-based dialogue simulation.
Collector generates dialogues from (1) user's goal instructions, which are the user context and task constraints in natural language, and (2) system's API call results.
Labeler annotates the generated dialogue by formulating the annotation as a multiple-choice problem, in which the candidate labels are extracted from goal instructions and API call results.
- Score: 13.943378554273377
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We propose NeuralWOZ, a novel dialogue collection framework that uses
model-based dialogue simulation. NeuralWOZ has two pipelined models, Collector
and Labeler. Collector generates dialogues from (1) user's goal instructions,
which are the user context and task constraints in natural language, and (2)
system's API call results, which form a list of possible query responses for user
requests from the given knowledge base. Labeler annotates the generated
dialogue by formulating the annotation as a multiple-choice problem, in which
the candidate labels are extracted from goal instructions and API call results.
We demonstrate the effectiveness of the proposed method in zero-shot domain
transfer learning for dialogue state tracking. In the evaluation, the synthetic
dialogue corpus generated by NeuralWOZ achieves a new state of the art, with
improvements of 4.4 percentage points in joint goal accuracy on average across
domains and 5.7 percentage points in zero-shot coverage against the MultiWOZ 2.1
dataset.
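The Labeler's multiple-choice formulation can be illustrated with a minimal sketch. It is not the authors' implementation: the slot names, the `NOT_MENTIONED` placeholder, and the substring-based `overlap_score` are hypothetical stand-ins for a learned scoring model, and the candidate values are assumed to have already been extracted from the goal instructions and API call results.

```python
from typing import Callable, Dict, List

# Hypothetical placeholder label for slots the dialogue never fills.
NOT_MENTIONED = "none"


def overlap_score(dialogue: str, slot: str, candidate: str) -> float:
    """Toy scorer (assumption): stands in for a trained multiple-choice model.

    The slot argument is unused here; a learned scorer would condition on it.
    """
    if candidate == NOT_MENTIONED:
        return 0.5  # wins only when no other candidate appears in the dialogue
    return 1.0 if candidate.lower() in dialogue.lower() else 0.0


def label_dialogue(
    dialogue: str,
    candidates_by_slot: Dict[str, List[str]],
    score: Callable[[str, str, str], float] = overlap_score,
) -> Dict[str, str]:
    """Pick, for each slot, the highest-scoring option among its candidates."""
    state: Dict[str, str] = {}
    for slot, candidates in candidates_by_slot.items():
        options = [NOT_MENTIONED, *candidates]
        state[slot] = max(options, key=lambda c: score(dialogue, slot, c))
    return state


# Candidate values as they might be extracted from a goal instruction
# and API call results (illustrative data, not from the paper).
example_dialogue = (
    "User: I need a cheap hotel in the north. "
    "System: Alpha Lodge is available."
)
example_candidates = {
    "hotel-pricerange": ["cheap", "expensive"],
    "hotel-area": ["north"],
    "hotel-name": ["Alpha Lodge"],
    "hotel-parking": ["yes"],
}
print(label_dialogue(example_dialogue, example_candidates))
```

Restricting each slot to a small candidate set turns annotation into a selection problem, so the labeler never has to generate a value that is absent from the goal instructions or API results.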
Related papers
- DIONYSUS: A Pre-trained Model for Low-Resource Dialogue Summarization [127.714919036388]
DIONYSUS is a pre-trained encoder-decoder model for summarizing dialogues in any new domain.
Our experiments show that DIONYSUS outperforms existing methods on six datasets.
arXiv Detail & Related papers (2022-12-20T06:21:21Z)
- CGoDial: A Large-Scale Benchmark for Chinese Goal-oriented Dialog
Evaluation [75.60156479374416]
CGoDial is a new challenging and comprehensive Chinese benchmark for Goal-oriented Dialog evaluation.
It contains 96,763 dialog sessions and 574,949 dialog turns in total, covering three datasets with different knowledge sources.
To bridge the gap between academic benchmarks and spoken dialog scenarios, we either collect data from real conversations or add spoken features to existing datasets via crowd-sourcing.
arXiv Detail & Related papers (2022-11-21T16:21:41Z)
- GODEL: Large-Scale Pre-Training for Goal-Directed Dialog [119.1397031992088]
We introduce GODEL, a large pre-trained language model for dialog.
We show that GODEL outperforms state-of-the-art pre-trained dialog models in few-shot fine-tuning setups.
A novel feature of our evaluation methodology is the introduction of a notion of utility that assesses the usefulness of responses.
arXiv Detail & Related papers (2022-06-22T18:19:32Z)
- Quick Starting Dialog Systems with Paraphrase Generation [0.0]
We propose a method to reduce the cost and effort of creating new conversational agents by artificially generating more data from existing examples.
Our proposed approach can kick-start a dialog system with little human effort, and brings its performance to a level satisfactory enough for allowing actual interactions with real end-users.
arXiv Detail & Related papers (2022-04-06T02:35:59Z)
- Dialog Simulation with Realistic Variations for Training Goal-Oriented
Conversational Systems [14.206866126142002]
Goal-oriented dialog systems enable users to complete specific goals like requesting information about a movie or booking a ticket.
We propose an approach for automatically creating a large corpus of annotated dialogs from a few thoroughly annotated sample dialogs and the dialog schema.
We achieve an 18-50% relative accuracy improvement on a held-out test set compared to a baseline dialog generation approach.
arXiv Detail & Related papers (2020-11-16T19:39:15Z)
- RiSAWOZ: A Large-Scale Multi-Domain Wizard-of-Oz Dataset with Rich
Semantic Annotations for Task-Oriented Dialogue Modeling [35.75880078666584]
RiSAWOZ is a large-scale multi-domain Chinese Wizard-of-Oz dataset with Rich Semantic Annotations.
It contains 11.2K human-to-human (H2H) multi-turn semantically annotated dialogues, with more than 150K utterances spanning over 12 domains.
arXiv Detail & Related papers (2020-10-17T08:18:59Z)
- Dialogue Distillation: Open-Domain Dialogue Augmentation Using Unpaired
Data [61.71319905364992]
We propose a novel data augmentation method for training open-domain dialogue models by utilizing unpaired data.
A data-level distillation process is first proposed to construct augmented dialogues where both post and response are retrieved from the unpaired data.
A ranking module is employed to filter out low-quality dialogues.
A model-level distillation process is employed to distill a teacher model trained on high-quality paired data to augmented dialogue pairs.
arXiv Detail & Related papers (2020-09-20T13:06:38Z)
- Modeling Long Context for Task-Oriented Dialogue State Generation [51.044300192906995]
We propose a multi-task learning model with a simple yet effective utterance tagging technique and a bidirectional language model.
Our approaches attempt to solve the problem that the performance of the baseline significantly drops when the input dialogue context sequence is long.
In our experiments, our proposed model achieves a 7.03% relative improvement over the baseline, establishing a new state-of-the-art joint goal accuracy of 52.04% on the MultiWOZ 2.0 dataset.
arXiv Detail & Related papers (2020-04-29T11:02:25Z)
- Variational Hierarchical Dialog Autoencoder for Dialog State Tracking
Data Augmentation [59.174903564894954]
In this work, we extend this approach to the task of dialog state tracking for goal-oriented dialogs.
We propose the Variational Hierarchical Dialog Autoencoder (VHDA) for modeling the complete aspects of goal-oriented dialogs.
Experiments on various dialog datasets show that our model improves the downstream dialog trackers' robustness via generative data augmentation.
arXiv Detail & Related papers (2020-01-23T15:34:56Z)