SPACE-3: Unified Dialog Model Pre-training for Task-Oriented Dialog
Understanding and Generation
- URL: http://arxiv.org/abs/2209.06664v1
- Date: Wed, 14 Sep 2022 14:17:57 GMT
- Authors: Wanwei He, Yinpei Dai, Min Yang, Jian Sun, Fei Huang, Luo Si, Yongbin
Li
- Abstract summary: SPACE-3 is a novel unified semi-supervised pre-trained conversation model learning from large-scale dialog corpora.
It can be effectively fine-tuned on a wide range of downstream dialog tasks.
Results show that SPACE-3 achieves state-of-the-art performance on eight downstream dialog benchmarks.
- Score: 123.37377363355363
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Recently, pre-training methods have shown remarkable success in task-oriented
dialog (TOD) systems. However, most existing pre-trained models for TOD focus
on either dialog understanding or dialog generation, but not both. In this
paper, we propose SPACE-3, a novel unified semi-supervised pre-trained
conversation model learning from large-scale dialog corpora with limited
annotations, which can be effectively fine-tuned on a wide range of downstream
dialog tasks. Specifically, SPACE-3 consists of four successive components in a
single transformer to maintain a task-flow in TOD systems: (i) a dialog
encoding module to encode dialog history, (ii) a dialog understanding module to
extract semantic vectors from either user queries or system responses, (iii) a
dialog policy module to generate a policy vector that contains high-level
semantics of the response, and (iv) a dialog generation module to produce
appropriate responses. We design a dedicated pre-training objective for each
component. Concretely, we pre-train the dialog encoding module with span mask
language modeling to learn contextualized dialog information. To capture the
structured dialog semantics, we pre-train the dialog understanding module via a
novel tree-induced semi-supervised contrastive learning objective with the help
of extra dialog annotations. In addition, we pre-train the dialog policy module
by minimizing the L2 distance between its output policy vector and the semantic
vector of the response for policy optimization. Finally, the dialog generation
module is pre-trained by language modeling. Results show that SPACE-3 achieves
state-of-the-art performance on eight downstream dialog benchmarks, including
intent prediction, dialog state tracking, and end-to-end dialog modeling. We
also show that SPACE-3 has a stronger few-shot ability than existing models
under the low-resource setting.
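Of the four objectives above, the policy one is the most concrete: the abstract says the dialog policy module is pre-trained by minimizing the L2 distance between its output policy vector and the semantic vector of the response. The sketch below illustrates just that objective with toy NumPy vectors; the function name and vector shapes are illustrative assumptions, not the authors' implementation (in SPACE-3 both vectors come from a shared transformer).

```python
import numpy as np

def policy_l2_loss(policy_vec: np.ndarray, response_sem_vec: np.ndarray) -> float:
    """L2 distance between the policy module's output vector and the
    semantic vector of the response, as described in the abstract.
    Minimizing this pushes the predicted policy toward the high-level
    semantics of the gold response."""
    return float(np.linalg.norm(policy_vec - response_sem_vec))

# Toy stand-ins: random vectors of the same hidden size.
rng = np.random.default_rng(0)
hidden_size = 8
policy_vec = rng.normal(size=hidden_size)
response_sem_vec = rng.normal(size=hidden_size)

loss = policy_l2_loss(policy_vec, response_sem_vec)
assert loss >= 0.0  # an L2 distance is never negative
```

In practice this term would be summed with the span-MLM, contrastive-understanding, and generation language-modeling losses into one pre-training objective.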
Related papers
- Contextual Data Augmentation for Task-Oriented Dialog Systems [8.085645180329417]
We develop a novel dialog augmentation model that generates a user turn, conditioning on full dialog context.
With a new prompt design for the language model and output re-ranking, the dialogs generated by our model can be directly used to train downstream dialog systems.
arXiv Detail & Related papers (2023-10-16T13:22:34Z)
- DialogStudio: Towards Richest and Most Diverse Unified Dataset Collection for Conversational AI [92.29874802394167]
DialogStudio is the largest and most diverse collection of dialogue datasets.
Our collection encompasses data from open-domain dialogues, task-oriented dialogues, natural language understanding, conversational recommendation, dialogue summarization, and knowledge-grounded dialogues.
arXiv Detail & Related papers (2023-07-19T17:57:53Z)
- SPACE-2: Tree-Structured Semi-Supervised Contrastive Pre-training for Task-Oriented Dialog Understanding [68.94808536012371]
We propose a tree-structured pre-trained conversation model, which learns dialog representations from limited labeled dialogs and large-scale unlabeled dialog corpora.
Our method can achieve new state-of-the-art results on the DialoGLUE benchmark consisting of seven datasets and four popular dialog understanding tasks.
arXiv Detail & Related papers (2022-09-14T13:42:50Z)
- DialogVED: A Pre-trained Latent Variable Encoder-Decoder Model for Dialog Response Generation [80.45816053153722]
DialogVED introduces continuous latent variables into the enhanced encoder-decoder pre-training framework to increase the relevance and diversity of responses.
We conduct experiments on PersonaChat, DailyDialog, and DSTC7-AVSD benchmarks for response generation.
arXiv Detail & Related papers (2022-04-27T16:18:15Z)
- GALAXY: A Generative Pre-trained Model for Task-Oriented Dialog with Semi-Supervised Learning and Explicit Policy Injection [36.77204909711832]
We propose a novel pre-trained dialog model that explicitly learns dialog policy from limited labeled dialogs and large-scale unlabeled dialog corpora.
Specifically, we introduce a dialog act prediction task for policy optimization during pre-training and employ a consistency regularization term to refine the learned representation.
Empirical results show that GALAXY substantially improves the performance of task-oriented dialog systems.
arXiv Detail & Related papers (2021-11-29T15:24:36Z)
- Conversation Learner -- A Machine Teaching Tool for Building Dialog Managers for Task-Oriented Dialog Systems [57.082447660944965]
Conversation Learner is a machine teaching tool for building dialog managers.
It enables dialog authors to create a dialog flow using familiar tools, converting the dialog flow into a parametric model.
It allows dialog authors to improve the dialog manager over time by leveraging user-system dialog logs as training data.
arXiv Detail & Related papers (2020-04-09T00:10:54Z)
- Variational Hierarchical Dialog Autoencoder for Dialog State Tracking Data Augmentation [59.174903564894954]
In this work, we extend this approach to the task of dialog state tracking for goal-oriented dialogs.
We propose the Variational Hierarchical Dialog Autoencoder (VHDA) for modeling the complete aspects of goal-oriented dialogs.
Experiments on various dialog datasets show that our model improves the downstream dialog trackers' robustness via generative data augmentation.
arXiv Detail & Related papers (2020-01-23T15:34:56Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it provides and is not responsible for any consequences of its use.