UBAR: Towards Fully End-to-End Task-Oriented Dialog Systems with GPT-2
- URL: http://arxiv.org/abs/2012.03539v2
- Date: Thu, 18 Mar 2021 02:34:26 GMT
- Title: UBAR: Towards Fully End-to-End Task-Oriented Dialog Systems with GPT-2
- Authors: Yunyi Yang, Yunhao Li, Xiaojun Quan
- Abstract summary: UBAR is acquired by fine-tuning the large pre-trained unidirectional language model GPT-2 on the sequence of the entire dialog session.
UBAR achieves state-of-the-art performances in multiple settings, improving the combined score of response generation, policy optimization, and end-to-end modeling by 4.7, 3.5, and 9.4 points respectively.
- Score: 10.994360742583261
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper presents our task-oriented dialog system UBAR which models
task-oriented dialogs on a dialog session level. Specifically, UBAR is acquired
by fine-tuning the large pre-trained unidirectional language model GPT-2 on the
sequence of the entire dialog session which is composed of user utterance,
belief state, database result, system act, and system response of every dialog
turn. Additionally, UBAR is evaluated in a more realistic setting, where its
dialog context has access to user utterances and all content it generated such
as belief states, system acts, and system responses. Experimental results on
the MultiWOZ datasets show that UBAR achieves state-of-the-art performances in
multiple settings, improving the combined score of response generation, policy
optimization, and end-to-end modeling by 4.7, 3.5, and 9.4 points respectively.
Thorough analyses demonstrate that the session-level training sequence
formulation and the generated dialog context are essential for UBAR to operate
as a fully end-to-end task-oriented dialog system in real life. We also examine
the transfer ability of UBAR to new domains with limited data and provide
visualization and a case study to illustrate the advantages of UBAR in modeling
on a dialog session level.
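The session-level formulation described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the `<sos_*>`/`<eos_*>` marker tokens, the whitespace tokenization, and the `generate` and `query_db` callables are all hypothetical stand-ins. `flatten_session` builds one training sequence from the five per-turn components, and `run_session` mimics the evaluation setting in which the context keeps the content the model itself generated.

```python
from typing import Callable, Dict, List

# Order of the five per-turn components named in the abstract.
FIELDS = ["user", "belief", "db", "act", "resp"]


def wrap(field: str, text: str) -> List[str]:
    """Surround one component with hypothetical start/end marker tokens."""
    return [f"<sos_{field}>", *text.split(), f"<eos_{field}>"]


def flatten_session(turns: List[Dict[str, str]]) -> List[str]:
    """Concatenate every component of every turn into one training sequence."""
    tokens: List[str] = []
    for turn in turns:
        for field in FIELDS:
            tokens.extend(wrap(field, turn[field]))
    return tokens


def run_session(
    user_utterances: List[str],
    generate: Callable[[List[str], str], str],  # hypothetical model call
    query_db: Callable[[str], str],             # hypothetical database lookup
) -> List[str]:
    """Roll out a session whose context keeps the model's own outputs."""
    context: List[str] = []
    for utterance in user_utterances:
        context.extend(wrap("user", utterance))
        # Each generated component is appended to the context before the
        # next one is produced, so later turns condition on generated
        # content rather than on ground-truth annotations.
        belief = generate(context, "belief")
        context.extend(wrap("belief", belief))
        context.extend(wrap("db", query_db(belief)))
        act = generate(context, "act")
        context.extend(wrap("act", act))
        resp = generate(context, "resp")
        context.extend(wrap("resp", resp))
    return context


# A made-up single-turn session, used only to show the flattened layout.
example_session = [
    {
        "user": "i need a cheap hotel",
        "belief": "hotel price cheap",
        "db": "3 matches",
        "act": "hotel request area",
        "resp": "which area do you prefer ?",
    }
]
example_sequence = flatten_session(example_session)
```

In an actual system the flattened sequence would be tokenized and used to fine-tune GPT-2 with a language-modeling objective, and `generate` would decode from the fine-tuned model; both steps are left out here.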
Related papers
- VDialogUE: A Unified Evaluation Benchmark for Visually-grounded Dialogue [70.64560638766018]
We propose VDialogUE, a Visually-grounded Dialogue benchmark for Unified Evaluation.
It defines five core multi-modal dialogue tasks and covers six datasets.
We also present a straightforward yet efficient baseline model, named VISIT (VISually-grounded dIalog Transformer), to promote the advancement of
arXiv Detail & Related papers (2023-09-14T02:09:20Z)
- FCC: Fusing Conversation History and Candidate Provenance for Contextual Response Ranking in Dialogue Systems [53.89014188309486]
We present a flexible neural framework that can integrate contextual information from multiple channels.
We evaluate our model on the MSDialog dataset widely used for evaluating conversational response ranking tasks.
arXiv Detail & Related papers (2023-03-31T23:58:28Z)
- CGoDial: A Large-Scale Benchmark for Chinese Goal-oriented Dialog Evaluation [75.60156479374416]
CGoDial is a new challenging and comprehensive Chinese benchmark for Goal-oriented Dialog evaluation.
It contains 96,763 dialog sessions and 574,949 dialog turns in total, covering three datasets with different knowledge sources.
To bridge the gap between academic benchmarks and spoken dialog scenarios, we either collect data from real conversations or add spoken features to existing datasets via crowd-sourcing.
arXiv Detail & Related papers (2022-11-21T16:21:41Z)
- Dialog Acts for Task-Driven Embodied Agents [10.275619475149433]
Embodied agents need to be able to interact in natural language, understanding task descriptions and asking appropriate follow-up questions.
We propose a set of dialog acts for modelling such dialogs and annotate the TEACh dataset that includes over 3,000 situated, task oriented conversations.
We demonstrate the use of this annotated dataset in training models for tagging the dialog acts of a given utterance, predicting the dialog act of the next response given a dialog history, and using the dialog acts to guide the agent's non-dialog behaviour.
arXiv Detail & Related papers (2022-09-26T18:41:28Z)
- SPACE-3: Unified Dialog Model Pre-training for Task-Oriented Dialog Understanding and Generation [123.37377363355363]
SPACE-3 is a novel unified semi-supervised pre-trained conversation model learning from large-scale dialog corpora.
It can be effectively fine-tuned on a wide range of downstream dialog tasks.
Results show that SPACE-3 achieves state-of-the-art performance on eight downstream dialog benchmarks.
arXiv Detail & Related papers (2022-09-14T14:17:57Z)
- SPACE-2: Tree-Structured Semi-Supervised Contrastive Pre-training for Task-Oriented Dialog Understanding [68.94808536012371]
We propose a tree-structured pre-trained conversation model, which learns dialog representations from limited labeled dialogs and large-scale unlabeled dialog corpora.
Our method can achieve new state-of-the-art results on the DialoGLUE benchmark consisting of seven datasets and four popular dialog understanding tasks.
arXiv Detail & Related papers (2022-09-14T13:42:50Z)
- Interactive Evaluation of Dialog Track at DSTC9 [8.2208199207543]
The Interactive Evaluation of Dialog Track was introduced at the 9th Dialog System Technology Challenge.
This paper provides an overview of the track, including the methodology and results.
arXiv Detail & Related papers (2022-07-28T22:54:04Z)
- GALAXY: A Generative Pre-trained Model for Task-Oriented Dialog with Semi-Supervised Learning and Explicit Policy Injection [36.77204909711832]
We propose a novel pre-trained dialog model that explicitly learns dialog policy from limited labeled dialogs and large-scale unlabeled dialog corpora.
Specifically, we introduce a dialog act prediction task for policy optimization during pre-training and employ a consistency regularization term to refine the learned representation.
Empirical results show that GALAXY substantially improves the performance of task-oriented dialog systems.
arXiv Detail & Related papers (2021-11-29T15:24:36Z)
- Overview of the Ninth Dialog System Technology Challenge: DSTC9 [111.35889309106359]
The Ninth Dialog System Technology Challenge (DSTC-9) focuses on applying end-to-end dialog technologies for four distinct tasks in dialog systems.
This paper describes the task definition, provided datasets, baselines and evaluation set-up for each track.
arXiv Detail & Related papers (2020-11-12T16:43:10Z)
- SUMBT+LaRL: Effective Multi-domain End-to-end Neural Task-oriented Dialog System [6.73550057218157]
We present an effective multi-domain end-to-end trainable neural dialog system SUMBT+LaRL.
Specifically, the SUMBT+ estimates user-acts as well as dialog belief states, and the LaRL models latent system action spaces and generates responses.
Our model achieved the new state-of-the-art success rate of 85.4% on corpus-based evaluation, and a comparable success rate of 81.40% on simulator-based evaluation.
arXiv Detail & Related papers (2020-09-22T11:02:21Z)
This list is automatically generated from the titles and abstracts of the papers in this site.