SUMBT+LaRL: Effective Multi-domain End-to-end Neural Task-oriented
Dialog System
- URL: http://arxiv.org/abs/2009.10447v3
- Date: Thu, 26 Aug 2021 08:55:20 GMT
- Title: SUMBT+LaRL: Effective Multi-domain End-to-end Neural Task-oriented
Dialog System
- Authors: Hwaran Lee, Seokhwan Jo, HyungJun Kim, Sangkeun Jung, Tae-Yoon Kim
- Abstract summary: We present an effective multi-domain end-to-end trainable neural dialog system SUMBT+LaRL.
Specifically, the SUMBT+ estimates user-acts as well as dialog belief states, and the LaRL models latent system action spaces and generates responses.
Our model achieved the new state-of-the-art success rate of 85.4% on corpus-based evaluation, and a comparable success rate of 81.40% on simulator-based evaluation.
- Score: 6.73550057218157
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Neural approaches to developing each dialog component in
task-oriented dialog systems have improved remarkably in recent years, yet
optimizing the overall system performance remains a challenge. Besides, previous research on
modeling complicated multi-domain goal-oriented dialogs in end-to-end fashion
has been limited. In this paper, we present an effective multi-domain
end-to-end trainable neural dialog system SUMBT+LaRL that incorporates two
previous strong models and facilitates them to be fully differentiable.
Specifically, the SUMBT+ estimates user-acts as well as dialog belief states,
and the LaRL models latent system action spaces and generates responses given
the estimated contexts. We emphasize that the three-step training framework
significantly and stably increases dialog success rates: separately pretraining
the SUMBT+ and LaRL, fine-tuning the entire system, and then reinforcement
learning of dialog policy. We also introduce new reward criteria of
reinforcement learning for dialog policy training. Then, we discuss
experimental results depending on the reward criteria and different dialog
evaluation methods. Consequently, our model achieved the new state-of-the-art
success rate of 85.4% on corpus-based evaluation, and a comparable success rate
of 81.40% on simulator-based evaluation provided by the DSTC8 challenge. To the
best of our knowledge, our work is the first comprehensive study of a modularized E2E
multi-domain dialog system that learns everything from each component to the entire
dialog policy for task success.
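The three-step training framework described in the abstract can be sketched as a simple schedule. This is a minimal illustration, not the authors' implementation: the names `Step`, `three_step_schedule`, `run_schedule`, and the module labels `sumbt_plus`, `larl`, and `dialog_policy` are hypothetical, and labeling the first two steps as "supervised" is an assumption; the abstract only states the order of the steps.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class Step:
    """One step of the schedule (hypothetical structure for illustration)."""
    name: str
    # Each job is a tuple of module names that are updated together in one run.
    jobs: Tuple[Tuple[str, ...], ...]
    # Assumed objective label; the abstract only names RL for the last step.
    objective: str


def three_step_schedule() -> List[Step]:
    """Order of training as stated in the abstract."""
    return [
        # Step 1: SUMBT+ and LaRL are pretrained separately (two independent jobs).
        Step("pretrain", (("sumbt_plus",), ("larl",)), "supervised"),
        # Step 2: the entire system is fine-tuned jointly (one job, both modules).
        Step("finetune", (("sumbt_plus", "larl"),), "supervised"),
        # Step 3: reinforcement learning of the dialog policy.
        Step("policy_rl", (("dialog_policy",),), "reinforcement"),
    ]


def run_schedule(
    steps: List[Step],
    train_job: Callable[[str, Tuple[str, ...], str], None],
) -> List[Tuple[str, Tuple[str, ...]]]:
    """Call train_job once per job, in order, and return a log of what ran."""
    log = []
    for step in steps:
        for modules in step.jobs:
            train_job(step.name, modules, step.objective)
            log.append((step.name, modules))
    return log


if __name__ == "__main__":
    # Placeholder trainer that only prints; a real one would run optimization.
    def print_job(step_name, modules, objective):
        print(f"{step_name}: update {', '.join(modules)} ({objective})")

    run_schedule(three_step_schedule(), print_job)
```

Keeping the schedule as data separates the ordering of the steps from whatever trainer is plugged in through `train_job`.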
Related papers
- DialCLIP: Empowering CLIP as Multi-Modal Dialog Retriever [83.33209603041013]
We propose a parameter-efficient prompt-tuning method named DialCLIP for multi-modal dialog retrieval.
Our approach introduces a multi-modal context generator to learn context features which are distilled into prompts within the pre-trained vision-language model CLIP.
To facilitate various types of retrieval, we also design multiple experts to learn mappings from CLIP outputs to multi-modal representation space.
arXiv Detail & Related papers (2024-01-02T07:40:12Z)
- Enhancing Large Language Model Induced Task-Oriented Dialogue Systems Through Look-Forward Motivated Goals [76.69419538047813]
The ProToD approach anticipates future dialogue actions and incorporates a goal-oriented reward signal to enhance ToD systems.
We present a novel evaluation method that assesses ToD systems based on goal-driven dialogue simulations.
Empirical experiments conducted on the MultiWoZ 2.1 dataset demonstrate that our model can achieve superior performance using only 10% of the data.
arXiv Detail & Related papers (2023-09-16T10:56:00Z)
- JoTR: A Joint Transformer and Reinforcement Learning Framework for Dialog Policy Learning [53.83063435640911]
Dialogue policy learning (DPL) is a crucial component of dialogue modelling.
We introduce a novel framework, JoTR, to generate flexible dialogue actions.
Unlike traditional methods, JoTR formulates a word-level policy that allows for a more dynamic and adaptable dialogue action generation.
arXiv Detail & Related papers (2023-09-01T03:19:53Z)
- Interactive Evaluation of Dialog Track at DSTC9 [8.2208199207543]
The Interactive Evaluation of Dialog Track was introduced at the 9th Dialog System Technology Challenge.
This paper provides an overview of the track, including the methodology and results.
arXiv Detail & Related papers (2022-07-28T22:54:04Z)
- "Think Before You Speak": Improving Multi-Action Dialog Policy by Planning Single-Action Dialogs [33.78889030078026]
Multi-action dialog policy (MADP) generates multiple atomic dialog actions per turn.
We propose Planning Enhanced Dialog Policy (PEDP), a novel multi-task learning framework that learns single-action dialog dynamics.
Our fully supervised learning-based method achieves a solid task success rate of 90.6%, a 3% improvement over state-of-the-art methods.
arXiv Detail & Related papers (2022-04-25T07:55:53Z)
- GALAXY: A Generative Pre-trained Model for Task-Oriented Dialog with Semi-Supervised Learning and Explicit Policy Injection [36.77204909711832]
We propose a novel pre-trained dialog model that explicitly learns dialog policy from limited labeled dialogs and large-scale unlabeled dialog corpora.
Specifically, we introduce a dialog act prediction task for policy optimization during pre-training and employ a consistency regularization term to refine the learned representation.
Empirical results show that GALAXY substantially improves the performance of task-oriented dialog systems.
arXiv Detail & Related papers (2021-11-29T15:24:36Z)
- UBAR: Towards Fully End-to-End Task-Oriented Dialog Systems with GPT-2 [10.994360742583261]
UBAR is acquired by fine-tuning the large pre-trained unidirectional language model GPT-2 on the sequence of the entire dialog session.
UBAR achieves state-of-the-art performances in multiple settings, improving the combined score of response generation, policy optimization, and end-to-end modeling by 4.7, 3.5, and 9.4 points respectively.
arXiv Detail & Related papers (2020-12-07T09:08:16Z)
- Modelling Hierarchical Structure between Dialogue Policy and Natural Language Generator with Option Framework for Task-oriented Dialogue System [49.39150449455407]
HDNO is an option framework for designing latent dialogue acts to avoid designing specific dialogue act representations.
We test HDNO on MultiWoz 2.0 and MultiWoz 2.1, two multi-domain dialogue datasets, comparing it with a word-level E2E model trained with RL, LaRL, and HDSA.
arXiv Detail & Related papers (2020-06-11T20:55:28Z)
- Is Your Goal-Oriented Dialog Model Performing Really Well? Empirical Analysis of System-wise Evaluation [114.48767388174218]
This paper presents an empirical analysis on different types of dialog systems composed of different modules in different settings.
Our results show that a pipeline dialog system trained using fine-grained supervision signals at different component levels often obtains better performance than the systems that use joint or end-to-end models trained on coarse-grained labels.
arXiv Detail & Related papers (2020-05-15T05:20:06Z)
- Recent Advances and Challenges in Task-oriented Dialog System [63.82055978899631]
Task-oriented dialog systems are attracting more and more attention in academic and industrial communities.
We discuss three critical topics for task-oriented dialog systems: (1) improving data efficiency to facilitate dialog modeling in low-resource settings, (2) modeling multi-turn dynamics for dialog policy learning, and (3) integrating domain knowledge into the dialog model.
arXiv Detail & Related papers (2020-03-17T01:34:56Z)