Learning from Relevant Subgoals in Successful Dialogs using Iterative Training for Task-oriented Dialog Systems
- URL: http://arxiv.org/abs/2411.16305v1
- Date: Mon, 25 Nov 2024 11:47:31 GMT
- Title: Learning from Relevant Subgoals in Successful Dialogs using Iterative Training for Task-oriented Dialog Systems
- Authors: Magdalena Kaiser, Patrick Ernst, György Szarvas
- Abstract summary: Task-oriented Dialog (ToD) systems have to solve multiple subgoals to accomplish user goals, whereas feedback is often obtained only at the end of the dialog.
We propose SUIT (SUbgoal-aware ITerative Training), an iterative training approach for improving ToD systems.
- Score: 6.226609932118126
- License:
- Abstract: Task-oriented Dialog (ToD) systems have to solve multiple subgoals to accomplish user goals, whereas feedback is often obtained only at the end of the dialog. In this work, we propose SUIT (SUbgoal-aware ITerative Training), an iterative training approach for improving ToD systems. We sample dialogs from the model we aim to improve and determine subgoals that contribute to dialog success using distant supervision to obtain high quality training samples. We show how this data improves supervised fine-tuning or, alternatively, preference learning results. SUIT is able to iteratively generate more data instead of relying on fixed static sets. SUIT reaches new state-of-the-art performance on a popular ToD benchmark.
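The abstract describes an iterative loop: sample dialogs from the current model, use distant supervision to identify subgoals that contributed to dialog success, keep those as high-quality training samples, and retrain. A minimal sketch of that loop is below; all names (`Dialog`, `subgoal_contributes`, `suit_iteration`) and the slot-filling heuristic are hypothetical stand-ins, not the authors' implementation.

```python
# Illustrative sketch of a SUIT-style iteration, assuming a toy dialog
# representation and a toy distant-supervision heuristic.
from dataclasses import dataclass


@dataclass
class Dialog:
    turns: list          # (user_utterance, system_response) pairs
    success: bool        # feedback is only available at the end of the dialog


def subgoal_contributes(turn, dialog):
    """Distant-supervision proxy (hypothetical): a turn counts as a
    relevant subgoal if the dialog succeeded and the system response
    fills a slot, e.g. "area=centre"."""
    user, system = turn
    return dialog.success and "=" in system


def suit_iteration(train_data, sampled_dialogs):
    """One iteration: mine subgoal-level samples from successful dialogs
    and append them to the training pool. In the paper these samples
    would feed supervised fine-tuning or preference learning."""
    new_samples = [
        turn
        for d in sampled_dialogs
        for turn in d.turns
        if subgoal_contributes(turn, d)
    ]
    return train_data + new_samples


dialogs = [
    Dialog(turns=[("book a hotel", "area=centre"),
                  ("thanks", "goodbye")], success=True),
    Dialog(turns=[("find a train", "sorry, no result")], success=False),
]
data = suit_iteration([], dialogs)
print(len(data))  # only the slot-filling turn of the successful dialog survives
```

Because SUIT regenerates dialogs from the improved model each round, the training pool grows across iterations instead of being a fixed static set.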
Related papers
- Improving Multi-Domain Task-Oriented Dialogue System with Offline Reinforcement Learning [5.18578230548982]
A task-oriented dialogue (TOD) system is designed to accomplish user-defined tasks through dialogues.
We propose a TOD system that leverages a unified pre-trained language model, GPT2, as a base model.
We show that our model increases the inform rate by 1.60% and the success rate by 3.17% compared to the baseline.
arXiv Detail & Related papers (2024-11-08T05:43:40Z)
- Enhancing Large Language Model Induced Task-Oriented Dialogue Systems Through Look-Forward Motivated Goals [76.69419538047813]
ProToD approach anticipates the future dialogue actions and incorporates the goal-oriented reward signal to enhance ToD systems.
We present a novel evaluation method that assesses ToD systems based on goal-driven dialogue simulations.
Empirical experiments conducted on the MultiWOZ 2.1 dataset demonstrate that our model can achieve superior performance using only 10% of the data.
arXiv Detail & Related papers (2023-09-16T10:56:00Z)
- Discovering Customer-Service Dialog System with Semi-Supervised Learning and Coarse-to-Fine Intent Detection [6.869753194843482]
Task-oriented dialog aims to assist users in achieving specific goals through multi-turn conversation.
We constructed a weakly supervised dataset based on a teacher/student paradigm.
We also built a modular dialogue system and integrated coarse-to-fine grained classification for user intent detection.
arXiv Detail & Related papers (2022-12-23T14:36:43Z)
- Is MultiWOZ a Solved Task? An Interactive TOD Evaluation Framework with User Simulator [37.590563896382456]
We propose an interactive evaluation framework for Task-Oriented Dialogue (TOD) systems.
We first build a goal-oriented user simulator based on pre-trained models and then use the user simulator to interact with the dialogue system to generate dialogues.
Experimental results show that RL-based TOD systems trained by our proposed user simulator can achieve nearly 98% inform and success rates.
arXiv Detail & Related papers (2022-10-26T07:41:32Z)
- DialogZoo: Large-Scale Dialog-Oriented Task Learning [52.18193690394549]
We aim to build a unified foundation model which can solve massive diverse dialogue tasks.
To achieve this goal, we first collect a large-scale well-labeled dialogue dataset from 73 publicly available datasets.
arXiv Detail & Related papers (2022-05-25T11:17:16Z)
- Self-training Improves Pre-training for Few-shot Learning in Task-oriented Dialog Systems [47.937191088981436]
Large-scale pre-trained language models have shown promising results for few-shot learning in ToD.
We propose a self-training approach that iteratively labels the most confident unlabeled data to train a stronger Student model.
We conduct experiments and present analyses on four downstream tasks in ToD, including intent classification, dialog state tracking, dialog act prediction, and response selection.
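The self-training recipe above can be sketched as a simple loop: the current model pseudo-labels the unlabeled pool, only the most confident predictions are promoted to the training set, and the process repeats. The scoring function and confidence threshold below are illustrative assumptions, not details from the paper.

```python
# Minimal self-training sketch, assuming a toy confidence-scoring model.
def predict_with_confidence(example):
    """Hypothetical stand-in for the Student model: returns (label, confidence)
    for a toy intent-classification task."""
    label = "request" if "?" in example else "inform"
    confidence = 0.9 if "?" in example else 0.6
    return label, confidence


def self_train(labeled, unlabeled, threshold=0.8, iterations=2):
    labeled = list(labeled)
    pool = list(unlabeled)
    for _ in range(iterations):
        confident, remaining = [], []
        for x in pool:
            label, conf = predict_with_confidence(x)
            (confident if conf >= threshold else remaining).append((x, label))
        labeled.extend(confident)          # promote confident pseudo-labels
        pool = [x for x, _ in remaining]   # keep low-confidence data unlabeled
        # A real system would retrain the Student on `labeled` here.
    return labeled, pool


labeled, leftover = self_train(
    labeled=[("book a table", "inform")],
    unlabeled=["what time is it?", "two people please"],
)
print(len(labeled), len(leftover))  # prints "2 1"
```

The key design choice is the confidence threshold: too low and noisy pseudo-labels pollute training, too high and the pool never shrinks.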
arXiv Detail & Related papers (2021-08-28T07:22:06Z)
- RADDLE: An Evaluation Benchmark and Analysis Platform for Robust Task-oriented Dialog Systems [75.87418236410296]
We introduce the RADDLE benchmark, a collection of corpora and tools for evaluating the performance of models across a diverse set of domains.
RADDLE is designed to favor and encourage models with a strong generalization ability.
We evaluate recent state-of-the-art systems based on pre-training and fine-tuning, and find that grounded pre-training on heterogeneous dialog corpora performs better than training a separate model per domain.
arXiv Detail & Related papers (2020-12-29T08:58:49Z)
- Rethinking Supervised Learning and Reinforcement Learning in Task-Oriented Dialogue Systems [58.724629408229205]
We demonstrate how traditional supervised learning and a simulator-free adversarial learning method can be used to achieve performance comparable to state-of-the-art RL-based methods.
Our main goal is not to beat reinforcement learning with supervised learning, but to demonstrate the value of rethinking the role of reinforcement learning and supervised learning in optimizing task-oriented dialogue systems.
arXiv Detail & Related papers (2020-09-21T12:04:18Z)
- SOLOIST: Building Task Bots at Scale with Transfer Learning and Machine Teaching [81.45928589522032]
We parameterize modular task-oriented dialog systems using a Transformer-based auto-regressive language model.
We pre-train, on heterogeneous dialog corpora, a task-grounded response generation model.
Experiments show that SOLOIST creates new state-of-the-art on well-studied task-oriented dialog benchmarks.
arXiv Detail & Related papers (2020-05-11T17:58:34Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences.