MinTL: Minimalist Transfer Learning for Task-Oriented Dialogue Systems
- URL: http://arxiv.org/abs/2009.12005v2
- Date: Mon, 28 Sep 2020 06:43:17 GMT
- Title: MinTL: Minimalist Transfer Learning for Task-Oriented Dialogue Systems
- Authors: Zhaojiang Lin, Andrea Madotto, Genta Indra Winata, Pascale Fung
- Abstract summary: We propose Minimalist Transfer Learning (MinTL) to simplify the system design process of task-oriented dialogue systems.
MinTL is a simple yet effective transfer learning framework, which allows us to plug-and-play pre-trained seq2seq models.
We instantiate our learning framework with two pre-trained backbones: T5 and BART, and evaluate them on MultiWOZ.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we propose Minimalist Transfer Learning (MinTL) to simplify
the system design process of task-oriented dialogue systems and alleviate the
over-dependency on annotated data. MinTL is a simple yet effective transfer
learning framework, which allows us to plug-and-play pre-trained seq2seq
models, and jointly learn dialogue state tracking and dialogue response
generation. Unlike previous approaches, which use a copy mechanism to
"carry over" the old dialogue states to the new ones, we introduce Levenshtein
belief spans (Lev), which allow efficient dialogue state tracking with a
minimal generation length. We instantiate our learning framework with two
pre-trained backbones: T5 and BART, and evaluate them on MultiWOZ. Extensive
experiments demonstrate that: 1) our systems establish new state-of-the-art
results on end-to-end response generation, 2) MinTL-based systems are more
robust than baseline methods in the low resource setting, and they achieve
competitive results with only 20% of the training data, and 3) Lev greatly improves
the inference efficiency.
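The Lev idea described above can be illustrated with a minimal sketch: instead of decoding the full belief state at every turn, the model decodes only a short edit span, which is then applied to the previous state. The marker name and function below are hypothetical illustrations under that reading, not the authors' implementation.

```python
# Illustrative sketch of the Levenshtein belief span (Lev) idea: the model
# emits only the edits (slots to insert, overwrite, or delete), and the full
# dialogue state is reconstructed by applying them to the previous state.
# All names here are hypothetical, not the paper's actual code.

DELETE = "<del>"  # hypothetical marker for a slot the model wants to remove

def apply_lev(prev_state: dict, lev_edits: dict) -> dict:
    """Apply a minimal edit span to the previous belief state."""
    new_state = dict(prev_state)
    for slot, value in lev_edits.items():
        if value == DELETE:
            new_state.pop(slot, None)  # drop the slot from the new state
        else:
            new_state[slot] = value    # insert or overwrite the slot value
    return new_state

prev = {"hotel-area": "centre", "hotel-stars": "4"}
# The seq2seq model decodes a short edit instead of the whole state:
edits = {"hotel-stars": DELETE, "hotel-price": "cheap"}
print(apply_lev(prev, edits))
# → {'hotel-area': 'centre', 'hotel-price': 'cheap'}
```

Because the decoded edit span is typically much shorter than the full state, generation length per turn stays small, which is the source of the inference-efficiency gain the abstract reports.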
Related papers
- Stabilized In-Context Learning with Pre-trained Language Models for Few-Shot Dialogue State Tracking (2023-02-12)
  Large pre-trained language models (PLMs) have shown impressive unaided performance across many NLP tasks. For more complex tasks such as dialogue state tracking (DST), designing prompts that reliably convey the desired intent is nontrivial. We introduce a saliency model to limit dialogue text length, allowing us to include more exemplars per query.
- Quick Starting Dialog Systems with Paraphrase Generation (2022-04-06)
  We propose a method to reduce the cost and effort of creating new conversational agents by artificially generating more data from existing examples. Our approach can kick-start a dialog system with little human effort and brings its performance to a level satisfactory enough to allow actual interactions with real end-users.
- In-Context Learning for Few-Shot Dialogue State Tracking (2022-03-16)
  We propose an in-context (IC) learning framework for few-shot dialogue state tracking (DST). A large pre-trained language model (LM) takes a test instance and a few annotated examples as input, and directly decodes the dialogue states without any parameter updates. This makes the LM more flexible and scalable than prior few-shot DST work when adapting to new domains and scenarios.
- Few-Shot Bot: Prompt-Based Learning for Dialogue Systems (2021-10-15)
  Learning to converse using only a few examples is a great challenge in conversational AI. The current best conversational models are either good chit-chatters (e.g., BlenderBot) or goal-oriented systems (e.g., MinTL). We propose prompt-based few-shot learning, which does not require gradient-based fine-tuning but instead uses a few examples as the only source of learning.
- Self-training Improves Pre-training for Few-shot Learning in Task-oriented Dialog Systems (2021-08-28)
  Large-scale pre-trained language models have shown promising results for few-shot learning in task-oriented dialog (ToD). We propose a self-training approach that iteratively labels the most confident unlabeled data to train a stronger Student model. We conduct experiments and present analyses on four downstream ToD tasks: intent classification, dialog state tracking, dialog act prediction, and response selection.
- Transferable Dialogue Systems and User Simulators (2021-07-25)
  One of the difficulties in training dialogue systems is the lack of training data. We explore the possibility of creating dialogue data through interaction between a dialogue system and a user simulator, and develop a modelling framework that can incorporate new dialogue scenarios through self-play between the two agents.
- An Empirical Study of Cross-Lingual Transferability in Generative Dialogue State Tracker (2021-01-27)
  We study the transferability of a cross-lingual generative dialogue state tracking system using a multilingual pre-trained seq2seq model. We also observe low cross-lingual transferability of our approaches and provide investigation and discussion.
- SOLOIST: Building Task Bots at Scale with Transfer Learning and Machine Teaching (2020-05-11)
  We parameterize modular task-oriented dialog systems using a Transformer-based auto-regressive language model and pre-train a task-grounded response generation model on heterogeneous dialog corpora. Experiments show that SOLOIST establishes new state-of-the-art results on well-studied task-oriented dialog benchmarks.
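The self-training recipe summarized in "Self-training Improves Pre-training for Few-shot Learning in Task-oriented Dialog Systems" above (iteratively pseudo-labeling the most confident unlabeled examples to train a stronger Student) can be sketched roughly as follows. The model interface and confidence threshold are illustrative assumptions, not that paper's implementation.

```python
# Rough sketch of confidence-based self-training for a ToD classifier.
# `train` and `predict_proba` are hypothetical stand-ins for any model API.

def self_train(labeled, unlabeled, train, predict_proba,
               rounds=3, threshold=0.9):
    """Iteratively move the most confident pseudo-labeled examples
    from the unlabeled pool into the training set."""
    pool = list(unlabeled)
    data = list(labeled)
    model = train(data)
    for _ in range(rounds):
        confident, rest = [], []
        for x in pool:
            probs = predict_proba(model, x)      # per-class probabilities
            label = max(probs, key=probs.get)
            if probs[label] >= threshold:
                confident.append((x, label))     # keep the pseudo-label
            else:
                rest.append(x)
        if not confident:
            break                                # nothing confident left
        data.extend(confident)
        pool = rest
        model = train(data)                      # retrain a stronger Student
    return model, data
```

The threshold trades off pseudo-label noise against how much unlabeled data gets used: a high threshold admits fewer but cleaner examples per round.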
This list is automatically generated from the titles and abstracts of the papers in this site.