TOD-BERT: Pre-trained Natural Language Understanding for Task-Oriented Dialogue
- URL: http://arxiv.org/abs/2004.06871v3
- Date: Thu, 1 Oct 2020 16:34:52 GMT
- Title: TOD-BERT: Pre-trained Natural Language Understanding for Task-Oriented Dialogue
- Authors: Chien-Sheng Wu, Steven Hoi, Richard Socher, and Caiming Xiong
- Abstract summary: In this work, we unify nine human-human and multi-turn task-oriented dialogue datasets for language modeling.
To better model dialogue behavior during pre-training, we incorporate user and system tokens into the masked language modeling.
- Score: 113.45485470103762
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The underlying difference in linguistic patterns between general text and task-oriented dialogue makes existing pre-trained language models less useful in practice. In this work, we unify nine human-human and multi-turn task-oriented dialogue datasets for language modeling. To better model dialogue behavior during pre-training, we incorporate user and system tokens into the masked language modeling. We propose a contrastive objective function to simulate the response selection task. Our pre-trained task-oriented dialogue BERT (TOD-BERT) outperforms strong baselines like BERT on four downstream task-oriented dialogue applications, including intention recognition, dialogue state tracking, dialogue act prediction, and response selection. We also show that TOD-BERT has stronger few-shot ability, which can mitigate the data scarcity problem for task-oriented dialogue.
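As a concrete illustration of the two pre-training signals named in the abstract, here is a minimal sketch: every utterance is prefixed with a speaker token (the [USR]/[SYS] names follow the paper's convention) so the masked language modeling sees dialogue structure, and a response contrastive loss is computed with in-batch negatives. The BERT checkpoint, the flatten helper, and the toy batch are illustrative assumptions, not the released TOD-BERT code.

```python
# Minimal sketch of the two TOD-BERT pre-training signals: speaker-tagged
# masked-LM input plus a response contrastive loss with in-batch negatives.
import torch
import torch.nn.functional as F
from transformers import BertModel, BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
# Prefix every utterance with a speaker token; the same tagged sequences
# would feed a standard masked-LM head (omitted here for brevity).
tokenizer.add_special_tokens({"additional_special_tokens": ["[USR]", "[SYS]"]})
encoder = BertModel.from_pretrained("bert-base-uncased")
encoder.resize_token_embeddings(len(tokenizer))

def flatten(history):
    """Turn [(speaker, utterance), ...] into one speaker-tagged string."""
    return " ".join(f"[{spk}] {utt}" for spk, utt in history)

def response_contrastive_loss(contexts, responses):
    """In-batch negatives: the i-th context must select the i-th response."""
    ctx = tokenizer(contexts, padding=True, truncation=True, return_tensors="pt")
    rsp = tokenizer(responses, padding=True, truncation=True, return_tensors="pt")
    ctx_vec = encoder(**ctx).last_hidden_state[:, 0]  # [CLS] of each context
    rsp_vec = encoder(**rsp).last_hidden_state[:, 0]  # [CLS] of each response
    scores = ctx_vec @ rsp_vec.T                      # (batch, batch) similarities
    labels = torch.arange(scores.size(0))             # diagonal = gold pairs
    return F.cross_entropy(scores, labels)

contexts = [
    flatten([("USR", "i need a cheap hotel in the north")]),
    flatten([("USR", "find an italian restaurant"), ("SYS", "which area?")]),
]
responses = ["[SYS] lovell lodge is a cheap hotel in the north",
             "[USR] the city centre please"]
loss = response_contrastive_loss(contexts, responses)
```

With in-batch negatives, each context is trained to score its own gold response above every other response in the batch, which is how the contrastive objective simulates the response selection task.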
Related papers
- BootTOD: Bootstrap Task-oriented Dialogue Representations by Aligning Diverse Responses [24.79881150845294]
We propose a novel dialogue pre-training model called BootTOD.
It learns task-oriented dialogue representations via a self-bootstrapping framework.
BootTOD outperforms strong TOD baselines on diverse downstream dialogue tasks.
arXiv Detail & Related papers (2024-03-02T10:34:11Z)
- FutureTOD: Teaching Future Knowledge to Pre-trained Language Model for Task-Oriented Dialogue [20.79359173822053]
We propose a novel dialogue pre-training model, FutureTOD, which distills future knowledge to the representation of the previous dialogue context.
Our intuition is that a good dialogue representation both learns local context information and predicts future information.
arXiv Detail & Related papers (2023-06-17T10:40:07Z)
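The FutureTOD summary is compact, so here is one plausible reading as a sketch: a student encoder sees only the past turns, a frozen copy acting as teacher encodes the past plus the future turns, and the student's representation is pulled toward the teacher's. The BERT backbone, the MSE loss, and the frozen-copy teacher are assumptions for illustration, not the paper's exact recipe.

```python
# Sketch of distilling future knowledge into the context representation:
# the student encodes only the past; the teacher's target also encodes the
# future turns, so matching it injects future information into the context.
import copy
import torch.nn.functional as F
from transformers import BertModel, BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
student = BertModel.from_pretrained("bert-base-uncased")
teacher = copy.deepcopy(student)  # e.g. refreshed from the student periodically
for p in teacher.parameters():
    p.requires_grad = False       # the teacher only provides targets

def cls_vector(model, text):
    batch = tokenizer(text, truncation=True, return_tensors="pt")
    return model(**batch).last_hidden_state[:, 0]  # [CLS] representation

context = "user: book a table for two system: which day would you like?"
future = context + " user: friday at 7pm system: done, a table for two on friday."

loss = F.mse_loss(cls_vector(student, context),
                  cls_vector(teacher, future))
loss.backward()
```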
- Adapting Task-Oriented Dialogue Models for Email Conversations [4.45709593827781]
In this paper, we provide an effective transfer learning framework (EMToD) that allows the latest developments in dialogue models to be adapted for long-form conversations.
We show that the proposed EMToD framework improves intent detection performance over pre-trained language models by 45% and over pre-trained dialogue models by 30% for task-oriented email conversations.
arXiv Detail & Related papers (2022-08-19T16:41:34Z)
- Improving Zero and Few-shot Generalization in Dialogue through Instruction Tuning [27.92734269206744]
InstructDial is an instruction tuning framework for dialogue.
It consists of a repository of 48 diverse dialogue tasks in a unified text-to-text format created from 59 openly available dialogue datasets.
Our analysis reveals that InstructDial enables good zero-shot performance on unseen datasets and tasks such as dialogue evaluation and intent detection, and even better performance in a few-shot setting.
arXiv Detail & Related papers (2022-05-25T11:37:06Z)
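To make the "unified text-to-text format" in the InstructDial summary concrete, here is a hypothetical sketch of casting intent detection as an instruction-following example; the template wording and field names are my own, not the framework's.

```python
# Hypothetical illustration: an intent-detection instance rewritten as an
# instruction-following text-to-text example, one of many task formats that
# an instruction-tuning framework for dialogue would unify.
def to_instruction_example(utterance, options, answer):
    prompt = (
        "Instruction: Read the dialogue utterance and pick the user's intent.\n"
        f"Utterance: {utterance}\n"
        f"Options: {', '.join(options)}\n"
        "Answer:"
    )
    return {"input": prompt, "target": answer}

example = to_instruction_example(
    "can you find me a cheap place to stay?",
    ["book_hotel", "find_hotel", "find_restaurant"],
    "find_hotel",
)
```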
- KETOD: Knowledge-Enriched Task-Oriented Dialogue [77.59814785157877]
Existing studies in dialogue system research mostly treat task-oriented dialogue and chit-chat as separate domains.
We investigate how task-oriented dialogue and knowledge-grounded chit-chat can be effectively integrated into a single model.
arXiv Detail & Related papers (2022-05-11T16:01:03Z)
- Back to the Future: Bidirectional Information Decoupling Network for Multi-turn Dialogue Modeling [80.51094098799736]
We propose Bidirectional Information Decoupling Network (BiDeN) as a universal dialogue encoder.
BiDeN explicitly incorporates both the past and future contexts and can be generalized to a wide range of dialogue-related tasks.
Experimental results on datasets of different downstream tasks demonstrate the universality and effectiveness of our BiDeN.
arXiv Detail & Related papers (2022-04-18T03:51:46Z)
- TOD-DA: Towards Boosting the Robustness of Task-oriented Dialogue Modeling on Spoken Conversations [24.245354500835465]
We propose a novel model-agnostic data augmentation paradigm to boost the robustness of task-oriented dialogue modeling on spoken conversations.
Our approach ranked first in both tasks of DSTC10 Track2, a benchmark for task-oriented dialogue modeling on spoken conversations.
arXiv Detail & Related papers (2021-12-23T10:04:25Z)
- UniDS: A Unified Dialogue System for Chit-Chat and Task-oriented Dialogues [59.499965460525694]
We propose a unified dialogue system (UniDS) that handles both chit-chat and task-oriented dialogues.
We design a unified dialogue data schema compatible with both chit-chat and task-oriented dialogues.
We train UniDS on mixed dialogue data, starting from a pretrained chit-chat dialogue model.
arXiv Detail & Related papers (2021-10-15T11:56:47Z)
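As a hypothetical illustration of the unified data schema in the UniDS summary, both dialogue types can share one record layout, with task-oriented fields simply left empty for chit-chat turns. The field names and example values below are my own, not the paper's.

```python
# Hypothetical unified schema: chit-chat and task-oriented turns share the
# same fields, so one model can be trained on mixed dialogue data.
task_oriented_turn = {
    "context": "user: i need a train to cambridge on friday",
    "belief_state": "train destination=cambridge day=friday",
    "db_result": "3 matches",
    "system_act": "inform choice, request leave-time",
    "response": "there are 3 trains. when would you like to leave?",
}
chitchat_turn = {
    "context": "user: i love rainy days",
    "belief_state": "",  # no task state to track
    "db_result": "",     # no database involved
    "system_act": "",    # no structured act
    "response": "me too, perfect weather for a book and some tea.",
}
```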
- "How Robust r u?": Evaluating Task-Oriented Dialogue Systems on Spoken Conversations [87.95711406978157]
This work presents a new benchmark on spoken task-oriented conversations.
We study multi-domain dialogue state tracking and knowledge-grounded dialogue modeling.
Our data set enables speech-based benchmarking of task-oriented dialogue systems.
arXiv Detail & Related papers (2021-09-28T04:51:04Z)
- Structural Pre-training for Dialogue Comprehension [51.215629336320305]
We present SPIDER, Structural Pre-traIned DialoguE Reader, to capture dialogue-exclusive features.
To capture these dialogue-like features, we propose two training objectives in addition to the original LM objectives.
Experimental results on widely used dialogue benchmarks verify the effectiveness of the newly introduced self-supervised tasks.
arXiv Detail & Related papers (2021-05-23T15:16:54Z)