FutureTOD: Teaching Future Knowledge to Pre-trained Language Model for
Task-Oriented Dialogue
- URL: http://arxiv.org/abs/2306.10315v1
- Date: Sat, 17 Jun 2023 10:40:07 GMT
- Title: FutureTOD: Teaching Future Knowledge to Pre-trained Language Model for
Task-Oriented Dialogue
- Authors: Weihao Zeng, Keqing He, Yejie Wang, Chen Zeng, Jingang Wang, Yunsen
Xian, Weiran Xu
- Abstract summary: We propose a novel dialogue pre-training model, FutureTOD, which distills future knowledge to the representation of the previous dialogue context.
Our intuition is that a good dialogue representation both learns local context information and predicts future information.
- Score: 20.79359173822053
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Pre-trained language models based on general text enable huge success in the
NLP scenario. But the intrinsical difference of linguistic patterns between
general text and task-oriented dialogues makes existing pre-trained language
models less useful in practice. Current dialogue pre-training methods rely on a
contrastive framework and face the challenges of both selecting true positives
and hard negatives. In this paper, we propose a novel dialogue pre-training
model, FutureTOD, which distills future knowledge to the representation of the
previous dialogue context using a self-training framework. Our intuition is
that a good dialogue representation both learns local context information and
predicts future information. Extensive experiments on diverse downstream
dialogue tasks demonstrate the effectiveness of our model, especially the
generalization, robustness, and learning discriminative dialogue
representations capabilities.
Related papers
- Pre-training Multi-party Dialogue Models with Latent Discourse Inference [85.9683181507206]
We pre-train a model that understands the discourse structure of multi-party dialogues, namely, to whom each utterance is replying.
To fully utilize the unlabeled data, we propose to treat the discourse structures as latent variables, then jointly infer them and pre-train the discourse-aware model.
arXiv Detail & Related papers (2023-05-24T14:06:27Z) - Acoustic Modeling for End-to-End Empathetic Dialogue Speech Synthesis
Using Linguistic and Prosodic Contexts of Dialogue History [38.65020349874135]
We propose an end-to-end empathetic dialogue speech synthesis (DSS) model.
Our model is conditioned by the history of linguistic and prosody features for predicting appropriate dialogue context.
To train the empathetic DSS model effectively, we investigate 1) a self-supervised learning model pretrained with large speech corpora, 2) a style-guided training using a prosody embedding of the current utterance to be predicted by the dialogue context embedding, 3) a cross-modal attention to combine text and speech modalities, and 4) a sentence-wise embedding to achieve fine-grained prosody modeling rather than utterance-wise modeling.
arXiv Detail & Related papers (2022-06-16T09:47:25Z) - Back to the Future: Bidirectional Information Decoupling Network for
Multi-turn Dialogue Modeling [80.51094098799736]
We propose Bidirectional Information Decoupling Network (BiDeN) as a universal dialogue encoder.
BiDeN explicitly incorporates both the past and future contexts and can be generalized to a wide range of dialogue-related tasks.
Experimental results on datasets of different downstream tasks demonstrate the universality and effectiveness of our BiDeN.
arXiv Detail & Related papers (2022-04-18T03:51:46Z) - Precognition in Task-oriented Dialogue Understanding: Posterior
Regularization by Future Context [8.59600111891194]
We propose to jointly model historical and future information through the posterior regularization method.
We optimize the KL distance between these to regularize our model during training.
Experiments on two dialogue datasets validate the effectiveness of our proposed method.
arXiv Detail & Related papers (2022-03-07T09:58:50Z) - Response Generation with Context-Aware Prompt Learning [19.340498579331555]
We present a novel approach for pre-trained dialogue modeling that casts the dialogue generation problem as a prompt-learning task.
Instead of fine-tuning on limited dialogue data, our approach, DialogPrompt, learns continuous prompt embeddings optimized for dialogue contexts.
Our approach significantly outperforms the fine-tuning baseline and the generic prompt-learning methods.
arXiv Detail & Related papers (2021-11-04T05:40:13Z) - Advances in Multi-turn Dialogue Comprehension: A Survey [51.215629336320305]
Training machines to understand natural language and interact with humans is an elusive and essential task of artificial intelligence.
This paper reviews the previous methods from the technical perspective of dialogue modeling for the dialogue comprehension task.
In addition, we categorize dialogue-related pre-training techniques which are employed to enhance PrLMs in dialogue scenarios.
arXiv Detail & Related papers (2021-10-11T03:52:37Z) - "How Robust r u?": Evaluating Task-Oriented Dialogue Systems on Spoken
Conversations [87.95711406978157]
This work presents a new benchmark on spoken task-oriented conversations.
We study multi-domain dialogue state tracking and knowledge-grounded dialogue modeling.
Our data set enables speech-based benchmarking of task-oriented dialogue systems.
arXiv Detail & Related papers (2021-09-28T04:51:04Z) - Advances in Multi-turn Dialogue Comprehension: A Survey [51.215629336320305]
We review the previous methods from the perspective of dialogue modeling.
We discuss three typical patterns of dialogue modeling that are widely-used in dialogue comprehension tasks.
arXiv Detail & Related papers (2021-03-04T15:50:17Z) - TOD-BERT: Pre-trained Natural Language Understanding for Task-Oriented
Dialogue [113.45485470103762]
In this work, we unify nine human-human and multi-turn task-oriented dialogue datasets for language modeling.
To better model dialogue behavior during pre-training, we incorporate user and system tokens into the masked language modeling.
arXiv Detail & Related papers (2020-04-15T04:09:05Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.