Conversations Are Not Flat: Modeling the Dynamic Information Flow across
Dialogue Utterances
- URL: http://arxiv.org/abs/2106.02227v1
- Date: Fri, 4 Jun 2021 03:04:06 GMT
- Title: Conversations Are Not Flat: Modeling the Dynamic Information Flow across
Dialogue Utterances
- Authors: Zekang Li, Jinchao Zhang, Zhengcong Fei, Yang Feng, Jie Zhou
- Abstract summary: Open-domain dialogue models can generate acceptable responses according to the historical context.
We propose the DialoFlow model, in which we introduce a dynamic flow mechanism to model the context flow.
Code and pre-trained models will be public.
- Score: 28.255324166852535
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Nowadays, open-domain dialogue models built on large-scale pre-trained
language models can generate acceptable responses according to the dialogue
history. However, they generally concatenate the dialogue history directly as
the model input to predict the response, a practice we call the flat pattern,
which ignores the dynamic information flow across dialogue utterances. In this
work, we propose the DialoFlow model, in which we introduce a dynamic flow
mechanism to model the context flow, and we design three training objectives
that capture the information dynamics across dialogue utterances by addressing
the semantic influence brought about by each utterance during large-scale
pre-training. Experiments on the multi-reference Reddit Dataset and the
DailyDialog Dataset demonstrate that DialoFlow significantly outperforms
DialoGPT on the dialogue generation task. In addition, we propose the Flow
score, an effective automatic metric for evaluating interactive human-bot
conversation quality based on the pre-trained DialoFlow; it shows a high
chatbot-level correlation ($r=0.9$) with human ratings across 11 chatbots.
Code and pre-trained models will be public.
Code: https://github.com/ictnlp/DialoFlow
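The abstract only hints at how the flow mechanism and the Flow score work. Purely as a reading aid, below is a minimal PyTorch sketch of one plausible interpretation: one context state per utterance, "semantic influence" taken as the change between consecutive states, and a score that compares predicted against realized influence. The GRU flow module, the linear influence head, and the cosine-based scoring are illustrative assumptions, not the paper's actual architecture; consult the linked repository for the real implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class DynamicFlowSketch(nn.Module):
    """Toy stand-in for a dynamic flow mechanism (illustrative only).

    The real DialoFlow sits on top of a large pre-trained Transformer;
    here we assume we are already given one dense context state per
    utterance boundary.
    """

    def __init__(self, hidden_size: int = 768):
        super().__init__()
        # A GRU stands in for the flow module over context states.
        self.flow = nn.GRU(hidden_size, hidden_size, batch_first=True)
        # Hypothetical head that predicts the next utterance's influence.
        self.influence_head = nn.Linear(hidden_size, hidden_size)

    def forward(self, utterance_states: torch.Tensor):
        # utterance_states: (batch, num_utterances, hidden_size)
        context_flow, _ = self.flow(utterance_states)
        # "Semantic influence" of utterance k, read here as the change it
        # causes in the context state once it is observed.
        influence = context_flow[:, 1:] - context_flow[:, :-1]
        # Predict the upcoming influence from the context so far.
        predicted = self.influence_head(context_flow[:, :-1])
        return influence, predicted


def flow_style_score(influence: torch.Tensor,
                     predicted: torch.Tensor) -> torch.Tensor:
    # Hedged stand-in for the Flow score: a turn scores well when its
    # realized semantic influence matches the model's prediction.
    sim = F.cosine_similarity(influence, predicted, dim=-1)  # (batch, T-1)
    return sim.mean(dim=-1)  # one quality score per dialogue


if __name__ == "__main__":
    states = torch.randn(2, 5, 768)  # 2 dialogues, 5 utterances each
    influence, predicted = DynamicFlowSketch()(states)
    print(flow_style_score(influence, predicted).shape)  # torch.Size([2])
```

Under this reading, a human-bot conversation in which each bot turn moves the context in the direction the flow model anticipates receives a high score, which is one way a reference-free metric could correlate with human ratings at the chatbot level.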
Related papers
- TOD-Flow: Modeling the Structure of Task-Oriented Dialogues [77.15457469745364]
We propose a novel approach focusing on inferring the TOD-Flow graph from dialogue data annotated with dialog acts.
The inferred TOD-Flow graph can be easily integrated with any dialogue model to improve its prediction performance, transparency, and controllability.
arXiv Detail & Related papers (2023-12-07T20:06:23Z)
- Pre-training Multi-party Dialogue Models with Latent Discourse Inference [85.9683181507206]
We pre-train a model that understands the discourse structure of multi-party dialogues, namely, to whom each utterance is replying.
To fully utilize the unlabeled data, we propose to treat the discourse structures as latent variables, then jointly infer them and pre-train the discourse-aware model.
arXiv Detail & Related papers (2023-05-24T14:06:27Z)
- Weakly Supervised Data Augmentation Through Prompting for Dialogue Understanding [103.94325597273316]
We present a novel approach that iterates on augmentation quality by applying weakly-supervised filters.
We evaluate our methods on the emotion and act classification tasks in DailyDialog and the intent classification task in Facebook Multilingual Task-Oriented Dialogue.
For DailyDialog specifically, using 10% of the ground-truth data, we outperform the current state-of-the-art model, which uses 100% of the data.
arXiv Detail & Related papers (2022-10-25T17:01:30Z)
- GODEL: Large-Scale Pre-Training for Goal-Directed Dialog [119.1397031992088]
We introduce GODEL, a large pre-trained language model for dialog.
We show that GODEL outperforms state-of-the-art pre-trained dialog models in few-shot fine-tuning setups.
A novel feature of our evaluation methodology is the introduction of a notion of utility that assesses the usefulness of responses.
arXiv Detail & Related papers (2022-06-22T18:19:32Z)
- Precognition in Task-oriented Dialogue Understanding: Posterior Regularization by Future Context [8.59600111891194]
We propose to jointly model historical and future information through the posterior regularization method.
We optimize the KL distance between the two distributions to regularize our model during training.
Experiments on two dialogue datasets validate the effectiveness of our proposed method.
arXiv Detail & Related papers (2022-03-07T09:58:50Z)
- Multi-Referenced Training for Dialogue Response Generation [36.24321477524634]
We show that the gap between the real-world probability distribution and the single-referenced data's probability distribution prevents the model from learning one-to-many relations efficiently.
We generate diverse pseudo references from a powerful pre-trained model to build multi-referenced data that provides a better approximation of the real-world distribution.
arXiv Detail & Related papers (2020-09-15T14:17:53Z)
- Learning an Unreferenced Metric for Online Dialogue Evaluation [53.38078951628143]
We propose an unreferenced automated evaluation metric that uses large pre-trained language models to extract latent representations of utterances.
We show that our model achieves higher correlation with human annotations in an online setting, while not requiring true responses for comparison during inference.
arXiv Detail & Related papers (2020-05-01T20:01:39Z)
- An Empirical Investigation of Pre-Trained Transformer Language Models for Open-Domain Dialogue Generation [23.343006562849126]
We present an empirical investigation of pre-trained Transformer-based auto-regressive language models for the task of open-domain dialogue generation.
The pre-training and fine-tuning paradigm is employed for learning.
Experiments are conducted on typical single-turn and multi-turn dialogue corpora such as Weibo, Douban, Reddit, DailyDialog, and Persona-Chat.
arXiv Detail & Related papers (2020-03-09T15:20:21Z)
- Variational Hierarchical Dialog Autoencoder for Dialog State Tracking Data Augmentation [59.174903564894954]
In this work, we extend this approach to the task of dialog state tracking for goal-oriented dialogs.
We propose the Variational Hierarchical Dialog Autoencoder (VHDA) for modeling the complete aspects of goal-oriented dialogs.
Experiments on various dialog datasets show that our model improves the downstream dialog trackers' robustness via generative data augmentation.
arXiv Detail & Related papers (2020-01-23T15:34:56Z)
This list is automatically generated from the titles and abstracts of the papers on this site.