Related papers: DialogBERT: Discourse-Aware Response Generation via Learning to Recover and Rank Utterances

DialogBERT: Discourse-Aware Response Generation via Learning to Recover and Rank Utterances

URL: http://arxiv.org/abs/2012.01775v1
Date: Thu, 3 Dec 2020 09:06:23 GMT
Title: DialogBERT: Discourse-Aware Response Generation via Learning to Recover and Rank Utterances
Authors: Xiaodong Gu, Kang Min Yoo, Jung-Woo Ha
Abstract summary: This paper presents DialogBERT, a novel conversational response generation model that enhances previous PLM-based dialogue models. To efficiently capture the discourse-level coherence among utterances, we propose two training objectives, including masked utterance regression. Experiments on three multi-turn conversation datasets show that our approach remarkably outperforms the baselines.
Score: 18.199473005335093
License: http://creativecommons.org/licenses/by-nc-sa/4.0/
Abstract: Recent advances in pre-trained language models have significantly improved neural response generation. However, existing methods usually view the dialogue context as a linear sequence of tokens and learn to generate the next word through token-level self-attention. Such token-level encoding hinders the exploration of discourse-level coherence among utterances. This paper presents DialogBERT, a novel conversational response generation model that enhances previous PLM-based dialogue models. DialogBERT employs a hierarchical Transformer architecture. To efficiently capture the discourse-level coherence among utterances, we propose two training objectives, including masked utterance regression and distributed utterance order ranking in analogy to the original BERT training. Experiments on three multi-turn conversation datasets show that our approach remarkably outperforms the baselines, such as BART and DialoGPT, in terms of quantitative evaluation. The human evaluation suggests that DialogBERT generates more coherent, informative, and human-like responses than the baselines with significant margins.

Related papers

Score Before You Speak: Improving Persona Consistency in Dialogue Generation using Response Quality Scores [2.150144047598779]
Persona-based dialogue generation is an important milestone towards building conversational artificial intelligence.<n>We propose a novel framework SBS (Score-Before-Speaking), which outperforms previous methods.<n>We show that score-conditioned training allows existing models to better capture a spectrum of persona-consistent dialogues.
arXiv Detail & Related papers (2025-08-09T08:30:06Z)
Pre-training Multi-party Dialogue Models with Latent Discourse Inference [85.9683181507206]
We pre-train a model that understands the discourse structure of multi-party dialogues, namely, to whom each utterance is replying. To fully utilize the unlabeled data, we propose to treat the discourse structures as latent variables, then jointly infer them and pre-train the discourse-aware model.
arXiv Detail & Related papers (2023-05-24T14:06:27Z)
EM Pre-training for Multi-party Dialogue Response Generation [86.25289241604199]
In multi-party dialogues, the addressee of a response utterance should be specified before it is generated. We propose an Expectation-Maximization (EM) approach that iteratively performs the expectation steps to generate addressee labels.
arXiv Detail & Related papers (2023-05-21T09:22:41Z)
Channel-aware Decoupling Network for Multi-turn Dialogue Comprehension [81.47133615169203]
We propose compositional learning for holistic interaction across utterances beyond the sequential contextualization from PrLMs. We employ domain-adaptive training strategies to help the model adapt to the dialogue domains. Experimental results show that our method substantially boosts the strong PrLM baselines in four public benchmark datasets.
arXiv Detail & Related papers (2023-01-10T13:18:25Z)
Improving a sequence-to-sequence nlp model using a reinforcement learning policy algorithm [0.0]
Current neural network models of dialogue generation show great promise for generating answers for chatty agents. But they are short-sighted in that they predict utterances one at a time while disregarding their impact on future outcomes. This work commemorates a preliminary step toward developing a neural conversational model based on the long-term success of dialogues.
arXiv Detail & Related papers (2022-12-28T22:46:57Z)
DialAug: Mixing up Dialogue Contexts in Contrastive Learning for Robust Conversational Modeling [3.3578533367912025]
We propose a framework that incorporates augmented versions of a dialogue context into the learning objective. We show that our proposed augmentation method outperforms previous data augmentation approaches.
arXiv Detail & Related papers (2022-04-15T23:39:41Z)
Towards Robust Online Dialogue Response Generation [62.99904593650087]
We argue that this can be caused by a discrepancy between training and real-world testing. We propose a hierarchical sampling-based method consisting of both utterance-level sampling and semi-utterance-level sampling.
arXiv Detail & Related papers (2022-03-07T06:51:41Z)
Response Generation with Context-Aware Prompt Learning [19.340498579331555]
We present a novel approach for pre-trained dialogue modeling that casts the dialogue generation problem as a prompt-learning task. Instead of fine-tuning on limited dialogue data, our approach, DialogPrompt, learns continuous prompt embeddings optimized for dialogue contexts. Our approach significantly outperforms the fine-tuning baseline and the generic prompt-learning methods.
arXiv Detail & Related papers (2021-11-04T05:40:13Z)
Ranking Enhanced Dialogue Generation [77.8321855074999]
How to effectively utilize the dialogue history is a crucial problem in multi-turn dialogue generation. Previous works usually employ various neural network architectures to model the history. This paper proposes a Ranking Enhanced Dialogue generation framework.
arXiv Detail & Related papers (2020-08-13T01:49:56Z)
Modelling Hierarchical Structure between Dialogue Policy and Natural Language Generator with Option Framework for Task-oriented Dialogue System [49.39150449455407]
HDNO is an option framework for designing latent dialogue acts to avoid designing specific dialogue act representations. We test HDNO on MultiWoz 2.0 and MultiWoz 2.1, the datasets on multi-domain dialogues, in comparison with word-level E2E model trained with RL, LaRL and HDSA.
arXiv Detail & Related papers (2020-06-11T20:55:28Z)

This list is automatically generated from the titles and abstracts of the papers in this site.