Learning to Select Context in a Hierarchical and Global Perspective for
Open-domain Dialogue Generation
- URL: http://arxiv.org/abs/2102.09282v1
- Date: Thu, 18 Feb 2021 11:56:42 GMT
- Authors: Lei Shen, Haolan Zhan, Xin Shen, Yang Feng
- Abstract summary: We propose a novel model with hierarchical self-attention mechanism and distant supervision to detect relevant words and utterances in short and long distances.
Our model significantly outperforms other baselines in terms of fluency, coherence, and informativeness.
- Score: 15.01710843286394
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Open-domain multi-turn conversations mainly have three features:
hierarchical semantic structure, redundant information, and long-term
dependency. Given these features, selecting the relevant context becomes a
challenging step in multi-turn dialogue generation. However, existing methods
cannot differentiate useful words and utterances that lie at both short and
long distances from a response. Besides, previous work performs context
selection based only on a state in the decoder, which lacks global guidance
and may focus on irrelevant or unnecessary information. In this paper, we
propose a novel model with a hierarchical self-attention mechanism and
distant supervision that not only detects relevant words and utterances at
both short and long distances, but also discerns related information globally
when decoding. Experimental results of both automatic and human evaluations
on two public datasets show that our model significantly outperforms other
baselines in terms of fluency, coherence, and informativeness.
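The abstract pairs word-level attention inside each utterance with utterance-level attention across the dialogue. A minimal sketch of that hierarchical pattern is below; it uses plain NumPy with no learned projections, and the mean-pooling step and dimensions are illustrative assumptions, not the paper's actual architecture:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(X):
    """Single-head scaled dot-product self-attention over rows of X
    (no learned query/key/value projections, for illustration only)."""
    d = X.shape[-1]
    scores = X @ X.T / np.sqrt(d)
    return softmax(scores, axis=-1) @ X

def hierarchical_context(utterances):
    """Word-level attention inside each utterance, mean-pooled to one
    vector per utterance, then utterance-level attention across those
    vectors -- short- and long-distance context in two stages."""
    utt_vecs = np.stack([self_attention(U).mean(axis=0) for U in utterances])
    return self_attention(utt_vecs)

# Toy dialogue: 3 utterances of 5, 7, and 4 words, 16-dim embeddings.
rng = np.random.default_rng(0)
context = [rng.standard_normal((n, 16)) for n in (5, 7, 4)]
out = hierarchical_context(context)
print(out.shape)  # (3, 16): one context-aware vector per utterance
```

The two-stage structure is the point: the first pass scores words against words within an utterance, the second scores whole utterances against each other, so relevance can be detected at both granularities.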
Related papers
- CAST: Corpus-Aware Self-similarity Enhanced Topic modelling [16.562349140796115]
We introduce CAST: Corpus-Aware Self-similarity Enhanced Topic modelling, a novel topic modelling method.
We find self-similarity to be an effective metric to prevent functional words from acting as candidate topic words.
Our approach significantly enhances the coherence and diversity of generated topics, as well as the topic model's ability to handle noisy data.
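The self-similarity idea from the CAST summary can be sketched as the mean pairwise cosine similarity of one word's contextualized embeddings across its occurrences; the toy vectors and threshold-free comparison below are illustrative assumptions, not CAST's implementation:

```python
import numpy as np

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def self_similarity(embs):
    """Mean pairwise cosine similarity of one word's contextualized
    embeddings. Content words tend to keep a stable direction across
    contexts (high score); functional words drift with context (low
    score), so low-scoring words can be filtered from candidate topics."""
    sims = [cosine(a, b) for i, a in enumerate(embs) for b in embs[i + 1:]]
    return sum(sims) / len(sims)

# Toy vectors: a "stable" word keeps a similar direction across contexts,
# a "shifting" word does not.
stable = [np.array([1.0, 0.0]), np.array([0.99, 0.1])]
shifting = [np.array([1.0, 0.0]), np.array([0.0, 1.0])]
print(self_similarity(stable) > self_similarity(shifting))  # True
```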
arXiv Detail & Related papers (2024-10-19T15:27:11Z) - Revisiting Conversation Discourse for Dialogue Disentanglement [88.3386821205896]
We propose enhancing dialogue disentanglement by taking full advantage of the dialogue discourse characteristics.
We develop a structure-aware framework to integrate the rich structural features for better modeling the conversational semantic context.
Our work has great potential to facilitate broader multi-party multi-thread dialogue applications.
arXiv Detail & Related papers (2023-06-06T19:17:47Z) - Tri-level Joint Natural Language Understanding for Multi-turn Conversational Datasets [5.3361357265365035]
We present a novel tri-level joint natural language understanding approach that adds a domain level and explicitly exchanges semantic information between all levels.
We evaluate our model on two multi-turn datasets for which we are the first to conduct joint slot-filling and intent detection.
arXiv Detail & Related papers (2023-05-28T13:59:58Z) - Dialogue Term Extraction using Transfer Learning and Topological Data Analysis [0.8185867455104834]
We explore different features that can enable systems to discover realizations of domains, slots, and values in dialogues in a purely data-driven fashion.
To examine the utility of each feature set, we train a seed model based on the widely used MultiWOZ dataset.
Our method outperforms the previously proposed approach that relies solely on word embeddings.
arXiv Detail & Related papers (2022-08-22T17:04:04Z) - Self-Supervised Speech Representation Learning: A Review [105.1545308184483]
Self-supervised representation learning methods promise a single universal model that would benefit a wide variety of tasks and domains.
Speech representation learning is experiencing similar progress in three main categories: generative, contrastive, and predictive methods.
This review presents approaches for self-supervised speech representation learning and their connection to other research areas.
arXiv Detail & Related papers (2022-05-21T16:52:57Z) - Utterance Rewriting with Contrastive Learning in Multi-turn Dialogue [22.103162555263143]
We introduce contrastive learning and multi-task learning to jointly model the problem.
Our proposed model achieves state-of-the-art performance on several public datasets.
arXiv Detail & Related papers (2022-03-22T10:13:27Z) - Probing Task-Oriented Dialogue Representation from Language Models [106.02947285212132]
This paper investigates pre-trained language models to find out which model intrinsically carries the most informative representation for task-oriented dialogue tasks.
We fine-tune a feed-forward layer as the classifier probe on top of a fixed pre-trained language model with annotated labels in a supervised way.
arXiv Detail & Related papers (2020-10-26T21:34:39Z) - Topic-Aware Multi-turn Dialogue Modeling [91.52820664879432]
This paper presents a novel solution for multi-turn dialogue modeling, which segments and extracts topic-aware utterances in an unsupervised way.
Our topic-aware modeling is implemented by a newly proposed unsupervised topic-aware segmentation algorithm and Topic-Aware Dual-attention Matching (TADAM) Network.
arXiv Detail & Related papers (2020-09-26T08:43:06Z) - Learning an Unreferenced Metric for Online Dialogue Evaluation [53.38078951628143]
We propose an unreferenced automated evaluation metric that uses large pre-trained language models to extract latent representations of utterances.
We show that our model achieves higher correlation with human annotations in an online setting, while not requiring true responses for comparison during inference.
arXiv Detail & Related papers (2020-05-01T20:01:39Z) - Local-Global Video-Text Interactions for Temporal Grounding [77.5114709695216]
This paper addresses the problem of text-to-video temporal grounding, which aims to identify the time interval in a video semantically relevant to a text query.
We tackle this problem using a novel regression-based model that learns to extract a collection of mid-level features for semantic phrases in a text query.
The proposed method effectively predicts the target time interval by exploiting contextual information from local to global.
arXiv Detail & Related papers (2020-04-16T08:10:41Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of this list (including all information) and is not responsible for any consequences.