Evaluating and Improving Context Attention Distribution on Multi-Turn
Response Generation using Self-Contained Distractions
- URL: http://arxiv.org/abs/2211.04943v1
- Date: Wed, 9 Nov 2022 15:12:20 GMT
- Title: Evaluating and Improving Context Attention Distribution on Multi-Turn
Response Generation using Self-Contained Distractions
- Authors: Yujie Xing and Jon Atle Gulla
- Abstract summary: We focus on an essential component of multi-turn generation-based conversational agents: context attention distribution.
To improve performance on this component, we propose an optimization strategy that employs self-contained distractions.
Our experiments on the Ubuntu chatlogs dataset show that models with comparable perplexity can be distinguished by their performance on context attention distribution.
- Score: 0.18275108630751835
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Despite the rapid progress of open-domain generation-based conversational
agents, most deployed systems treat dialogue contexts as single-turns, while
systems dealing with multi-turn contexts are less studied. There is a lack of a
reliable metric for evaluating multi-turn modelling, as well as an effective
solution for improving it. In this paper, we focus on an essential component of
multi-turn generation-based conversational agents: context attention
distribution, i.e. how systems distribute their attention over the dialogue's
context. To evaluate this component, we introduce a novel
attention-mechanism-based metric: DAS ratio. To improve performance on this
component, we propose an optimization strategy that employs self-contained
distractions. Our experiments on the Ubuntu chatlogs dataset show that models
with comparable perplexity can be distinguished by their performance on context
attention distribution. Our proposed optimization strategy improves both
non-hierarchical and hierarchical models on the proposed metric by about 10%
over the baselines.
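The abstract does not define the DAS ratio, but the quantity it builds on, how attention mass spreads over the turns of the dialogue context, can be illustrated. Below is a minimal, hypothetical Python sketch (not the paper's metric): given decoder-over-context attention weights and assumed turn boundaries, it computes the share of attention each context turn receives. The function name, turn spans, and toy data are all assumptions for illustration.

```python
# Hypothetical sketch (NOT the paper's DAS ratio, which is not defined here):
# measure how a model's attention mass is distributed over context turns.
import numpy as np

def per_turn_attention_share(attn, turn_spans):
    """attn: (num_generated_tokens, num_context_tokens) attention weights,
    each row summing to 1 (e.g. a softmax over the context tokens).
    turn_spans: assumed (start, end) token ranges, one per context turn.
    Returns the fraction of total attention mass falling on each turn."""
    per_token = attn.sum(axis=0)                      # mass on each context token
    shares = np.array([per_token[s:e].sum() for s, e in turn_spans])
    return shares / shares.sum()

# Toy example: 4 generated tokens attending over a 6-token, 2-turn context.
rng = np.random.default_rng(0)
attn = rng.random((4, 6))
attn /= attn.sum(axis=1, keepdims=True)               # normalise rows like softmax
print(per_turn_attention_share(attn, [(0, 3), (3, 6)]))
```

Under such a view, a model that ignores earlier turns would concentrate nearly all of its attention share on the final turn, plausibly the kind of imbalance the paper's distraction-based optimization is meant to reduce.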
Related papers
- Boosting CNN-based Handwriting Recognition Systems with Learnable Relaxation Labeling [48.78361527873024]
We propose a novel approach to handwriting recognition that integrates the strengths of two distinct methodologies.
We introduce a sparsification technique that accelerates the convergence of the algorithm and enhances the overall system's performance.
arXiv Detail & Related papers (2024-09-09T15:12:28Z) - Thread Detection and Response Generation using Transformers with Prompt
Optimisation [5.335657953493376]
This paper develops an end-to-end model that identifies threads and prioritises their response generation based on importance.
The model achieves up to a 10x speed improvement while generating more coherent results than existing models.
arXiv Detail & Related papers (2024-03-09T14:50:20Z) - Enhancing End-to-End Multi-Task Dialogue Systems: A Study on Intrinsic Motivation Reinforcement Learning Algorithms for Improved Training and Adaptability [1.0985060632689174]
This study investigates intrinsic motivation reinforcement learning algorithms.
We adapt techniques for random network distillation and curiosity-driven reinforcement learning to measure the frequency of state visits.
Experimental results on MultiWOZ, a heterogeneous dataset, show that intrinsic motivation-based dialogue systems outperform policies that depend on extrinsic incentives.
arXiv Detail & Related papers (2024-01-31T18:03:39Z) - Promptformer: Prompted Conformer Transducer for ASR [40.88399609719793]
We introduce a novel mechanism inspired by hyper-prompting to fuse textual context with acoustic representations in the attention mechanism.
Results on a test set with multi-turn interactions show that our method achieves 5.9% relative word error rate reduction (rWERR) over a strong baseline.
arXiv Detail & Related papers (2024-01-14T20:14:35Z) - DialAug: Mixing up Dialogue Contexts in Contrastive Learning for Robust
Conversational Modeling [3.3578533367912025]
We propose a framework that incorporates augmented versions of a dialogue context into the learning objective.
We show that our proposed augmentation method outperforms previous data augmentation approaches.
arXiv Detail & Related papers (2022-04-15T23:39:41Z) - Utterance Rewriting with Contrastive Learning in Multi-turn Dialogue [22.103162555263143]
We introduce contrastive learning and multi-task learning to jointly model the problem.
Our proposed model achieves state-of-the-art performance on several public datasets.
arXiv Detail & Related papers (2022-03-22T10:13:27Z) - Assessing Dialogue Systems with Distribution Distances [48.61159795472962]
We propose to measure the performance of a dialogue system by computing the distribution-wise distance between its generated conversations and real-world conversations.
Experiments on several dialogue corpora show that our proposed metrics correlate better with human judgments than existing metrics.
arXiv Detail & Related papers (2021-05-06T10:30:13Z) - Bayesian Attention Modules [65.52970388117923]
We propose a scalable version of attention that is easy to implement and optimize.
Our experiments show the proposed method brings consistent improvements over the corresponding baselines.
arXiv Detail & Related papers (2020-10-20T20:30:55Z) - Semantic Role Labeling Guided Multi-turn Dialogue ReWriter [63.07073750355096]
We propose to use semantic role labeling (SRL) to highlight the core semantic information of who did what to whom.
Experiments show that this information significantly improves a RoBERTa-based model that already outperforms previous state-of-the-art systems.
arXiv Detail & Related papers (2020-10-03T19:50:04Z) - Enhancing Dialogue Generation via Multi-Level Contrastive Learning [57.005432249952406]
We propose a multi-level contrastive learning paradigm to model the fine-grained quality of the responses with respect to the query.
A Rank-aware Calibration (RC) network is designed to construct the multi-level contrastive optimization objectives.
We build a Knowledge Inference (KI) component to capture the keyword knowledge from the reference during training and exploit such information to encourage the generation of informative words.
arXiv Detail & Related papers (2020-09-19T02:41:04Z) - Learning an Effective Context-Response Matching Model with
Self-Supervised Tasks for Retrieval-based Dialogues [88.73739515457116]
We introduce four self-supervised tasks including next session prediction, utterance restoration, incoherence detection and consistency discrimination.
We jointly train the PLM-based response selection model with these auxiliary tasks in a multi-task manner.
Experiment results indicate that the proposed auxiliary self-supervised tasks bring significant improvement for multi-turn response selection.
arXiv Detail & Related papers (2020-09-14T08:44:46Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.