Focus-Constrained Attention Mechanism for CVAE-based Response Generation
- URL: http://arxiv.org/abs/2009.12102v1
- Date: Fri, 25 Sep 2020 09:38:59 GMT
- Title: Focus-Constrained Attention Mechanism for CVAE-based Response Generation
- Authors: Zhi Cui, Yanran Li, Jiayi Zhang, Jianwei Cui, Chen Wei, Bin Wang
- Abstract summary: The latent variable is supposed to capture the discourse-level information and encourage the informativeness of target responses.
We transform the coarse-grained discourse-level information into fine-grained word-level information.
Our model can generate more diverse and informative responses compared with several state-of-the-art models.
- Score: 27.701626908931267
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: To model diverse responses for a given post, one promising way is to
introduce a latent variable into Seq2Seq models. The latent variable is
supposed to capture the discourse-level information and encourage the
informativeness of target responses. However, such discourse-level information
is often too coarse for the decoder to utilize. To tackle this, our idea is
to transform the coarse-grained discourse-level information into fine-grained
word-level information. Specifically, we first measure the semantic
concentration of the corresponding target response on the post words by
introducing a fine-grained focus signal. Then, we propose a focus-constrained
attention mechanism that takes full advantage of this focus signal to align
the input well with the target response. The experimental results demonstrate that by exploiting the
fine-grained signal, our model can generate more diverse and informative
responses compared with several state-of-the-art models.
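As a rough illustration of the core idea, below is a minimal PyTorch sketch of an attention step biased by a word-level focus signal; in a CVAE setting, such a signal could be derived from the latent variable, turning the coarse discourse-level code into word-level guidance. The function name, the additive-bias formulation, and the `beta` hyper-parameter are illustrative assumptions, not the authors' exact mechanism.
```python
import torch
import torch.nn.functional as F

def focus_constrained_attention(decoder_state, encoder_states, focus_signal, beta=2.0):
    """Sketch of attention biased toward focused post words (assumed form).

    decoder_state:  (batch, hidden) current decoder hidden state
    encoder_states: (batch, src_len, hidden) encoded post words
    focus_signal:   (batch, src_len) in [0, 1], ~1 where the target response
                    semantically concentrates on the post word (hypothetical)
    beta:           assumed hyper-parameter scaling the focus bias
    """
    # Standard dot-product alignment scores between decoder state and post words.
    scores = torch.bmm(encoder_states, decoder_state.unsqueeze(-1)).squeeze(-1)  # (batch, src_len)
    # Constrain attention toward focused post words via an additive bias.
    scores = scores + beta * focus_signal
    attn = F.softmax(scores, dim=-1)                                   # (batch, src_len)
    context = torch.bmm(attn.unsqueeze(1), encoder_states).squeeze(1)  # (batch, hidden)
    return context, attn
```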
Related papers
- Towards Interpreting Language Models: A Case Study in Multi-Hop Reasoning [0.0]
Language models (LMs) struggle to perform multi-hop reasoning consistently.
We propose an approach to pinpoint and rectify multi-hop reasoning failures through targeted memory injections on LM attention heads.
arXiv Detail & Related papers (2024-11-06T16:30:26Z)
- Investigating Disentanglement in a Phoneme-level Speech Codec for Prosody Modeling [39.80957479349776]
We investigate the prosody modeling capabilities of the discrete space of an RVQ-VAE model, modifying it to operate at the phoneme level.
We show that the phoneme-level discrete latent representations achieve a high degree of disentanglement, capturing fine-grained prosodic information that is robust and transferable.
arXiv Detail & Related papers (2024-09-13T09:27:05Z)
- iSeg: An Iterative Refinement-based Framework for Training-free Segmentation [85.58324416386375]
We present an in-depth experimental analysis of iteratively refining the cross-attention map with the self-attention map.
We propose an effective iterative refinement framework for training-free segmentation, named iSeg.
Our proposed iSeg achieves an absolute gain of 3.8% in mIoU over the best existing training-free approach in the literature.
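The summary does not spell out the update rule; one common instantiation of refining a cross-attention map with a self-attention map is a repeated matrix product, sketched below. The shapes, iteration count, and renormalization step are assumptions for illustration, not necessarily iSeg's exact scheme.
```python
import torch

def iterative_refine(cross_attn, self_attn, num_iters=3):
    """Hypothetical sketch: refine cross-attention via self-attention.

    cross_attn: (num_pixels, num_classes) text-to-pixel cross-attention
    self_attn:  (num_pixels, num_pixels) pixel-to-pixel self-attention
    """
    refined = cross_attn
    for _ in range(num_iters):
        # Propagate class evidence between visually similar pixels.
        refined = self_attn @ refined
        # Renormalize so each pixel's class scores stay comparable.
        refined = refined / (refined.sum(dim=-1, keepdim=True) + 1e-8)
    return refined
```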
arXiv Detail & Related papers (2024-09-05T03:07:26Z)
- Towards Better Text-to-Image Generation Alignment via Attention Modulation [16.020834525343997]
We propose an attribution-focusing mechanism, a training-free, phase-wise method that modulates attention in diffusion models.
An object-focused masking scheme and a phase-wise dynamic weight control mechanism are integrated into the cross-attention modules.
The experimental results in various alignment scenarios demonstrate that our model attains better image-text alignment with minimal additional computational cost.
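A hedged sketch of what phase-wise attention modulation with an object-focused mask might look like; the linear schedule, weight values, and renormalization are illustrative assumptions, not the paper's exact scheme.
```python
import torch

def modulate_cross_attention(attn, token_mask, step, total_steps,
                             w_early=1.5, w_late=1.0):
    """Sketch: amplify attention to object tokens early in diffusion sampling.

    attn:       (batch, heads, pixels, tokens) cross-attention probabilities
    token_mask: (tokens,) 1.0 for object tokens, 0.0 otherwise (assumed input)
    """
    phase = step / total_steps                # 0 at start, 1 at end of sampling
    w = w_early + (w_late - w_early) * phase  # interpolate weight per phase
    modulated = attn * (1.0 + (w - 1.0) * token_mask)
    # Renormalize over the token axis so each row remains a distribution.
    return modulated / modulated.sum(dim=-1, keepdim=True)
```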
arXiv Detail & Related papers (2024-04-22T06:18:37Z)
- Integrating Self-supervised Speech Model with Pseudo Word-level Targets from Visually-grounded Speech Model [57.78191634042409]
We propose Pseudo-Word HuBERT (PW-HuBERT), a framework that integrates pseudo word-level targets into the training process.
Our experimental results on four spoken language understanding (SLU) benchmarks suggest the superiority of our model in capturing semantic information.
arXiv Detail & Related papers (2024-02-08T16:55:21Z)
- Robust Saliency-Aware Distillation for Few-shot Fine-grained Visual Recognition [57.08108545219043]
Recognizing novel sub-categories with scarce samples is an essential and challenging research topic in computer vision.
Existing literature addresses this challenge by employing local representation approaches.
This article proposes a novel model, Robust Saliency-aware Distillation (RSaD), for few-shot fine-grained visual recognition.
arXiv Detail & Related papers (2023-05-12T00:13:17Z)
- Advanced Conditional Variational Autoencoders (A-CVAE): Towards interpreting open-domain conversation generation via disentangling latent feature representation [15.742077523458995]
This paper proposes to equip the generative model with a priori knowledge through a cognitive approach involving mesoscopic-scale feature disentanglement.
We propose a new metric for open-domain dialogues, which can objectively evaluate the interpretability of the latent space distribution.
arXiv Detail & Related papers (2022-07-26T07:39:36Z)
- Bayesian Attention Belief Networks [59.183311769616466]
Attention-based neural networks have achieved state-of-the-art results on a wide range of tasks.
This paper introduces Bayesian attention belief networks, which construct a decoder network by modeling unnormalized attention weights.
We show that our method outperforms deterministic attention and state-of-the-art stochastic attention in accuracy, uncertainty estimation, generalization across domains, and robustness to adversarial attacks.
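As a rough sketch, attention over stochastic unnormalized weights can be illustrated with a log-normal reparameterization as below; the paper's actual choice of distributions and inference procedure may well differ.
```python
import torch

def stochastic_attention(scores, num_samples=8):
    """Sketch: treat unnormalized attention weights as positive random
    variables (log-normal here, an assumed choice) and normalize per sample.

    scores: (batch, seq_len) deterministic alignment scores used as the mean.
    """
    eps = torch.randn(num_samples, *scores.shape)  # (samples, batch, seq_len)
    unnormalized = torch.exp(scores + eps)         # positive stochastic weights
    attn = unnormalized / unnormalized.sum(dim=-1, keepdim=True)
    return attn.mean(dim=0)                        # average attention over samples
```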
arXiv Detail & Related papers (2021-06-09T17:46:22Z)
- DGA-Net Dynamic Gaussian Attention Network for Sentence Semantic Matching [52.661387170698255]
We propose a novel Dynamic Gaussian Attention Network (DGA-Net) to improve the attention mechanism.
We first leverage a pre-trained language model to encode the input sentences and construct semantic representations from a global perspective.
Finally, we develop a Dynamic Gaussian Attention (DGA) to dynamically capture the important parts and corresponding local contexts from a detailed perspective.
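A minimal sketch of Gaussian-modulated attention, assuming a predicted center position and window width; the exact DGA parameterization is not given in the summary above.
```python
import torch

def dynamic_gaussian_attention(scores, center, width, positions):
    """Sketch: re-weight attention scores with a Gaussian window so attention
    concentrates on a local context around a dynamically predicted position.

    scores:    (batch, seq_len) raw attention scores
    center:    (batch, 1) predicted focus position (e.g., from a small MLP)
    width:     scalar standard deviation of the window (assumed)
    positions: (seq_len,) token indices, e.g. torch.arange(seq_len)
    """
    # Gaussian penalty: far-away tokens receive a strongly negative bias.
    penalty = -((positions.unsqueeze(0) - center) ** 2) / (2 * width ** 2)
    return torch.softmax(scores + penalty, dim=-1)
```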
arXiv Detail & Related papers (2021-06-09T08:43:04Z)
- Enhancing Dialogue Generation via Multi-Level Contrastive Learning [57.005432249952406]
We propose a multi-level contrastive learning paradigm to model the fine-grained quality of the responses with respect to the query.
A Rank-aware Calibration (RC) network is designed to construct the multi-level contrastive optimization objectives.
We build a Knowledge Inference (KI) component to capture the keyword knowledge from the reference during training and exploit such information to encourage the generation of informative words.
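A generic sketch of a rank-aware contrastive objective over candidate responses, assuming candidates are sorted by descending quality; this is an illustration of the multi-level idea, not necessarily the RC network's exact loss.
```python
import torch
import torch.nn.functional as F

def rank_aware_contrastive_loss(query_emb, response_embs, margin=0.1):
    """Sketch: every higher-quality response should score above every
    lower-quality one against the query, by a margin.

    query_emb:     (dim,) query representation
    response_embs: (n, dim) candidates, assumed sorted by descending quality
    """
    scores = F.cosine_similarity(query_emb.unsqueeze(0), response_embs, dim=-1)  # (n,)
    loss = scores.new_zeros(())
    for i in range(len(scores) - 1):
        for j in range(i + 1, len(scores)):
            # Penalize pairs where the better response does not win by the margin.
            loss = loss + F.relu(margin - (scores[i] - scores[j]))
    return loss
```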
arXiv Detail & Related papers (2020-09-19T02:41:04Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.