Improving Attention Mechanism with Query-Value Interaction
- URL: http://arxiv.org/abs/2010.03766v1
- Date: Thu, 8 Oct 2020 05:12:52 GMT
- Title: Improving Attention Mechanism with Query-Value Interaction
- Authors: Chuhan Wu, Fangzhao Wu, Tao Qi, Yongfeng Huang
- Abstract summary: We propose a query-value interaction function which can learn query-aware attention values.
Our approach can consistently improve the performance of many attention-based models.
- Score: 92.67156911466397
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The attention mechanism plays a critical role in various
state-of-the-art NLP models such as Transformer and BERT. It can be
formulated as a ternary function that maps input queries, keys, and values
to an output: a summation of the values, weighted by attention weights
derived from the interactions between queries and keys. Similar to
query-key interactions, there is also inherent relatedness between queries
and values, and incorporating query-value interactions has the potential to
enhance the output by learning values customized to the characteristics of
the queries. However, existing attention methods ignore query-value
interactions, which may not be optimal. In this paper, we propose to
improve the existing attention mechanism by incorporating query-value
interactions. We propose a query-value interaction function that learns
query-aware values and combines them with the original values and attention
weights to form the final output. Extensive experiments on four datasets
for different tasks show that our approach consistently improves the
performance of many attention-based models.
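To make the formulation concrete, here is a minimal sketch of scaled dot-product attention with a query-value interaction added on top. The interaction function below (conditioning values on a pooled query context and mixing them with the original values) is an illustrative assumption, since the abstract does not specify the exact form; `W_qv` is a hypothetical learned projection.

```python
import torch
import torch.nn.functional as F

def qv_attention(Q, K, V, W_qv):
    """Scaled dot-product attention plus a query-value interaction.

    Q, K, V: (batch, n, d) queries, keys, and values.
    W_qv:    (2 * d, d) learned projection for the interaction (assumed).
    """
    d = Q.size(-1)

    # Standard attention: weights from query-key interactions.
    weights = F.softmax(Q @ K.transpose(-2, -1) / d ** 0.5, dim=-1)

    # Hypothetical query-value interaction: make each value "query-aware"
    # by conditioning it on a pooled query representation (kept linear in
    # sequence length for simplicity; the paper's function may differ).
    q_ctx = Q.mean(dim=1, keepdim=True).expand_as(V)
    v_aware = torch.tanh(torch.cat([V, q_ctx], dim=-1) @ W_qv)

    # Combine the query-aware values with the original values and the
    # attention weights to form the final output.
    return weights @ (V + v_aware)

# Example: batch of 2 sequences, length 5, dimension 16.
Q = K = V = torch.randn(2, 5, 16)
out = qv_attention(Q, K, V, W_qv=torch.randn(32, 16))  # shape (2, 5, 16)
```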
Related papers
- Interactive Multi-Head Self-Attention with Linear Complexity [60.112941134420204]
We show that cross-head interactions in the attention matrix enhance the information flow of the attention operation.
We propose an effective method to decompose the attention operation into query- and key-less components.
arXiv Detail & Related papers (2024-02-27T13:47:23Z)
- JPAVE: A Generation and Classification-based Model for Joint Product Attribute Prediction and Value Extraction [59.94977231327573]
We propose a multi-task learning model with value generation/classification and attribute prediction called JPAVE.
Two variants of our model are designed for open-world and closed-world scenarios.
Experimental results on a public dataset demonstrate the superiority of our model compared with strong baselines.
arXiv Detail & Related papers (2023-11-07T18:36:16Z)
- A Question-Answering Approach to Key Value Pair Extraction from Form-like Document Images [8.73248722579337]
We present a new question-answering (QA) based key-value pair extraction approach, called KVPFormer.
We propose a coarse-to-fine answer prediction approach to achieve higher answer prediction accuracy.
Our proposed KVPFormer achieves state-of-the-art results on the FUNSD and XFUND datasets, outperforming the previous best-performing method by 7.2% and 13.2% in F1 score, respectively.
arXiv Detail & Related papers (2023-04-17T02:55:31Z)
- Query-Utterance Attention with Joint modeling for Query-Focused Meeting Summarization [4.763356598070365]
We propose a query-aware framework that jointly models tokens and utterances based on Query-Utterance Attention.
We show that the query relevance of different granularities contributes to generating a summary more related to the query.
arXiv Detail & Related papers (2023-03-08T10:21:45Z)
- Compositional Attention: Disentangling Search and Retrieval [66.7108739597771]
Multi-head, key-value attention is the backbone of the Transformer model and its variants.
Standard attention heads learn a rigid mapping between search and retrieval.
We propose a novel attention mechanism, called Compositional Attention, that replaces the standard head structure.
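One way to read this disentangling in code: keep a single "search" (query-key attention) but let each position softly select among several candidate "retrievals" (value projections), instead of hard-wiring one value projection per head. The sketch below illustrates the idea under that assumption; it is not the paper's exact equations, and `sel_q` is a hypothetical selection query.

```python
import torch
import torch.nn.functional as F

def compositional_attention(Q, K, values, sel_q):
    """Sketch: separate search (where to attend) from retrieval (what to
    read out). `values` holds R candidate value projections; a soft
    selection replaces the fixed head-to-value pairing.
    """
    d = Q.size(-1)
    # Search: ordinary query-key attention weights.
    attn = F.softmax(Q @ K.transpose(-2, -1) / d ** 0.5, dim=-1)
    # Apply the same search to every candidate retrieval: (b, n, R, d).
    retrieved = torch.stack([attn @ V for V in values], dim=2)
    # Retrieval selection: attend over the R candidates at each position.
    scores = (retrieved * sel_q.unsqueeze(2)).sum(-1) / d ** 0.5  # (b, n, R)
    sel = F.softmax(scores, dim=-1).unsqueeze(-1)
    return (sel * retrieved).sum(dim=2)  # (b, n, d)
```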
arXiv Detail & Related papers (2021-10-18T15:47:38Z)
- Relation-aware Heterogeneous Graph for User Profiling [24.076585294260816]
We propose to leverage the relation-aware heterogeneous graph method for user profiling.
We adopt the query, key, and value mechanism in a transformer fashion for heterogeneous message passing.
We conduct experiments on two real-world e-commerce datasets and observe a significant performance boost of our approach.
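As a rough illustration of transformer-style query/key/value message passing on a heterogeneous graph: queries come from the destination node type, keys and values from the source node type, and attention is restricted to graph neighbors. The relation-specific projections here are assumptions; the summary above does not give the paper's exact parameterization.

```python
import torch
import torch.nn.functional as F

def relation_message_passing(h_dst, h_src, adj, W_q, W_k, W_v):
    """Attention-based message passing for one relation type (assumed form).

    h_dst: (n_dst, d) destination-node features (e.g. users).
    h_src: (n_src, d) source-node features (e.g. items).
    adj:   (n_dst, n_src) 0/1 adjacency; each destination is assumed to
           have at least one neighbor, so no softmax row is all -inf.
    """
    d = h_dst.size(-1)
    Q, K, V = h_dst @ W_q, h_src @ W_k, h_src @ W_v
    scores = (Q @ K.T) / d ** 0.5
    scores = scores.masked_fill(adj == 0, float("-inf"))  # neighbors only
    return F.softmax(scores, dim=-1) @ V  # aggregated messages per node
```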
arXiv Detail & Related papers (2021-10-14T06:59:30Z)
- Neural Graph Matching based Collaborative Filtering [13.086302251856756]
We identify two different types of attribute interactions, inner and cross interactions.
Existing models do not distinguish these two types of attribute interactions.
We propose a neural Graph Matching based Collaborative Filtering model (GMCF).
Our model outperforms state-of-the-art models.
arXiv Detail & Related papers (2021-05-10T01:51:46Z)
- DCR-Net: A Deep Co-Interactive Relation Network for Joint Dialog Act Recognition and Sentiment Classification [77.59549450705384]
In dialog systems, dialog act recognition and sentiment classification are two correlated tasks.
Most existing systems either treat them as separate tasks or jointly model them without explicitly capturing their interaction.
We propose a Deep Co-Interactive Relation Network (DCR-Net) to explicitly consider the cross-impact and model the interaction between the two tasks.
arXiv Detail & Related papers (2020-08-16T14:13:32Z)
- Self-Attention Attribution: Interpreting Information Interactions Inside Transformer [89.21584915290319]
We propose a self-attention attribution method to interpret the information interactions inside Transformer.
We show that the attribution results can be used as adversarial patterns to implement non-targeted attacks towards BERT.
arXiv Detail & Related papers (2020-04-23T14:58:22Z)