Linear Attention Mechanism: An Efficient Attention for Semantic
Segmentation
- URL: http://arxiv.org/abs/2007.14902v3
- Date: Thu, 20 Aug 2020 05:43:22 GMT
- Title: Linear Attention Mechanism: An Efficient Attention for Semantic
Segmentation
- Authors: Rui Li, Jianlin Su, Chenxi Duan, Shunyi Zheng
- Abstract summary: The Linear Attention Mechanism approximates dot-product attention at much lower memory and computational cost.
Experiments on semantic segmentation demonstrate the effectiveness of the linear attention mechanism.
- Score: 2.9488233765621295
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, to remedy the quadratic memory and computational costs of
dot-product attention, we propose a Linear Attention Mechanism that approximates
dot-product attention at a fraction of those costs. The efficient design makes the
incorporation of attention mechanisms into neural networks more flexible and
versatile. Experiments conducted on semantic segmentation demonstrate the
effectiveness of the linear attention mechanism. Code is available at
https://github.com/lironui/Linear-Attention-Mechanism.
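A minimal sketch of the kind of linearization the abstract describes: replace the softmax kernel with a first-order Taylor approximation, exp(q . k) ≈ 1 + (q/||q||) . (k/||k||), so the key-value summary can be computed once and reused for every query, and cost grows linearly rather than quadratically with the number of positions N. This is an illustrative PyTorch sketch written for this summary, not the code in the linked repository; the function name, shapes, and details are assumptions.

import torch
import torch.nn.functional as F

def linear_attention(q, k, v, eps=1e-6):
    """Linearized attention sketch. q, k, v: (B, N, D) tensors; returns (B, N, D).

    Uses sim(q_i, k_j) = 1 + (q_i/||q_i||) . (k_j/||k_j||), which keeps all
    similarities non-negative and lets the sums over j be precomputed, so
    memory and compute scale as O(N * D^2) instead of O(N^2 * D).
    """
    q = F.normalize(q, dim=-1)                       # q_i / ||q_i||
    k = F.normalize(k, dim=-1)                       # k_j / ||k_j||

    v_sum = v.sum(dim=1, keepdim=True)               # (B, 1, D): sum_j v_j
    kv = torch.einsum('bnd,bne->bde', k, v)          # (B, D, D): sum_j k_j v_j^T
    k_sum = k.sum(dim=1)                             # (B, D):    sum_j k_j

    # Numerator: sum_j v_j + q_i^T (sum_j k_j v_j^T), for every position i at once.
    num = v_sum + torch.einsum('bnd,bde->bne', q, kv)
    # Denominator: N + q_i^T sum_j k_j.
    den = k.shape[1] + torch.einsum('bnd,bd->bn', q, k_sum)
    return num / (den.unsqueeze(-1) + eps)

# Shape check on random inputs (B=2, N=1024 positions, D=64 channels).
if __name__ == "__main__":
    q, k, v = (torch.randn(2, 1024, 64) for _ in range(3))
    print(linear_attention(q, k, v).shape)  # torch.Size([2, 1024, 64])

For a 256x256 feature map (N = 65,536 positions), standard dot-product attention would materialize an N x N matrix of roughly 4.3 billion entries, whereas the sketch above only ever holds D x D and D-sized summaries.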
Related papers
- Short-Long Convolutions Help Hardware-Efficient Linear Attention to Focus on Long Sequences [60.489682735061415]
We propose CHELA, which replaces state space models with short-long convolutions and implements linear attention in a divide-and-conquer manner.
Our experiments on the Long Range Arena benchmark and language modeling tasks demonstrate the effectiveness of the proposed method.
arXiv Detail & Related papers (2024-06-12T12:12:38Z)
- Interactive Multi-Head Self-Attention with Linear Complexity [60.112941134420204]
We show that cross-head interactions in the attention matrix enhance the information flow of the attention operation.
We propose an effective method to decompose the attention operation into query- and key-less components.
arXiv Detail & Related papers (2024-02-27T13:47:23Z)
- FAST: Factorizable Attention for Speeding up Transformers [1.3637227185793512]
We present a linearly scaling attention mechanism that maintains the full representation of the attention matrix without resorting to sparsification.
Results indicate that our attention mechanism has a robust performance and holds significant promise for diverse applications where self-attention is used.
arXiv Detail & Related papers (2024-02-12T18:59:39Z)
- Towards Robust Semantic Segmentation against Patch-based Attack via Attention Refinement [68.31147013783387]
We observe that the attention mechanism is vulnerable to patch-based adversarial attacks.
In this paper, we propose a Robust Attention Mechanism (RAM) to improve the robustness of the semantic segmentation model.
arXiv Detail & Related papers (2024-01-03T13:58:35Z)
- Linear Log-Normal Attention with Unbiased Concentration [3.034257650900382]
We study the self-attention mechanism by analyzing the distribution of the attention matrix and its concentration ability.
We propose instruments to measure these quantities and introduce a novel self-attention mechanism, Linear Log-Normal Attention.
Our experimental results on popular natural language benchmarks reveal that our proposed Linear Log-Normal Attention outperforms other linearized attention alternatives.
arXiv Detail & Related papers (2023-11-22T17:30:41Z)
- Improving Speech Emotion Recognition Through Focus and Calibration Attention Mechanisms [0.5994412766684842]
We identify misalignments between the attention and the signal amplitude in the existing multi-head self-attention.
We propose to use a Focus-Attention (FA) mechanism and a novel Calibration-Attention (CA) mechanism in combination with the multi-head self-attention.
By employing the CA mechanism, the network can modulate the information flow by assigning different weights to each attention head and improve the utilization of surrounding contexts.
arXiv Detail & Related papers (2022-08-21T08:04:22Z)
- Guiding Visual Question Answering with Attention Priors [76.21671164766073]
We propose to guide the attention mechanism using explicit linguistic-visual grounding.
This grounding is derived by connecting structured linguistic concepts in the query to their referents among the visual objects.
The resultant algorithm is capable of probing attention-based reasoning models, injecting relevant associative knowledge, and regulating the core reasoning process.
arXiv Detail & Related papers (2022-05-25T09:53:47Z)
- Multi-stage Attention ResU-Net for Semantic Segmentation of Fine-Resolution Remote Sensing Images [9.398340832493457]
We propose a Linear Attention Mechanism (LAM) to address the prohibitive cost of dot-product attention.
LAM is approximately equivalent to dot-product attention while being far more computationally efficient.
We design a Multi-stage Attention ResU-Net for semantic segmentation from fine-resolution remote sensing images.
arXiv Detail & Related papers (2020-11-29T07:24:21Z)
- Attention that does not Explain Away [54.42960937271612]
Models based on the Transformer architecture have achieved better accuracy than the ones based on competing architectures for a large set of tasks.
A unique feature of the Transformer is its universal application of a self-attention mechanism, which allows for free information flow at arbitrary distances.
We propose a doubly-normalized attention scheme that is simple to implement and provides theoretical guarantees for avoiding the "explaining away" effect.
arXiv Detail & Related papers (2020-09-29T21:05:39Z)
- Untangling tradeoffs between recurrence and self-attention in neural networks [81.30894993852813]
We present a formal analysis of how self-attention affects gradient propagation in recurrent networks.
We prove that it mitigates the problem of vanishing gradients when trying to capture long-term dependencies.
We propose a relevancy screening mechanism that allows for a scalable use of sparse self-attention with recurrence.
arXiv Detail & Related papers (2020-06-16T19:24:25Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.