Strengthening Layer Interaction via Dynamic Layer Attention
- URL: http://arxiv.org/abs/2406.13392v1
- Date: Wed, 19 Jun 2024 09:35:14 GMT
- Title: Strengthening Layer Interaction via Dynamic Layer Attention
- Authors: Kaishen Wang, Xun Xia, Jian Liu, Zhang Yi, Tao He,
- Abstract summary: Existing layer attention methods achieve layer interaction on fixed feature maps in a static manner.
To restore the dynamic context representation capability of the attention mechanism, we propose a Dynamic Layer Attention architecture.
Experimental results demonstrate the effectiveness of the proposed DLA architecture, outperforming other state-of-the-art methods in image recognition and object detection tasks.
- Score: 12.341997220052486
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In recent years, employing layer attention to enhance interaction among hierarchical layers has proven to be a significant advancement in building network structures. In this paper, we delve into the distinction between layer attention and the general attention mechanism, noting that existing layer attention methods achieve layer interaction on fixed feature maps in a static manner. These static layer attention methods limit the ability for context feature extraction among layers. To restore the dynamic context representation capability of the attention mechanism, we propose a Dynamic Layer Attention (DLA) architecture. The DLA comprises dual paths, where the forward path utilizes an improved recurrent neural network block, named Dynamic Sharing Unit (DSU), for context feature extraction. The backward path updates features using these shared context representations. Finally, the attention mechanism is applied to these dynamically refreshed feature maps among layers. Experimental results demonstrate the effectiveness of the proposed DLA architecture, outperforming other state-of-the-art methods in image recognition and object detection tasks. Additionally, the DSU block has been evaluated as an efficient plugin in the proposed DLA architecture.The code is available at https://github.com/tunantu/Dynamic-Layer-Attention.
Related papers
- Adaptive Large Language Models By Layerwise Attention Shortcuts [46.76681147411957]
LLM-like setups allow the final layer to attend to all of the intermediate layers as it deems fit through the attention mechanism.
We showcase four different datasets, namely acoustic tokens, natural language, and symbolic music, and we achieve superior performance for GPT-like architecture.
arXiv Detail & Related papers (2024-09-17T03:46:01Z) - Hierarchical Graph Interaction Transformer with Dynamic Token Clustering for Camouflaged Object Detection [57.883265488038134]
We propose a hierarchical graph interaction network termed HGINet for camouflaged object detection.
The network is capable of discovering imperceptible objects via effective graph interaction among the hierarchical tokenized features.
Our experiments demonstrate the superior performance of HGINet compared to existing state-of-the-art methods.
arXiv Detail & Related papers (2024-08-27T12:53:25Z) - Task agnostic continual learning with Pairwise layer architecture [0.0]
We show that we can improve the continual learning performance by replacing the final layer of our networks with our pairwise interaction layer.
The networks using this architecture show competitive performance in MNIST and FashionMNIST-based continual image classification experiments.
arXiv Detail & Related papers (2024-05-22T13:30:01Z) - Dynamic Perceiver for Efficient Visual Recognition [87.08210214417309]
We propose Dynamic Perceiver (Dyn-Perceiver) to decouple the feature extraction procedure and the early classification task.
A feature branch serves to extract image features, while a classification branch processes a latent code assigned for classification tasks.
Early exits are placed exclusively within the classification branch, thus eliminating the need for linear separability in low-level features.
arXiv Detail & Related papers (2023-06-20T03:00:22Z) - Semantic-aware Texture-Structure Feature Collaboration for Underwater
Image Enhancement [58.075720488942125]
Underwater image enhancement has become an attractive topic as a significant technology in marine engineering and aquatic robotics.
We develop an efficient and compact enhancement network in collaboration with a high-level semantic-aware pretrained model.
We also apply the proposed algorithm to the underwater salient object detection task to reveal the favorable semantic-aware ability for high-level vision tasks.
arXiv Detail & Related papers (2022-11-19T07:50:34Z) - Dynamic Scene Deblurring Base on Continuous Cross-Layer Attention
Transmission [6.3482616879743885]
We introduce a new continuous cross-layer attention transmission (CCLAT) mechanism that can exploit hierarchical attention information from all the convolutional layers.
Taking RDAFB as the building block, we design an effective architecture for dynamic scene deblurring named RDAFNet.
Experiments on benchmark datasets show that the proposed model outperforms the state-of-the-art deblurring approaches.
arXiv Detail & Related papers (2022-06-23T04:55:13Z) - Modeling long-term interactions to enhance action recognition [81.09859029964323]
We propose a new approach to under-stand actions in egocentric videos that exploits the semantics of object interactions at both frame and temporal levels.
We use a region-based approach that takes as input a primary region roughly corresponding to the user hands and a set of secondary regions potentially corresponding to the interacting objects.
The proposed approach outperforms the state-of-the-art in terms of action recognition on standard benchmarks.
arXiv Detail & Related papers (2021-04-23T10:08:15Z) - Learning to Compose Hypercolumns for Visual Correspondence [57.93635236871264]
We introduce a novel approach to visual correspondence that dynamically composes effective features by leveraging relevant layers conditioned on the images to match.
The proposed method, dubbed Dynamic Hyperpixel Flow, learns to compose hypercolumn features on the fly by selecting a small number of relevant layers from a deep convolutional neural network.
arXiv Detail & Related papers (2020-07-21T04:03:22Z) - Global Context-Aware Progressive Aggregation Network for Salient Object
Detection [117.943116761278]
We propose a novel network named GCPANet to integrate low-level appearance features, high-level semantic features, and global context features.
We show that the proposed approach outperforms the state-of-the-art methods both quantitatively and qualitatively.
arXiv Detail & Related papers (2020-03-02T04:26:10Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.