CAT: Learning to Collaborate Channel and Spatial Attention from
Multi-Information Fusion
- URL: http://arxiv.org/abs/2212.06335v1
- Date: Tue, 13 Dec 2022 02:34:10 GMT
- Title: CAT: Learning to Collaborate Channel and Spatial Attention from
Multi-Information Fusion
- Authors: Zizhang Wu, Man Wang, Weiwei Sun, Yuchen Li, Tianhao Xu, Fan Wang,
Keke Huang
- Abstract summary: We propose a plug-and-play attention module, termed "CAT", which activates the Collaboration between spatial and channel Attentions based on learned Traits.
Specifically, we represent traits as trainable coefficients (i.e., colla-factors) to adaptively combine contributions of different attention modules.
Our CAT outperforms existing state-of-the-art attention mechanisms in object detection, instance segmentation, and image classification.
- Score: 23.72040577828098
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Channel and spatial attention mechanisms have proven to provide an evident
performance boost for deep convolutional neural networks (CNNs). Most existing
methods focus on one of the two or run them in parallel or in series, neglecting the
collaboration between the two attentions. To better establish the
feature interaction between the two types of attention, we propose a
plug-and-play attention module, which we term "CAT": activating the
Collaboration between spatial and channel Attentions based on learned Traits.
Specifically, we represent traits as trainable coefficients (i.e.,
colla-factors) to adaptively combine contributions of different attention
modules to better fit different image hierarchies and tasks. Moreover, in
addition to the global average pooling (GAP) and global maximum pooling (GMP)
operators, we propose global entropy pooling (GEP), an effective component for
suppressing noise signals by measuring the information disorder of feature
maps. We introduce a three-way pooling operation into attention modules and
apply the adaptive mechanism to fuse their outcomes. Extensive experiments on
MS COCO, Pascal VOC, CIFAR-100, and ImageNet show that our CAT outperforms
existing state-of-the-art attention mechanisms in object detection, instance
segmentation, and image classification. The model and code will be released
soon.
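The three-way pooling and the adaptive fusion described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation (the abstract says the code is yet to be released), and it rests on assumptions: GEP is taken here to be the Shannon entropy of a softmax over the spatial positions of one channel, and the colla-factors are modeled as three trainable logits normalized by a softmax. The function names (global_entropy_pool, fuse_pooled) and the parameter colla_logits are hypothetical.

```python
import math


def softmax(values):
    """Numerically stable softmax over a flat list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def flatten(channel):
    """Flatten one H x W feature map (list of rows) into a flat list."""
    return [v for row in channel for v in row]


def global_average_pool(channel):
    """GAP: mean activation of the channel."""
    flat = flatten(channel)
    return sum(flat) / len(flat)


def global_max_pool(channel):
    """GMP: largest activation of the channel."""
    return max(flatten(channel))


def global_entropy_pool(channel):
    """GEP (assumed form): Shannon entropy of the softmax-normalized map.

    A flat map gives the maximum entropy log(H * W); a map dominated by a
    single peak gives an entropy close to zero.
    """
    probs = softmax(flatten(channel))
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def fuse_pooled(channel, colla_logits):
    """Combine GAP, GMP, and GEP with softmax-normalized coefficients.

    colla_logits stands in for the trainable colla-factors (an assumption);
    in a real network they would be updated by gradient descent.
    """
    weights = softmax(colla_logits)
    pooled = [
        global_average_pool(channel),
        global_max_pool(channel),
        global_entropy_pool(channel),
    ]
    return sum(w * p for w, p in zip(weights, pooled))


if __name__ == "__main__":
    flat_map = [[1.0, 1.0], [1.0, 1.0]]
    peaked_map = [[0.0, 0.0], [0.0, 10.0]]
    print(global_entropy_pool(flat_map), global_entropy_pool(peaked_map))
    print(fuse_pooled(peaked_map, [0.0, 0.0, 0.0]))
```

In this sketch, equal logits weight the three pooled statistics equally; training would shift the weights toward whichever statistic suits the layer and task, which is the adaptive behavior the abstract attributes to the colla-factors.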
Related papers
- Holistic Prototype Attention Network for Few-Shot VOS [74.25124421163542]
Few-shot video object segmentation (FSVOS) aims to segment dynamic objects of unseen classes by resorting to a small set of support images.
We propose a holistic prototype attention network (HPAN) for advancing FSVOS.
arXiv Detail & Related papers (2023-07-16T03:48:57Z)
- Efficient Multi-Scale Attention Module with Cross-Spatial Learning [4.046170185945849]
A novel efficient multi-scale attention (EMA) module is proposed.
We focus on retaining the information of each channel and decreasing the computational overhead.
We conduct extensive ablation studies and experiments on image classification and object detection tasks.
arXiv Detail & Related papers (2023-05-23T00:35:47Z)
- Feedback Chain Network For Hippocampus Segmentation [59.74305660815117]
We propose a novel hierarchical feedback chain network for the hippocampus segmentation task.
The proposed approach achieves state-of-the-art performance on three publicly available datasets.
arXiv Detail & Related papers (2022-11-15T04:32:10Z)
- Deep Attention-guided Graph Clustering with Dual Self-supervision [49.040136530379094]
We propose a novel method, namely deep attention-guided graph clustering with dual self-supervision (DAGC).
We develop a dual self-supervision solution consisting of a soft self-supervision strategy with a triplet Kullback-Leibler divergence loss and a hard self-supervision strategy with a pseudo supervision loss.
Our method consistently outperforms state-of-the-art methods on six benchmark datasets.
arXiv Detail & Related papers (2021-11-10T06:53:03Z)
- An Attention Module for Convolutional Neural Networks [5.333582981327498]
We propose an attention module for convolutional neural networks by developing an AW-convolution.
Experiments on several datasets for image classification and object detection tasks show the effectiveness of our proposed attention module.
arXiv Detail & Related papers (2021-08-18T15:36:18Z)
- CaEGCN: Cross-Attention Fusion based Enhanced Graph Convolutional
Network for Clustering [51.62959830761789]
We propose a cross-attention based deep clustering framework, named Cross-Attention Fusion based Enhanced Graph Convolutional Network (CaEGCN).
CaEGCN contains four main modules: cross-attention fusion, a content auto-encoder, a graph convolutional auto-encoder, and a self-supervised model.
Experimental results on different types of datasets prove the superiority and robustness of the proposed CaEGCN.
arXiv Detail & Related papers (2021-01-18T05:21:59Z)
- Attention-Guided Network for Iris Presentation Attack Detection [13.875545441867137]
We propose attention-guided iris presentation attack detection (AG-PAD) to augment CNNs with attention mechanisms.
Experiments involving both a JHU-APL proprietary dataset and the benchmark LivDet-Iris-2017 dataset suggest that the proposed method achieves promising results.
arXiv Detail & Related papers (2020-10-23T19:23:51Z)
- Rotate to Attend: Convolutional Triplet Attention Module [21.228370317693244]
We present triplet attention, a novel method for computing attention weights using a three-branch structure.
Our method is simple as well as efficient and can be easily plugged into classic backbone networks as an add-on module.
We demonstrate the effectiveness of our method on various challenging tasks including image classification on ImageNet-1k and object detection on MSCOCO and PASCAL VOC datasets.
arXiv Detail & Related papers (2020-10-06T21:31:00Z)
- Single Image Super-Resolution via a Holistic Attention Network [87.42409213909269]
We propose a new holistic attention network (HAN) to model the holistic interdependencies among layers, channels, and positions.
The proposed HAN adaptively emphasizes hierarchical features by considering correlations among layers.
Experiments demonstrate that the proposed HAN performs favorably against the state-of-the-art single image super-resolution approaches.
arXiv Detail & Related papers (2020-08-20T04:13:15Z)
- Cascaded Human-Object Interaction Recognition [175.60439054047043]
We introduce a cascade architecture for multi-stage, coarse-to-fine HOI understanding.
At each stage, an instance localization network progressively refines HOI proposals and feeds them into an interaction recognition network.
With our carefully-designed human-centric relation features, these two modules work collaboratively towards effective interaction understanding.
arXiv Detail & Related papers (2020-03-09T17:05:04Z)
- Hybrid Multiple Attention Network for Semantic Segmentation in Aerial
Images [24.35779077001839]
We propose a novel attention-based framework named Hybrid Multiple Attention Network (HMANet) to adaptively capture global correlations.
We introduce a simple yet effective region shuffle attention (RSA) module to reduce feature redundancy and improve the efficiency of the self-attention mechanism.
arXiv Detail & Related papers (2020-01-09T07:47:51Z)
This list is automatically generated from the titles and abstracts of the papers in this site.