SDA-$x$Net: Selective Depth Attention Networks for Adaptive Multi-scale
Feature Representation
- URL: http://arxiv.org/abs/2209.10327v1
- Date: Wed, 21 Sep 2022 12:49:55 GMT
- Title: SDA-$x$Net: Selective Depth Attention Networks for Adaptive Multi-scale
Feature Representation
- Authors: Qingbei Guo, Xiao-Jun Wu, Zhiquan Feng, Tianyang Xu and Cong Hu
- Abstract summary: Existing multi-scale solutions risk merely increasing receptive field sizes while neglecting small receptive fields.
We introduce a new attention dimension, i.e., depth, in addition to existing attention dimensions such as channel, spatial, and branch.
We present a novel selective depth attention network to symmetrically handle multi-scale objects in various vision tasks.
- Score: 14.7929472540577
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Existing multi-scale solutions risk merely increasing receptive
field sizes while neglecting small receptive fields. Thus, it is a
challenging problem to effectively construct adaptive neural networks for
recognizing various spatial-scale objects. To tackle this issue, we first
introduce a new attention dimension, i.e., depth, in addition to existing
attention dimensions such as channel, spatial, and branch, and present a novel
selective depth attention network to symmetrically handle multi-scale objects
in various vision tasks. Specifically, the blocks within each stage of a given
neural network, i.e., ResNet, output hierarchical feature maps sharing the same
resolution but with different receptive field sizes. Based on this structural
property, we design a stage-wise building module, namely SDA, which includes a
trunk branch and an SE-like attention branch. The block outputs of the trunk
branch are fused to globally guide their depth attention allocation through the
attention branch. According to the proposed attention mechanism, we can
dynamically select different depth features, which contributes to adaptively
adjusting the receptive field sizes for the variable-sized input objects. In
this way, the cross-block information interaction leads to a long-range
dependency along the depth direction. Compared with other multi-scale
approaches, our SDA method combines multiple receptive fields from previous
blocks into the stage output, thus offering a wider and richer range of
effective receptive fields. Moreover, our method can serve as a pluggable
module for other multi-scale networks as well as attention networks, coined
SDA-$x$Net. Their combination further extends the range of the effective
receptive fields towards small receptive fields, enabling interpretable neural
networks. Our source code is available at
\url{https://github.com/QingbeiGuo/SDA-xNet.git}.
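As an illustration of the stage-wise design described above, the following is a minimal Python/PyTorch-style sketch of how such an SDA module could be organized, assuming the block outputs are fused by summation and re-weighted by a softmax over per-block (depth) attention weights. The class name SDAStage and its arguments are hypothetical and not taken from the authors' repository.

import torch
import torch.nn as nn

class SDAStage(nn.Module):
    """Wraps the blocks of one ResNet stage (trunk branch) and re-weights
    their outputs along the depth dimension with an SE-like attention branch."""

    def __init__(self, blocks, channels, reduction=16):
        super().__init__()
        self.blocks = nn.ModuleList(blocks)   # trunk branch: the stage's residual blocks
        num_blocks = len(self.blocks)
        # SE-like attention branch: squeeze the fused features, then predict
        # one attention weight per block, i.e., per depth level.
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, num_blocks),
        )
        self.softmax = nn.Softmax(dim=1)

    def forward(self, x):
        # Collect the hierarchical outputs of the trunk branch; within a stage
        # they share the same resolution but have growing receptive fields.
        feats = []
        for block in self.blocks:
            x = block(x)
            feats.append(x)
        stacked = torch.stack(feats, dim=1)        # (B, D, C, H, W)
        # Fuse the block outputs to globally guide depth attention allocation.
        fused = stacked.sum(dim=1)                 # (B, C, H, W)
        squeezed = self.pool(fused).flatten(1)     # (B, C)
        weights = self.softmax(self.fc(squeezed))  # (B, D), one weight per depth
        # Select depth features: a weighted combination over the depth axis
        # adapts the effective receptive field to the input.
        return (stacked * weights[:, :, None, None, None]).sum(dim=1)

A hypothetical usage would wrap the blocks of an existing stage, e.g. SDAStage(list(resnet.layer2), channels=128) for a ResNet-18; the authors' released code may fuse and weight the features differently, so the linked repository should be consulted for the exact design.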
Related papers
- MSA$^2$Net: Multi-scale Adaptive Attention-guided Network for Medical Image Segmentation [8.404273502720136]
We introduce MSA$^2$Net, a new deep segmentation framework featuring an expedient design of skip-connections.
We propose a Multi-Scale Adaptive Spatial Attention Gate (MASAG) to ensure that spatially relevant features are selectively highlighted.
Our MSA$^2$Net outperforms or matches state-of-the-art (SOTA) methods.
arXiv Detail & Related papers (2024-07-31T14:41:10Z)
- Poly Kernel Inception Network for Remote Sensing Detection [64.60749113583601]
We introduce the Poly Kernel Inception Network (PKINet) to handle the challenges of object detection in remote sensing images.
PKINet employs multi-scale convolution kernels without dilation to extract object features of varying scales and capture local context.
These two components work jointly to advance the performance of PKINet on four challenging remote sensing detection benchmarks.
arXiv Detail & Related papers (2024-03-10T16:56:44Z)
- Densely Decoded Networks with Adaptive Deep Supervision for Medical Image Segmentation [19.302294715542175]
We propose densely decoded networks (ddn) by selectively introducing 'crutch' network connections.
Such 'crutch' connections in each upsampling stage of the network decoder enhance target localization.
We also present a training strategy based on adaptive deep supervision (ads), which exploits and adapts specific attributes of the input dataset.
arXiv Detail & Related papers (2024-02-05T00:44:57Z)
- AF$_2$: Adaptive Focus Framework for Aerial Imagery Segmentation [86.44683367028914]
Aerial imagery segmentation has some unique challenges, the most critical of which is foreground-background imbalance.
We propose the Adaptive Focus Framework (AF$_2$), which adopts a hierarchical segmentation procedure and focuses on adaptively utilizing multi-scale representations.
AF$_2$ significantly improves accuracy on three widely used aerial benchmarks while remaining as fast as mainstream methods.
arXiv Detail & Related papers (2022-02-18T10:14:45Z)
- Domain Attention Consistency for Multi-Source Domain Adaptation [100.25573559447551]
The key design is a feature channel attention module, which aims to identify transferable features (attributes).
Experiments on three MSDA benchmarks show that our DAC-Net achieves new state of the art performance on all of them.
arXiv Detail & Related papers (2021-11-06T15:56:53Z)
- Attention Cube Network for Image Restoration [39.49175636499541]
We propose an attention cube network (A-CubeNet) for image restoration, aiming at more powerful feature expression and feature correlation learning.
We design a novel attention mechanism from three dimensions, namely spatial dimension, channel-wise dimension and hierarchical dimension.
Experiments demonstrate the superiority of our method over state-of-the-art image restoration methods in both quantitative comparison and visual analysis.
arXiv Detail & Related papers (2020-09-13T03:42:14Z)
- Channel-wise Alignment for Adaptive Object Detection [66.76486843397267]
Generic object detection has been immensely promoted by the development of deep convolutional neural networks.
Existing methods for this task usually focus on high-level alignment based on the whole image or object of interest.
In this paper, we realize adaptation from a thoroughly different perspective, i.e., channel-wise alignment.
arXiv Detail & Related papers (2020-09-07T02:42:18Z)
- Automated Search for Resource-Efficient Branched Multi-Task Networks [81.48051635183916]
We propose a principled approach, rooted in differentiable neural architecture search, to automatically define branching structures in a multi-task neural network.
We show that our approach consistently finds high-performing branching structures within limited resource budgets.
arXiv Detail & Related papers (2020-08-24T09:49:19Z)
- Feature-Dependent Cross-Connections in Multi-Path Neural Networks [7.230526683545722]
Multi-path networks tend to learn redundant features.
We introduce a mechanism to intelligently allocate incoming feature maps to such paths.
We show improved image recognition accuracy at a similar complexity compared to conventional and state-of-the-art methods.
arXiv Detail & Related papers (2020-06-24T17:38:03Z)
- Multi-Subspace Neural Network for Image Recognition [33.61205842747625]
In image classification tasks, feature extraction is always a major issue. Intra-class variability increases the difficulty of designing the extractors.
Recently, deep learning has drawn lots of attention on automatically learning features from data.
In this study, we propose a multi-subspace neural network (MSNN), which integrates a key component of the convolutional neural network (CNN), the receptive field, with the subspace concept.
arXiv Detail & Related papers (2020-06-17T02:55:34Z)
- Hold me tight! Influence of discriminative features on deep network boundaries [63.627760598441796]
We propose a new perspective that relates dataset features to the distance of samples to the decision boundary.
This enables us to carefully tweak the position of the training samples and measure the induced changes on the boundaries of CNNs trained on large-scale vision datasets.
arXiv Detail & Related papers (2020-02-15T09:29:36Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it provides and is not responsible for any consequences of its use.