Sparse Spatial Attention Network for Semantic Segmentation
- URL: http://arxiv.org/abs/2109.01915v1
- Date: Sat, 4 Sep 2021 18:41:05 GMT
- Title: Sparse Spatial Attention Network for Semantic Segmentation
- Authors: Mengyu Liu and Hujun Yin
- Abstract summary: The spatial attention mechanism captures long-range dependencies by aggregating global contextual information to each query location.
We present a sparse spatial attention network (SSANet) to improve the efficiency of the spatial attention mechanism without sacrificing the performance.
- Score: 11.746833714322156
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The spatial attention mechanism captures long-range dependencies by
aggregating global contextual information to each query location, which is
beneficial for semantic segmentation. In this paper, we present a sparse
spatial attention network (SSANet) to improve the efficiency of the spatial
attention mechanism without sacrificing the performance. Specifically, a sparse
non-local (SNL) block is proposed to sample a subset of key and value elements
for each query element to capture long-range relations adaptively and generate
a sparse affinity matrix to aggregate contextual information efficiently.
Experimental results show that the proposed approach outperforms other context
aggregation methods and achieves state-of-the-art performance on the
Cityscapes, PASCAL Context and ADE20K datasets.
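The core idea of the SNL block — computing attention for each query over only a sampled subset of key/value positions, so the affinity matrix is sparse — can be illustrated with a minimal NumPy sketch. This is not the paper's implementation: SSANet samples the subset adaptively (learned), whereas the sketch below uses uniform random sampling as a stand-in, and all names and shapes are illustrative.

```python
import numpy as np

def sparse_nonlocal_attention(x, num_samples=4, seed=0):
    """Sketch of a sparse non-local (SNL) style attention step.

    x: (N, C) array of N spatial positions with C channels.
    For each query position, `num_samples` key/value positions are
    sampled (uniformly at random here; the paper learns the sampling
    adaptively), a sparse affinity row is computed by softmax over the
    sampled keys, and context is aggregated from the sampled values only.
    """
    rng = np.random.default_rng(seed)
    n, c = x.shape
    out = np.empty_like(x)
    for q in range(n):
        idx = rng.choice(n, size=num_samples, replace=False)  # sampled key/value positions
        logits = x[idx] @ x[q] / np.sqrt(c)                   # scaled dot-product affinities
        w = np.exp(logits - logits.max())
        w /= w.sum()                                          # sparse affinity row (sums to 1)
        out[q] = w @ x[idx]                                   # aggregate the sampled values
    return out

# Cost per query is O(num_samples * C) instead of O(N * C) for dense non-local attention.
feat = np.random.default_rng(1).normal(size=(16, 8))
ctx = sparse_nonlocal_attention(feat, num_samples=4)
print(ctx.shape)
```

Because each affinity row is a convex combination over the sampled positions, the aggregation cost scales with the number of samples rather than with the full spatial resolution, which is the efficiency gain the abstract claims.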
Related papers
- Learning Spatial-Semantic Features for Robust Video Object Segmentation [108.045326229865]
We propose a robust video object segmentation framework equipped with spatial-semantic features and discriminative object queries.
We show that the proposed method set a new state-of-the-art performance on multiple datasets.
arXiv Detail & Related papers (2024-07-10T15:36:00Z)
- Semantic Equitable Clustering: A Simple and Effective Strategy for Clustering Vision Tokens [57.37893387775829]
We introduce a fast and balanced clustering method, named Semantic Equitable Clustering (SEC).
SEC clusters tokens based on their global semantic relevance in an efficient, straightforward manner.
We propose a versatile vision backbone, SECViT, to serve as a vision language connector.
arXiv Detail & Related papers (2024-05-22T04:49:00Z)
- Cascaded Sparse Feature Propagation Network for Interactive Segmentation [18.584007891618096]
We propose a cascaded sparse feature propagation network that learns a click-augmented feature representation for propagating user-provided information to unlabeled regions.
We validate the effectiveness of our method through comprehensive experiments on various benchmarks, and the results demonstrate the superior performance of our approach.
arXiv Detail & Related papers (2022-03-10T03:47:24Z)
- Learning to Aggregate Multi-Scale Context for Instance Segmentation in Remote Sensing Images [28.560068780733342]
A novel context aggregation network (CATNet) is proposed to improve the feature extraction process.
The proposed model exploits three lightweight plug-and-play modules, namely dense feature pyramid network (DenseFPN), spatial context pyramid (SCP), and hierarchical region of interest extractor (HRoIE).
arXiv Detail & Related papers (2021-11-22T08:55:25Z)
- Unveiling the Potential of Structure-Preserving for Weakly Supervised Object Localization [71.79436685992128]
We propose a two-stage approach, termed structure-preserving activation (SPA), towards fully leveraging the structure information incorporated in convolutional features for WSOL.
In the first stage, a restricted activation module (RAM) is designed to alleviate the structure-missing issue caused by the classification network.
In the second stage, we propose a post-process approach, termed self-correlation map generating (SCG) module to obtain structure-preserving localization maps.
arXiv Detail & Related papers (2021-03-08T03:04:14Z)
- Local Context Attention for Salient Object Segmentation [5.542044768017415]
We propose a novel Local Context Attention Network (LCANet) to generate locally reinforced feature maps in a uniform representational architecture.
The proposed network introduces an Attentional Correlation Filter (ACF) module to generate explicit local attention by calculating the correlation feature map between coarse prediction and global context.
Comprehensive experiments are conducted on several salient object segmentation datasets, demonstrating the superior performance of the proposed LCANet against the state-of-the-art methods.
arXiv Detail & Related papers (2020-09-24T09:20:06Z)
- Mixup-CAM: Weakly-supervised Semantic Segmentation via Uncertainty Regularization [73.03956876752868]
We propose a principled and end-to-end trainable framework that allows the network to pay attention to other parts of the object.
Specifically, we introduce the mixup data augmentation scheme into the classification network and design two uncertainty regularization terms to better interact with the mixup strategy.
arXiv Detail & Related papers (2020-08-03T21:19:08Z)
- Weakly-Supervised Semantic Segmentation via Sub-category Exploration [73.03956876752868]
We propose a simple yet effective approach to enforce the network to pay attention to other parts of an object.
Specifically, we perform clustering on image features to generate pseudo sub-categories labels within each annotated parent class.
We conduct extensive analysis to validate the proposed method and show that our approach performs favorably against the state-of-the-art approaches.
arXiv Detail & Related papers (2020-08-03T20:48:31Z)
- Learning to Predict Context-adaptive Convolution for Semantic Segmentation [66.27139797427147]
Long-range contextual information is essential for achieving high-performance semantic segmentation.
We propose a Context-adaptive Convolution Network (CaC-Net) to predict a spatially-varying feature weighting vector.
Our CaC-Net achieves superior segmentation performance on three public datasets.
arXiv Detail & Related papers (2020-04-17T13:09:17Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.