Activation Modulation and Recalibration Scheme for Weakly Supervised
Semantic Segmentation
- URL: http://arxiv.org/abs/2112.08996v1
- Date: Thu, 16 Dec 2021 16:26:14 GMT
- Title: Activation Modulation and Recalibration Scheme for Weakly Supervised
Semantic Segmentation
- Authors: Jie Qin, Jie Wu, Xuefeng Xiao, Lujun Li, Xingang Wang
- Abstract summary: We propose a novel activation modulation and recalibration scheme for weakly supervised semantic segmentation.
We show that AMR establishes a new state-of-the-art performance on the PASCAL VOC 2012 dataset.
Experiments also reveal that our scheme is plug-and-play and can be incorporated with other approaches to boost their performance.
- Score: 24.08326440298189
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Image-level weakly supervised semantic segmentation (WSSS) is a fundamental
yet challenging computer vision task facilitating scene understanding and
autonomous driving. Most existing methods resort to classification-based Class
Activation Maps (CAMs) to serve as the initial pseudo labels, which tend to
focus on the most discriminative image regions and lack characteristics
customized for the segmentation task. To alleviate this issue, we propose a novel
activation modulation and recalibration (AMR) scheme, which leverages a
spotlight branch and a compensation branch to obtain weighted CAMs that can
provide recalibration supervision and task-specific concepts. Specifically, an
attention modulation module (AMM) is employed to rearrange the distribution of
feature importance from a channel-spatial sequential perspective, which helps
explicitly model channel-wise interdependencies and spatial encodings to
adaptively modulate segmentation-oriented activation responses. Furthermore, we
introduce cross pseudo supervision for the dual branches, which can be regarded
as a semantic-similarity regularization that mutually refines the two branches. Extensive
experiments show that AMR establishes a new state-of-the-art performance on the
PASCAL VOC 2012 dataset, surpassing not only current methods trained with
image-level supervision but also some methods relying on stronger supervision,
such as saliency labels. Experiments also reveal that our scheme is
plug-and-play and can be incorporated with other approaches to boost their
performance.
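To make the two mechanisms in the abstract concrete, the sketch below gives one plausible PyTorch reading of them: a CBAM-style channel-then-spatial attention module standing in for AMM, and a symmetric consistency loss standing in for the cross pseudo supervision between the spotlight and compensation branches. The module structure, layer sizes, and the choice of an MSE penalty are assumptions for illustration, not the authors' released code.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class ChannelAttention(nn.Module):
    """Model channel-wise interdependencies from pooled global descriptors."""

    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
        )

    def forward(self, x):
        b, c, _, _ = x.shape
        avg = self.mlp(F.adaptive_avg_pool2d(x, 1).flatten(1))
        mx = self.mlp(F.adaptive_max_pool2d(x, 1).flatten(1))
        return x * torch.sigmoid(avg + mx).view(b, c, 1, 1)


class SpatialAttention(nn.Module):
    """Encode spatial importance from channel-pooled feature maps."""

    def __init__(self, kernel_size: int = 7):
        super().__init__()
        self.conv = nn.Conv2d(2, 1, kernel_size, padding=kernel_size // 2)

    def forward(self, x):
        avg = x.mean(dim=1, keepdim=True)
        mx = x.amax(dim=1, keepdim=True)
        return x * torch.sigmoid(self.conv(torch.cat([avg, mx], dim=1)))


class AMM(nn.Module):
    """Attention modulation module (assumed form): channel attention followed
    by spatial attention, rearranging feature importance before CAMs are
    computed."""

    def __init__(self, channels: int):
        super().__init__()
        self.channel = ChannelAttention(channels)
        self.spatial = SpatialAttention()

    def forward(self, x):
        return self.spatial(self.channel(x))


def cross_pseudo_supervision(cam_spotlight, cam_compensation):
    """Symmetric consistency between the two branches' CAMs: each branch is
    pulled toward a detached copy of the other's prediction, acting as the
    semantic-similarity regularization described in the abstract."""
    return (F.mse_loss(cam_spotlight, cam_compensation.detach())
            + F.mse_loss(cam_compensation, cam_spotlight.detach()))
```

A quick smoke test: `AMM(512)(torch.randn(2, 512, 32, 32))` returns a tensor of the same shape, and `cross_pseudo_supervision` returns a scalar for two CAM tensors of matching shape.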
Related papers
- An Information Compensation Framework for Zero-Shot Skeleton-based Action Recognition [49.45660055499103]
Zero-shot human skeleton-based action recognition aims to construct a model that can recognize actions outside the categories seen during training.
Previous research has focused on aligning sequences' visual and semantic spatial distributions.
We introduce a new loss function sampling method to obtain a tight and robust representation.
arXiv Detail & Related papers (2024-06-02T06:53:01Z) - Unleashing Network Potentials for Semantic Scene Completion [50.95486458217653]
This paper proposes a novel SSC framework - Adversarial Modality Modulation Network (AMMNet).
AMMNet introduces two core modules: a cross-modal modulation enabling the interdependence of gradient flows between modalities, and a customized adversarial training scheme leveraging dynamic gradient competition.
Extensive experimental results demonstrate that AMMNet outperforms state-of-the-art SSC methods by a large margin.
arXiv Detail & Related papers (2024-03-12T11:48:49Z) - Auxiliary Tasks Enhanced Dual-affinity Learning for Weakly Supervised
Semantic Segmentation [79.05949524349005]
We propose AuxSegNet+, a weakly supervised auxiliary learning framework to explore the rich information from saliency maps.
We also propose a cross-task affinity learning mechanism to learn pixel-level affinities from the saliency and segmentation feature maps.
arXiv Detail & Related papers (2024-03-02T10:03:21Z) - USER: Unified Semantic Enhancement with Momentum Contrast for Image-Text
Retrieval [115.28586222748478]
Image-Text Retrieval (ITR) aims at searching for the target instances that are semantically relevant to the given query from the other modality.
Existing approaches typically suffer from two major limitations.
arXiv Detail & Related papers (2023-01-17T12:42:58Z) - Cross-modal Consensus Network for Weakly Supervised Temporal Action
Localization [74.34699679568818]
Weakly supervised temporal action localization (WS-TAL) is a challenging task that aims to localize action instances in the given video with video-level categorical supervision.
We propose a cross-modal consensus network (CO2-Net) to tackle this problem.
arXiv Detail & Related papers (2021-07-27T04:21:01Z) - Weakly supervised segmentation with cross-modality equivariant
constraints [7.757293476741071]
Weakly supervised learning has emerged as an appealing alternative to alleviate the need for large labeled datasets in semantic segmentation.
We present a novel learning strategy that leverages self-supervision in a multi-modal image scenario to significantly enhance original CAMs.
Our approach outperforms relevant recent literature under the same learning conditions.
arXiv Detail & Related papers (2021-04-06T13:14:20Z) - Unsupervised Domain Adaptation in Semantic Segmentation via Orthogonal
and Clustered Embeddings [25.137859989323537]
We propose an effective Unsupervised Domain Adaptation (UDA) strategy, based on a feature clustering method.
We introduce two novel learning objectives to enhance the discriminative clustering performance.
arXiv Detail & Related papers (2020-11-25T10:06:22Z) - Self-supervised Equivariant Attention Mechanism for Weakly Supervised
Semantic Segmentation [93.83369981759996]
We propose a self-supervised equivariant attention mechanism (SEAM) to discover additional supervision and narrow the gap.
Our method is based on the observation that equivariance is an implicit constraint in fully supervised semantic segmentation.
We propose consistency regularization on predicted CAMs from various transformed images to provide self-supervision for network learning.
arXiv Detail & Related papers (2020-04-09T14:57:57Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the content (including all information) and is not responsible for any consequences.