Background Activation Suppression for Weakly Supervised Object
Localization and Semantic Segmentation
- URL: http://arxiv.org/abs/2309.12943v1
- Date: Fri, 22 Sep 2023 15:44:10 GMT
- Title: Background Activation Suppression for Weakly Supervised Object
Localization and Semantic Segmentation
- Authors: Wei Zhai, Pingyu Wu, Kai Zhu, Yang Cao, Feng Wu, Zheng-Jun Zha
- Abstract summary: Weakly supervised object localization and semantic segmentation aim to localize objects using only image-level labels.
A new paradigm has emerged that generates a foreground prediction map to achieve pixel-level localization.
This paper presents two astonishing experimental observations on the object localization learning process.
- Score: 84.62067728093358
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Weakly supervised object localization and semantic segmentation aim to
localize objects using only image-level labels. Recently, a new paradigm has
emerged by generating a foreground prediction map (FPM) to achieve pixel-level
localization. While existing FPM-based methods use cross-entropy to evaluate
the foreground prediction map and to guide the learning of the generator, this
paper presents two astonishing experimental observations on the object
localization learning process: For a trained network, as the foreground mask
expands, 1) the cross-entropy converges to zero when the foreground mask covers
only part of the object region; 2) the activation value continuously increases
until the foreground mask expands to the object boundary. Therefore, to achieve
a more effective localization performance, we argue for the usage of activation
value to learn more object regions. In this paper, we propose a Background
Activation Suppression (BAS) method. Specifically, an Activation Map Constraint
(AMC) module is designed to facilitate the learning of the generator by suppressing
the background activation value. Meanwhile, by using foreground region guidance
and area constraint, BAS can learn the whole region of the object. In the
inference phase, we consider the prediction maps of different categories
together to obtain the final localization results. Extensive experiments show
that BAS achieves significant and consistent improvement over the baseline
methods on the CUB-200-2011 and ILSVRC datasets. In addition, our method also
achieves state-of-the-art weakly supervised semantic segmentation performance
on the PASCAL VOC 2012 and MS COCO 2014 datasets. Code and models are available
at https://github.com/wpy1999/BAS-Extension.
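The argument in the abstract can be illustrated with a toy sketch. This is not the authors' implementation (their code is in the repository linked above). The function names, the one-row activation map, and the `area_weight` value are all assumptions made for illustration. The sketch scores a foreground mask by the fraction of activation left outside the mask plus an area penalty, so the score is lowest when the mask just covers the object.

```python
# Toy illustration only: names and values are hypothetical, not from the paper's code.

def background_activation_ratio(activation, mask):
    """Fraction of the total activation that falls outside the foreground mask."""
    total = sum(sum(row) for row in activation)
    if total == 0:
        return 0.0
    background = sum(
        a * (1.0 - m)
        for act_row, mask_row in zip(activation, mask)
        for a, m in zip(act_row, mask_row)
    )
    return background / total


def mask_area(mask):
    """Mean mask value, used here as a simple area penalty."""
    count = sum(len(row) for row in mask)
    return sum(sum(row) for row in mask) / count


def toy_suppression_loss(activation, mask, area_weight=0.5):
    """Background activation ratio plus a weighted area term (assumed form)."""
    return background_activation_ratio(activation, mask) + area_weight * mask_area(mask)


# One-row "image": the object occupies the four middle positions.
activation = [[0.0, 0.0, 1.0, 1.0, 1.0, 1.0, 0.0, 0.0]]

partial_mask = [[0.0, 0.0, 0.0, 1.0, 1.0, 0.0, 0.0, 0.0]]    # covers half the object
exact_mask = [[0.0, 0.0, 1.0, 1.0, 1.0, 1.0, 0.0, 0.0]]      # reaches the object boundary
oversized_mask = [[1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0]]  # covers everything

loss_partial = toy_suppression_loss(activation, partial_mask)
loss_exact = toy_suppression_loss(activation, exact_mask)
loss_oversized = toy_suppression_loss(activation, oversized_mask)

print(loss_partial, loss_exact, loss_oversized)
```

In this toy setting the partial mask scores 0.625, the exact mask 0.25, and the oversized mask 0.5. The background term keeps decreasing until the mask reaches the object boundary, and the area term then discourages further expansion.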
Related papers
- Spatial Structure Constraints for Weakly Supervised Semantic
Segmentation [100.0316479167605]
A class activation map (CAM) can only locate the most discriminative part of objects.
We propose spatial structure constraints (SSC) for weakly supervised semantic segmentation to alleviate the unwanted object over-activation of attention expansion.
Our approach achieves 72.7% and 47.0% mIoU on the PASCAL VOC 2012 and COCO datasets, respectively.
arXiv Detail & Related papers (2024-01-20T05:25:25Z)
- Weakly-supervised Contrastive Learning for Unsupervised Object Discovery [52.696041556640516]
Unsupervised object discovery is promising due to its ability to discover objects in a generic manner.
We design a semantic-guided self-supervised learning model to extract high-level semantic features from images.
We introduce Principal Component Analysis (PCA) to localize object regions.
arXiv Detail & Related papers (2023-07-07T04:03:48Z)
- MOST: Multiple Object localization with Self-supervised Transformers for
object discovery [97.47075050779085]
We present Multiple Object localization with Self-supervised Transformers (MOST).
MOST uses features of transformers trained using self-supervised learning to localize multiple objects in real world images.
We show that MOST can be used for self-supervised pre-training of object detectors, and that it yields consistent improvements on fully supervised and semi-supervised object detection and on unsupervised region proposal generation.
arXiv Detail & Related papers (2023-04-11T17:57:27Z)
- CREAM: Weakly Supervised Object Localization via Class RE-Activation
Mapping [18.67907876709536]
Class RE-Activation Mapping (CREAM) is a clustering-based approach to boost the activation values of the integral object regions.
CREAM achieves state-of-the-art performance on the CUB, ILSVRC and OpenImages benchmark datasets.
arXiv Detail & Related papers (2022-05-27T11:57:41Z)
- Background Activation Suppression for Weakly Supervised Object
Localization [11.31345656299108]
We argue for using activation value to achieve more efficient learning.
In this paper, we propose a Background Activation Suppression (BAS) method.
BAS achieves significant and consistent improvement over the baseline methods on the CUB-200-2011 and ILSVRC datasets.
arXiv Detail & Related papers (2021-12-01T15:53:40Z)
- Online Refinement of Low-level Feature Based Activation Map for Weakly
Supervised Object Localization [15.665479740413229]
We present a two-stage learning framework for weakly supervised object localization (WSOL).
In the first stage, an activation map generator produces activation maps based on the low-level feature maps in the classifier.
In the second stage, we employ an evaluator to evaluate the activation maps predicted by the activation map generator.
Based on the low-level object information preserved in the first stage, the second-stage model gradually generates a well-separated, complete, and compact activation map of the object in the image.
arXiv Detail & Related papers (2021-10-12T05:09:21Z)
- Unveiling the Potential of Structure-Preserving for Weakly Supervised
Object Localization [71.79436685992128]
We propose a two-stage approach, termed structure-preserving activation (SPA), towards fully leveraging the structure information incorporated in convolutional features for WSOL.
In the first stage, a restricted activation module (RAM) is designed to alleviate the structure-missing issue caused by the classification network.
In the second stage, we propose a post-process approach, termed self-correlation map generating (SCG) module to obtain structure-preserving localization maps.
arXiv Detail & Related papers (2021-03-08T03:04:14Z)
- Local Context Attention for Salient Object Segmentation [5.542044768017415]
We propose a novel Local Context Attention Network (LCANet) to generate locally reinforced feature maps in a uniform representational architecture.
The proposed network introduces an Attentional Correlation Filter (ACF) module to generate explicit local attention by calculating the correlation feature map between coarse prediction and global context.
Comprehensive experiments are conducted on several salient object segmentation datasets, demonstrating the superior performance of the proposed LCANet against the state-of-the-art methods.
arXiv Detail & Related papers (2020-09-24T09:20:06Z)
- Rethinking Localization Map: Towards Accurate Object Perception with
Self-Enhancement Maps [78.2581910688094]
This work introduces a novel self-enhancement method to harvest accurate object localization maps and object boundaries with only category labels as supervision.
In particular, the proposed Self-Enhancement Maps achieve the state-of-the-art localization accuracy of 54.88% on ILSVRC.
arXiv Detail & Related papers (2020-06-09T12:35:55Z)