Unveiling the Potential of Structure-Preserving for Weakly Supervised
Object Localization
- URL: http://arxiv.org/abs/2103.04523v1
- Date: Mon, 8 Mar 2021 03:04:14 GMT
- Title: Unveiling the Potential of Structure-Preserving for Weakly Supervised
Object Localization
- Authors: Xingjia Pan, Yingguo Gao, Zhiwen Lin, Fan Tang, Weiming Dong, Haolei
Yuan, Feiyue Huang, Changsheng Xu
- Abstract summary: We propose a two-stage approach, termed structure-preserving activation (SPA), towards fully leveraging the structure information incorporated in convolutional features for WSOL.
In the first stage, a restricted activation module (RAM) is designed to alleviate the structure-missing issue caused by the classification network.
In the second stage, we propose a post-processing approach, termed the self-correlation map generating (SCG) module, to obtain structure-preserving localization maps.
- Score: 71.79436685992128
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Weakly supervised object localization remains an open problem due to the
deficiency of finding object extent information using a classification network.
While prior works struggle to localize objects by various spatial
regularization strategies, we argue that how to extract object structural
information from the trained classification network is neglected. In this
paper, we propose a two-stage approach, termed structure-preserving activation
(SPA), towards fully leveraging the structure information incorporated in
convolutional features for WSOL. In the first stage, a restricted activation
module (RAM) is designed to alleviate the structure-missing issue caused by the
classification network, based on the observation that the unbounded
classification map and global average pooling layer drive the network to focus
only on object parts. In the second stage, we propose a post-processing approach,
termed the self-correlation map generating (SCG) module, to obtain
structure-preserving localization maps on the basis of the activation maps
acquired from the first stage. Specifically, we utilize the high-order
self-correlation (HSC) to extract the inherent structural information retained
in the learned model and then aggregate HSC of multiple points for precise
object localization. Extensive experiments on two publicly available benchmarks
including CUB-200-2011 and ILSVRC show that the proposed SPA achieves
substantial and consistent performance gains compared with baseline approaches.
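
The second stage described in the abstract (self-correlation over convolutional features, extended to higher orders and aggregated over several points) can be illustrated roughly as follows. This is a minimal NumPy sketch under stated assumptions, not the authors' implementation: the function names, the use of cosine similarity, the row-normalized matrix powers used as the "high-order" term, and the choice of the top-k activated positions as seed points are all hypothetical choices made for illustration.

```python
import numpy as np


def first_order_self_correlation(features):
    """Pairwise cosine similarity between all spatial positions.

    features: array of shape (C, H, W). Returns an (H*W, H*W) matrix.
    Assumption: cosine similarity, with negative values clipped to zero.
    """
    c, h, w = features.shape
    flat = features.reshape(c, h * w).T                # (HW, C)
    norms = np.linalg.norm(flat, axis=1, keepdims=True)
    unit = flat / np.maximum(norms, 1e-12)             # avoid division by zero
    return np.clip(unit @ unit.T, 0.0, 1.0)


def high_order_self_correlation(sc, order=2):
    """Average of the first `order` powers of the row-normalized similarity.

    Assumption: higher orders are modeled as multi-hop similarity propagation.
    """
    row_norm = sc / np.maximum(sc.sum(axis=1, keepdims=True), 1e-12)
    power = row_norm.copy()
    acc = row_norm.copy()
    for _ in range(order - 1):
        power = power @ row_norm                       # one more hop
        acc += power
    return acc / order


def aggregate_localization_map(features, activation, top_k=5, order=2):
    """Aggregate the correlation maps of the top-k activated positions.

    activation: (H, W) map from the first stage (hypothetical input here).
    Returns an (H, W) map min-max normalized to [0, 1].
    """
    _, h, w = features.shape
    hsc = high_order_self_correlation(
        first_order_self_correlation(features), order
    )
    seeds = np.argsort(activation.ravel())[-top_k:]    # most activated points
    loc = hsc[seeds].mean(axis=0).reshape(h, w)
    return (loc - loc.min()) / (loc.max() - loc.min() + 1e-12)


# Toy usage: the left half of a 4x4 grid shares one feature, the right half another.
features = np.zeros((2, 4, 4))
features[0, :, :2] = 1.0
features[1, :, 2:] = 1.0
activation = np.zeros((4, 4))
activation[0, 0] = 1.0                                 # a single seed on the left
loc_map = aggregate_localization_map(features, activation, top_k=1, order=2)
```

In this toy case a single activated point on the left spreads to every position that shares its feature, which is the intuition behind recovering object extent from a part-level activation.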
Related papers
- Localization and Expansion: A Decoupled Framework for Point Cloud Few-shot Semantic Segmentation [39.7657197805346]
Point cloud few-shot semantic segmentation (PC-FSS) aims to segment targets of novel categories in a given query point cloud with only a few annotated support samples.
We propose a simple yet effective framework in the spirit of Decoupled Localization and Expansion (DLE).
DLE, including a structural localization module (SLM) and a self-expansion module (SEM), enjoys several merits.
arXiv Detail & Related papers (2024-08-25T07:34:32Z)
- Spatial Structure Constraints for Weakly Supervised Semantic Segmentation [100.0316479167605]
A class activation map (CAM) can only locate the most discriminative part of objects.
We propose spatial structure constraints (SSC) for weakly supervised semantic segmentation to alleviate the unwanted object over-activation of attention expansion.
Our approach achieves 72.7% and 47.0% mIoU on the PASCAL VOC 2012 and COCO datasets, respectively.
arXiv Detail & Related papers (2024-01-20T05:25:25Z)
- Weakly Supervised Open-Vocabulary Object Detection [31.605276665964787]
We propose a novel weakly supervised open-vocabulary object detection framework, namely WSOVOD, to extend traditional WSOD.
To achieve this, we explore three vital strategies, including dataset-level feature adaptation, image-level salient object localization, and region-level vision-language alignment.
arXiv Detail & Related papers (2023-12-19T18:59:53Z)
- Background Activation Suppression for Weakly Supervised Object Localization and Semantic Segmentation [84.62067728093358]
Weakly supervised object localization and semantic segmentation aim to localize objects using only image-level labels.
A new paradigm has emerged that generates a foreground prediction map to achieve pixel-level localization.
This paper presents two astonishing experimental observations on the object localization learning process.
arXiv Detail & Related papers (2023-09-22T15:44:10Z)
- Weakly-supervised Contrastive Learning for Unsupervised Object Discovery [52.696041556640516]
Unsupervised object discovery is promising due to its ability to discover objects in a generic manner.
We design a semantic-guided self-supervised learning model to extract high-level semantic features from images.
We introduce Principal Component Analysis (PCA) to localize object regions.
arXiv Detail & Related papers (2023-07-07T04:03:48Z)
- Semi-supervised Domain Adaptive Structure Learning [72.01544419893628]
Semi-supervised domain adaptation (SSDA) is a challenging problem requiring methods to overcome both 1) overfitting towards poorly annotated data and 2) distribution shift across domains.
We introduce an adaptive structure learning method to regularize the cooperation of SSL and DA.
arXiv Detail & Related papers (2021-12-12T06:11:16Z)
- Towards Uncovering the Intrinsic Data Structures for Unsupervised Domain Adaptation using Structurally Regularized Deep Clustering [119.88565565454378]
Unsupervised domain adaptation (UDA) aims to learn classification models that make predictions for unlabeled data on a target domain.
We propose a hybrid model of Structurally Regularized Deep Clustering, which integrates the regularized discriminative clustering of target data with a generative one.
Our proposed H-SRDC outperforms all the existing methods under both the inductive and transductive settings.
arXiv Detail & Related papers (2020-12-08T08:52:00Z)
- Local Context Attention for Salient Object Segmentation [5.542044768017415]
We propose a novel Local Context Attention Network (LCANet) to generate locally reinforcement feature maps in a uniform representational architecture.
The proposed network introduces an Attentional Correlation Filter (ACF) module to generate explicit local attention by calculating the correlation feature map between coarse prediction and global context.
Comprehensive experiments are conducted on several salient object segmentation datasets, demonstrating the superior performance of the proposed LCANet against the state-of-the-art methods.
arXiv Detail & Related papers (2020-09-24T09:20:06Z)