Constrained Sampling for Class-Agnostic Weakly Supervised Object
Localization
- URL: http://arxiv.org/abs/2209.09195v1
- Date: Fri, 9 Sep 2022 19:58:38 GMT
- Title: Constrained Sampling for Class-Agnostic Weakly Supervised Object
Localization
- Authors: Shakeeb Murtaza, Soufiane Belharbi, Marco Pedersoli, Aydin Sarraf,
Eric Granger
- Abstract summary: Self-supervised vision transformers can generate accurate localization maps of the objects in an image.
We propose leveraging the multiple maps generated by the different transformer heads to acquire pseudo-labels for training a weakly-supervised object localization model.
- Score: 10.542859578763068
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Self-supervised vision transformers can generate accurate localization maps
of the objects in an image. However, since they decompose the scene into
multiple maps containing various objects, and they do not rely on any explicit
supervisory signal, they cannot distinguish the object of interest from
other objects, as required in weakly-supervised object localization (WSOL). To
address this issue, we propose leveraging the multiple maps generated by the
different transformer heads to acquire pseudo-labels for training a WSOL model.
In particular, a new discriminative proposals sampling method is introduced
that relies on a pretrained CNN classifier to identify discriminative regions.
Then, foreground and background pixels are sampled from these regions in order
to train a WSOL model for generating activation maps that can accurately
localize objects belonging to a specific class. Empirical results on the
challenging CUB benchmark dataset indicate that our proposed approach can
outperform state-of-the-art methods over a wide range of threshold values. Our
method provides class activation maps with a better coverage of foreground
object regions w.r.t. the background.
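The sampling idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the quantile thresholds, and the random "classifier scores" are all assumptions made for illustration. The sketch picks the head map with the highest score, then samples a few foreground pixels from its strongest activations and a few background pixels from its weakest ones, leaving every other pixel unlabeled.

```python
import numpy as np


def select_discriminative_map(maps, scores):
    """Pick the map with the highest score.

    `scores` is a hypothetical stand-in for the response of a pretrained
    classifier to each candidate map; how it is computed is not specified here.
    """
    return maps[int(np.argmax(scores))]


def sample_pseudo_labels(att_map, n_fg=5, n_bg=5,
                         fg_quantile=0.8, bg_quantile=0.2, rng=None):
    """Sample sparse pixel pseudo-labels from one activation map.

    Returns an array shaped like `att_map` with 1 for sampled foreground
    pixels, 0 for sampled background pixels, and -1 for ignored pixels.
    The quantile cut-offs are illustrative assumptions.
    """
    rng = np.random.default_rng() if rng is None else rng
    flat = att_map.ravel()

    # Candidate pools: strongest activations (foreground), weakest (background).
    fg_pool = np.flatnonzero(flat >= np.quantile(flat, fg_quantile))
    bg_pool = np.flatnonzero(flat <= np.quantile(flat, bg_quantile))

    fg_idx = rng.choice(fg_pool, size=min(n_fg, fg_pool.size), replace=False)
    bg_idx = rng.choice(bg_pool, size=min(n_bg, bg_pool.size), replace=False)

    labels = np.full(flat.shape, -1, dtype=np.int64)
    labels[bg_idx] = 0
    labels[fg_idx] = 1  # foreground wins if the two pools ever overlap
    return labels.reshape(att_map.shape)


# Toy usage: six random 14x14 "head maps" and random stand-in scores.
rng = np.random.default_rng(0)
maps = rng.random((6, 14, 14))
scores = rng.random(6)
best = select_discriminative_map(maps, scores)
pseudo = sample_pseudo_labels(best, rng=rng)
```

In a training loop, a localization model would be supervised only on the pixels labeled 0 or 1, so the noisy middle range of the map does not contribute to the loss.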
Related papers
- DiPS: Discriminative Pseudo-Label Sampling with Self-Supervised
Transformers for Weakly Supervised Object Localization [13.412674368913747]
Discriminative Pseudo-label Sampling (DiPS) is introduced to leverage class-agnostic maps for weakly-supervised object localization.
DiPS relies on a pre-trained classifier to identify the most discriminative regions of each attention map.
It provides a rich pool of diverse and discriminative proposals to cover different parts of the object.
arXiv Detail & Related papers (2023-10-09T22:52:43Z)
- Background Activation Suppression for Weakly Supervised Object
Localization and Semantic Segmentation [84.62067728093358]
Weakly supervised object localization and semantic segmentation aim to localize objects using only image-level labels.
A new paradigm has emerged that generates a foreground prediction map to achieve pixel-level localization.
This paper presents two astonishing experimental observations on the object localization learning process.
arXiv Detail & Related papers (2023-09-22T15:44:10Z)
- Weakly-supervised Contrastive Learning for Unsupervised Object Discovery [52.696041556640516]
Unsupervised object discovery is promising due to its ability to discover objects in a generic manner.
We design a semantic-guided self-supervised learning model to extract high-level semantic features from images.
We introduce Principal Component Analysis (PCA) to localize object regions.
arXiv Detail & Related papers (2023-07-07T04:03:48Z)
- MOST: Multiple Object localization with Self-supervised Transformers for
object discovery [97.47075050779085]
We present Multiple Object localization with Self-supervised Transformers (MOST).
MOST uses features of transformers trained using self-supervised learning to localize multiple objects in real world images.
We show MOST can be used for self-supervised pre-training of object detectors, and yields consistent improvements on fully, semi-supervised object detection and unsupervised region proposal generation.
arXiv Detail & Related papers (2023-04-11T17:57:27Z)
- Discriminative Sampling of Proposals in Self-Supervised Transformers for
Weakly Supervised Object Localization [10.542859578763068]
Self-supervised vision transformers can generate accurate localization maps of the objects in an image.
We propose leveraging the multiple maps generated by the different transformer heads to acquire pseudo-labels for training a weakly-supervised object localization model.
arXiv Detail & Related papers (2022-09-09T18:33:23Z)
- Spatial Likelihood Voting with Self-Knowledge Distillation for Weakly
Supervised Object Detection [54.24966006457756]
We propose a WSOD framework called the Spatial Likelihood Voting with Self-knowledge Distillation Network (SLV-SD Net).
SLV-SD Net converges region proposal localization without bounding box annotations.
Experiments on the PASCAL VOC 2007/2012 and MS-COCO datasets demonstrate the excellent performance of SLV-SD Net.
arXiv Detail & Related papers (2022-04-14T11:56:19Z)
- Weakly Supervised Object Localization as Domain Adaption [19.854125742336688]
Weakly supervised object localization (WSOL) focuses on localizing objects only with the supervision of image-level classification masks.
Most previous WSOL methods follow the classification activation map (CAM) that localizes objects based on the classification structure with the multi-instance learning (MIL) mechanism.
This work provides a novel perspective that models WSOL as a domain adaption (DA) task, where the score estimator trained on the source/image domain is tested on the target/pixel domain to locate objects.
arXiv Detail & Related papers (2022-03-03T13:50:22Z)
- Discovery-and-Selection: Towards Optimal Multiple Instance Learning for
Weakly Supervised Object Detection [86.86602297364826]
We propose a discovery-and-selection approach fused with multiple instance learning (DS-MIL).
Our proposed DS-MIL approach can consistently improve the baselines, reporting state-of-the-art performance.
arXiv Detail & Related papers (2021-10-18T07:06:57Z)
- Rethinking Localization Map: Towards Accurate Object Perception with
Self-Enhancement Maps [78.2581910688094]
This work introduces a novel self-enhancement method to harvest accurate object localization maps and object boundaries with only category labels as supervision.
In particular, the proposed Self-Enhancement Maps achieve the state-of-the-art localization accuracy of 54.88% on ILSVRC.
arXiv Detail & Related papers (2020-06-09T12:35:55Z)
- Rethinking the Route Towards Weakly Supervised Object Localization [28.90792512056726]
We show that weakly supervised object localization should be divided into two parts: class-agnostic object localization and object classification.
For class-agnostic object localization, we should use class-agnostic methods to generate noisy pseudo annotations and then perform bounding box regression on them without class labels.
Our PSOL models have good transferability across different datasets without fine-tuning.
arXiv Detail & Related papers (2020-02-26T08:54:20Z)