Rethinking Localization Map: Towards Accurate Object Perception with Self-Enhancement Maps
- URL: http://arxiv.org/abs/2006.05220v2
- Date: Sat, 13 Jun 2020 04:13:23 GMT
- Title: Rethinking Localization Map: Towards Accurate Object Perception with Self-Enhancement Maps
- Authors: Xiaolin Zhang, Yunchao Wei, Yi Yang, Fei Wu
- Abstract summary: This work introduces a novel self-enhancement method to harvest accurate object localization maps and object boundaries with only category labels as supervision.
In particular, the proposed Self-Enhancement Maps achieve the state-of-the-art localization accuracy of 54.88% on ILSVRC.
- Score: 78.2581910688094
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recently, remarkable progress has been made in weakly supervised object
localization (WSOL) to promote object localization maps. The common practice of
evaluating these maps is indirect and coarse: tight bounding boxes are fitted to
cover the high-activation regions, and intersection-over-union (IoU) scores are
computed between the predicted and ground-truth boxes. This measurement reflects
the quality of localization maps to some extent, but we argue that the maps
should be evaluated directly and precisely, i.e., by comparing them with the
ground-truth object masks pixel-wise. To
fulfill the direct evaluation, we annotate pixel-level object masks on the
ILSVRC validation set. We propose to use IoU-Threshold curves for evaluating
the real quality of localization maps. Beyond the amended evaluation metric and
annotated object masks, this work also introduces a novel self-enhancement
method to harvest accurate object localization maps and object boundaries with
only category labels as supervision. We propose a two-stage approach to
generate the localization maps by simply comparing the similarity of point-wise
features between high-activation pixels and the remaining pixels. Based on the
predicted localization maps, we explore estimating object boundaries on a very
large dataset. A hard-negative suppression loss is proposed to obtain fine
boundaries. We conduct extensive experiments on the ILSVRC and CUB benchmarks.
In particular, the proposed Self-Enhancement Maps achieve the state-of-the-art
localization accuracy of 54.88% on ILSVRC. The code and the annotated masks are
released at https://github.com/xiaomengyc/SEM.
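The IoU-Threshold curve described in the abstract is simple to reproduce: binarize a localization map at a sweep of thresholds and compute the pixel-wise IoU against the annotated mask at each threshold. Below is a minimal sketch; the function name, threshold grid, and array conventions are our own assumptions, not the authors' released evaluation code.

```python
import numpy as np

def iou_threshold_curve(loc_map, gt_mask, thresholds=np.linspace(0.0, 1.0, 21)):
    """Trace an IoU-Threshold curve for one image.

    loc_map: float array in [0, 1], shape (H, W) -- the localization map
    gt_mask: binary array, shape (H, W) -- pixel-level object mask
    Returns a list of (threshold, IoU) pairs.
    """
    gt = gt_mask.astype(bool)
    curve = []
    for t in thresholds:
        pred = loc_map >= t                       # binarize the map at threshold t
        union = np.logical_or(pred, gt).sum()
        inter = np.logical_and(pred, gt).sum()
        curve.append((float(t), float(inter / union) if union > 0 else 0.0))
    return curve
```

Averaging such curves over the validation set gives a direct, pixel-wise picture of map quality, in contrast to the box-level IoU used in standard WSOL evaluation.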
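The abstract describes the second stage of map generation only as comparing point-wise features between high-activation pixels and the rest. One plausible reading is sketched here under our own assumptions (cosine similarity over backbone features, the top-k activations taken as foreground seeds; neither detail is confirmed by the abstract):

```python
import torch
import torch.nn.functional as F

def similarity_enhanced_map(features, cam, seed_ratio=0.1):
    """Sketch: enhance an activation map via point-wise feature similarity.

    features: (C, H, W) point-wise features from a backbone
    cam: (H, W) initial class activation map
    Returns an (H, W) enhanced localization map rescaled to [0, 1].
    """
    C, H, W = features.shape
    feats = F.normalize(features.reshape(C, -1), dim=0)  # unit-norm feature per pixel
    k = max(1, int(seed_ratio * H * W))                  # number of foreground seeds
    seed_idx = cam.reshape(-1).topk(k).indices           # highest-activation pixels
    sim = feats[:, seed_idx].t() @ feats                 # (k, H*W) cosine similarities
    enhanced = sim.max(dim=0).values.reshape(H, W)       # best match to any seed
    lo, hi = enhanced.min(), enhanced.max()
    return (enhanced - lo) / (hi - lo + 1e-8)            # rescale for thresholding
```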
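The hard-negative suppression loss is named but not specified in the abstract. As an illustration only, one common way to realize such a loss is to add a penalty on the strongest boundary responses inside regions believed to contain no boundary; everything below (the names, the extra penalty term, the top_ratio parameter) is our assumption, not the paper's formulation.

```python
import torch
import torch.nn.functional as F

def hard_negative_suppression_loss(boundary_pred, pseudo_label, neg_mask, top_ratio=0.3):
    """Illustrative boundary loss with hard-negative suppression.

    boundary_pred: (N,) predicted boundary probabilities in (0, 1)
    pseudo_label:  (N,) {0, 1} pseudo boundary labels from localization maps
    neg_mask:      (N,) bool, pixels assumed to be boundary-free
    """
    bce = F.binary_cross_entropy(boundary_pred, pseudo_label.float())
    neg_scores = boundary_pred[neg_mask]
    if neg_scores.numel() == 0:
        return bce
    k = max(1, int(top_ratio * neg_scores.numel()))
    hard_negatives = neg_scores.topk(k).values  # strongest false boundary responses
    return bce + hard_negatives.mean()          # push hard negatives toward zero
```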
Related papers
- Background Activation Suppression for Weakly Supervised Object Localization and Semantic Segmentation [84.62067728093358]
Weakly supervised object localization and semantic segmentation aim to localize objects using only image-level labels.
A new paradigm has emerged that generates a foreground prediction map to achieve pixel-level localization.
This paper presents two astonishing experimental observations on the object localization learning process.
arXiv Detail & Related papers (2023-09-22T15:44:10Z)
- EgoVM: Achieving Precise Ego-Localization using Lightweight Vectorized Maps [9.450650025266379]
We present EgoVM, an end-to-end localization network that achieves comparable localization accuracy to prior state-of-the-art methods.
We employ a set of learnable semantic embeddings to encode the semantic types of map elements and supervise them with semantic segmentation.
We adopt a robust histogram-based pose solver to estimate the optimal pose by searching exhaustively over candidate poses.
arXiv Detail & Related papers (2023-07-18T06:07:25Z)
- Online Map Vectorization for Autonomous Driving: A Rasterization Perspective [58.71769343511168]
We introduce a new rasterization-based evaluation metric, which has superior sensitivity and is better suited to real-world autonomous driving scenarios.
We also propose MapVR (Map Vectorization via Rasterization), a novel framework that applies differentiable rasterization to vectorized outputs and then performs precise, geometry-aware supervision on rasterized HD maps.
arXiv Detail & Related papers (2023-06-18T08:51:14Z)
- Constrained Sampling for Class-Agnostic Weakly Supervised Object Localization [10.542859578763068]
Self-supervised vision transformers can generate accurate localization maps of the objects in an image.
We propose leveraging the multiple maps generated by the different transformer heads to acquire pseudo-labels for training a weakly-supervised object localization model.
arXiv Detail & Related papers (2022-09-09T19:58:38Z)
- Background Activation Suppression for Weakly Supervised Object Localization [11.31345656299108]
We argue for using activation values to achieve more efficient learning.
In this paper, we propose a Background Activation Suppression (BAS) method.
BAS achieves significant and consistent improvement over the baseline methods on the CUB-200-2011 and ILSVRC datasets.
arXiv Detail & Related papers (2021-12-01T15:53:40Z)
- CAMERAS: Enhanced Resolution And Sanity preserving Class Activation Mapping for image saliency [61.40511574314069]
Backpropagation image saliency aims at explaining model predictions by estimating model-centric importance of individual pixels in the input.
We propose CAMERAS, a technique to compute high-fidelity backpropagation saliency maps without requiring any external priors.
arXiv Detail & Related papers (2021-06-20T08:20:56Z)
- A Self-Training Approach for Point-Supervised Object Detection and Counting in Crowds [54.73161039445703]
We propose a novel self-training approach that enables a typical object detector to be trained with only point-level annotations.
During training, we utilize the available point annotations to supervise the estimation of the center points of objects.
Experimental results show that our approach significantly outperforms state-of-the-art point-supervised methods under both detection and counting tasks.
arXiv Detail & Related papers (2020-07-25T02:14:42Z)
- Weakly-Supervised Salient Object Detection via Scribble Annotations [54.40518383782725]
We propose a weakly-supervised salient object detection model to learn saliency from scribble labels.
We present a new metric, termed saliency structure measure, to measure the structure alignment of the predicted saliency maps.
Our method not only outperforms existing weakly-supervised/unsupervised methods, but is also on par with several fully-supervised state-of-the-art models.
arXiv Detail & Related papers (2020-03-17T12:59:50Z)