Hierarchical Complementary Learning for Weakly Supervised Object
Localization
- URL: http://arxiv.org/abs/2011.08014v1
- Date: Mon, 16 Nov 2020 14:58:51 GMT
- Title: Hierarchical Complementary Learning for Weakly Supervised Object
Localization
- Authors: Sabrina Narimene Benassou, Wuzhen Shi, Feng Jiang, Abdallah Benzine
- Abstract summary: Weakly supervised object localization (WSOL) is a challenging problem which aims to localize objects with only image-level labels.
This paper proposes a Hierarchical Complementary Learning Network method (HCLNet) that helps the CNN to perform better classification and localization of objects on the images.
- Score: 12.104019927107517
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Weakly supervised object localization (WSOL) is a challenging problem which
aims to localize objects with only image-level labels. Due to the lack of
ground truth bounding boxes, class labels are mainly employed to train the
model. This model generates a class activation map (CAM) that highlights the
most discriminative features. However, the main drawback of CAM is that it
detects only part of the object. To solve this problem, some researchers have
removed parts from the detected object \cite{b1, b2, b4}, or the image
\cite{b3}. The aim of removing parts of the image or of the detected object
is to force the model to detect the other features. However, these methods
require one or many hyper-parameters to erase the appropriate pixels on the
image, which could involve a loss of information. In contrast, this paper
proposes a Hierarchical Complementary Learning Network method (HCLNet) that
helps the CNN to perform better classification and localization of objects on
the images. HCLNet uses a complementary map to force the network to detect the
other parts of the object. Unlike previous works, this method does not need any
extra hyper-parameters to generate different CAMs, nor does it introduce a
large loss of information. To fuse these different maps, two fusion strategies
are used: the addition strategy and the l1-norm strategy. These strategies
allow the whole object to be detected while excluding the background.
Extensive experiments show that HCLNet obtains
better performance than state-of-the-art methods.
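The abstract names two ways of fusing several activation maps but does not define them. The sketch below is only one plausible reading, not the authors' implementation: the addition strategy is assumed to be an element-wise sum, and the l1-norm strategy is assumed to weight each map per pixel by its share of the total absolute activation at that pixel. The function names (`normalize`, `fuse_addition`, `fuse_l1_norm`) and the final min-max rescaling are hypothetical choices made for illustration.

```python
from typing import List

# A 2-D activation map stored as rows of floats (hypothetical representation).
ActivationMap = List[List[float]]


def normalize(m: ActivationMap) -> ActivationMap:
    """Min-max scale a map to [0, 1]; a constant map becomes all zeros."""
    flat = [v for row in m for v in row]
    lo, hi = min(flat), max(flat)
    span = hi - lo
    if span == 0:
        return [[0.0 for _ in row] for row in m]
    return [[(v - lo) / span for v in row] for row in m]


def fuse_addition(maps: List[ActivationMap]) -> ActivationMap:
    """Assumed addition strategy: element-wise sum of all maps, then rescale."""
    rows, cols = len(maps[0]), len(maps[0][0])
    summed = [
        [sum(m[r][c] for m in maps) for c in range(cols)]
        for r in range(rows)
    ]
    return normalize(summed)


def fuse_l1_norm(maps: List[ActivationMap]) -> ActivationMap:
    """Assumed l1-norm strategy: per-pixel weights proportional to |activation|."""
    rows, cols = len(maps[0]), len(maps[0][0])
    fused = []
    for r in range(rows):
        fused_row = []
        for c in range(cols):
            values = [m[r][c] for m in maps]
            total = sum(abs(v) for v in values)
            if total == 0:
                # No map responds at this pixel, so the fused response is zero.
                fused_row.append(0.0)
            else:
                # Each map contributes in proportion to its own absolute response.
                fused_row.append(sum(abs(v) / total * v for v in values))
        fused.append(fused_row)
    return normalize(fused)


if __name__ == "__main__":
    # Two toy maps that each respond to a different part of the same object.
    cam = [[1.0, 0.0], [0.0, 0.0]]
    complementary_cam = [[0.0, 1.0], [0.0, 0.0]]
    print(fuse_addition([cam, complementary_cam]))
    print(fuse_l1_norm([cam, complementary_cam]))
```

In this toy setting both strategies produce a fused map that covers both object parts while leaving the bottom (background) row at zero. When maps overlap, the assumed l1-norm weighting gives more influence to whichever map responds more strongly at a given pixel.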
Related papers
- PDiscoNet: Semantically consistent part discovery for fine-grained
recognition [62.12602920807109]
We propose PDiscoNet to discover object parts by using only image-level class labels along with priors encouraging the parts to be.
Our results on CUB, CelebA, and PartImageNet show that the proposed method provides substantially better part discovery performance than previous methods.
arXiv Detail & Related papers (2023-09-06T17:19:29Z) - Rethinking the Localization in Weakly Supervised Object Localization [51.29084037301646]
Weakly supervised object localization (WSOL) is one of the most popular and challenging tasks in computer vision.
Recently, dividing WSOL into two parts (class-agnostic object localization and object classification) has become the state-of-the-art pipeline for this task.
We propose to replace SCR with a binary-class detector (BCD) for localizing multiple objects, where the detector is trained by discriminating the foreground and background.
arXiv Detail & Related papers (2023-08-11T14:38:51Z) - Weakly-supervised Contrastive Learning for Unsupervised Object Discovery [52.696041556640516]
Unsupervised object discovery is promising due to its ability to discover objects in a generic manner.
We design a semantic-guided self-supervised learning model to extract high-level semantic features from images.
We introduce Principal Component Analysis (PCA) to localize object regions.
arXiv Detail & Related papers (2023-07-07T04:03:48Z) - De-coupling and De-positioning Dense Self-supervised Learning [65.56679416475943]
Dense Self-Supervised Learning (SSL) methods address the limitations of using image-level feature representations when handling images with multiple objects.
We show that they suffer from coupling and positional bias, which arise from the receptive field increasing with layer depth and zero-padding.
We demonstrate the benefits of our method on COCO and on a new challenging benchmark, OpenImage-MINI, for object classification, semantic segmentation, and object detection.
arXiv Detail & Related papers (2023-03-29T18:07:25Z) - Image Segmentation-based Unsupervised Multiple Objects Discovery [1.7674345486888503]
Unsupervised object discovery aims to localize objects in images.
We propose a fully unsupervised, bottom-up approach, for multiple objects discovery.
We provide state-of-the-art results for both unsupervised class-agnostic object detection and unsupervised image segmentation.
arXiv Detail & Related papers (2022-12-20T09:48:24Z) - Contrastive learning of Class-agnostic Activation Map for Weakly
Supervised Object Localization and Semantic Segmentation [32.76127086403596]
We propose Contrastive learning for Class-agnostic Activation Map (C$^2$AM) generation using unlabeled image data.
We form the positive and negative pairs based on the above relations and force the network to disentangle foreground and background.
As the network is guided to discriminate cross-image foreground-background, the class-agnostic activation maps learned by our approach generate more complete object regions.
arXiv Detail & Related papers (2022-03-25T08:46:24Z) - Multi-patch Feature Pyramid Network for Weakly Supervised Object
Detection in Optical Remote Sensing Images [39.25541709228373]
We propose a new architecture for object detection with a multiple patch feature pyramid network (MPFP-Net)
MPFP-Net is different from the current models that during training only pursue the most discriminative patches.
We introduce an effective method to regularize the residual values and make the fusion transition layers strictly norm-preserving.
arXiv Detail & Related papers (2021-08-18T09:25:39Z) - You Better Look Twice: a new perspective for designing accurate
detectors with reduced computations [56.34005280792013]
BLT-net is a new low-computation two-stage object detection architecture.
It reduces computations by separating objects from background using a very lite first-stage.
Resulting image proposals are then processed in the second-stage by a highly accurate model.
arXiv Detail & Related papers (2021-07-21T12:39:51Z) - DetCo: Unsupervised Contrastive Learning for Object Detection [64.22416613061888]
Unsupervised contrastive learning achieves great success in learning image representations with CNN.
We present a novel contrastive learning approach, named DetCo, which fully explores the contrasts between global image and local image patches.
DetCo consistently outperforms supervised method by 1.6/1.2/1.0 AP on Mask RCNN-C4/FPN/RetinaNet with 1x schedule.
arXiv Detail & Related papers (2021-02-09T12:47:20Z) - Addressing Visual Search in Open and Closed Set Settings [8.928169373673777]
We present a method for predicting pixel-level objectness from a low resolution gist image.
We then use it to select regions for performing object detection locally at high resolution.
Second, we propose a novel strategy for open-set visual search that seeks to find all instances of a target class which may be previously unseen.
arXiv Detail & Related papers (2020-12-11T17:21:28Z) - Entropy Guided Adversarial Model for Weakly Supervised Object
Localization [11.77745060973134]
We propose to apply the Shannon entropy on the CAMs generated by the network to guide it during training.
Our method does not erase any part of the image, nor does it change the network architecture.
Our Entropy Guided Adversarial model (EGA model) improved performance on state-of-the-art benchmarks for both localization and classification accuracy.
arXiv Detail & Related papers (2020-08-04T19:39:12Z)