Exploring Bottom-up and Top-down Cues with Attentive Learning for Webly
Supervised Object Detection
- URL: http://arxiv.org/abs/2003.09790v1
- Date: Sun, 22 Mar 2020 03:11:24 GMT
- Title: Exploring Bottom-up and Top-down Cues with Attentive Learning for Webly
Supervised Object Detection
- Authors: Zhonghua Wu and Qingyi Tao and Guosheng Lin and Jianfei Cai
- Abstract summary: We propose a novel webly supervised object detection (WebSOD) method for novel classes.
Our proposed method combines bottom-up and top-down cues for novel class detection.
We demonstrate our proposed method on the PASCAL VOC dataset with three different novel/base splits.
- Score: 76.9756607002489
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Fully supervised object detection has achieved great success in recent years.
However, abundant bounding box annotations are needed to train a detector
for novel classes. To reduce the human labeling effort, we propose a novel
webly supervised object detection (WebSOD) method for novel classes, which only
requires web images without further annotations. Our proposed method
combines bottom-up and top-down cues for novel class detection. Within our
approach, we introduce a bottom-up mechanism based on a well-trained fully
supervised object detector (i.e., Faster R-CNN) as an object region estimator for
web images, recognizing the common objectness shared by base and novel
classes. With the estimated regions on the web images, we then utilize the
top-down attention cues as the guidance for region classification. Furthermore,
we propose a residual feature refinement (RFR) block to tackle the domain
mismatch between the web domain and the target domain. We demonstrate our proposed
method on the PASCAL VOC dataset with three different novel/base splits. Without
any target-domain novel-class images and annotations, our proposed webly
supervised object detection model is able to achieve promising performance for
novel classes. Moreover, we also conduct transfer learning experiments on the
large-scale ILSVRC 2013 detection dataset and achieve state-of-the-art performance.
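To make the three ingredients above concrete, here is a minimal PyTorch-style sketch of the pipeline as described in the abstract: a base-class detector supplies bottom-up region estimates on web images, image-level attention scores act as the top-down cue for region classification, and a residual block refines web-domain RoI features. The class and function names, the torchvision Faster R-CNN stand-in, and the feature dimensions are illustrative assumptions, not the authors' implementation.
```python
import torch
import torch.nn as nn
import torchvision


class ResidualFeatureRefinement(nn.Module):
    """Hypothetical RFR block: adds a learned residual correction to web-domain
    RoI features so they better match the target domain (illustrative only)."""

    def __init__(self, feat_dim: int = 1024):
        super().__init__()
        self.refine = nn.Sequential(
            nn.Linear(feat_dim, feat_dim),
            nn.ReLU(inplace=True),
            nn.Linear(feat_dim, feat_dim),
        )

    def forward(self, roi_feats: torch.Tensor) -> torch.Tensor:
        # Residual connection: refined feature = original + learned correction.
        return roi_feats + self.refine(roi_feats)


def estimate_regions_bottom_up(detector, web_image, score_thresh=0.5):
    """Bottom-up cue: run a detector trained on base classes over a web image
    and keep its confident boxes as class-agnostic object region estimates."""
    detector.eval()
    with torch.no_grad():
        pred = detector([web_image])[0]  # dict with 'boxes', 'scores', 'labels'
    keep = pred["scores"] > score_thresh
    return pred["boxes"][keep]  # treat the kept boxes as objectness proposals


def classify_regions_top_down(roi_feats, attention_scores, classifier):
    """Top-down cue: weight each region's features by an image-level attention
    score for the novel class before classifying the region."""
    return classifier(roi_feats * attention_scores.unsqueeze(-1))


if __name__ == "__main__":
    # Stand-in for the "well-trained fully supervised detector"; in practice
    # this would be a Faster R-CNN checkpoint trained on the base classes.
    detector = torchvision.models.detection.fasterrcnn_resnet50_fpn(weights=None)
    web_image = torch.rand(3, 480, 640)  # placeholder "web image"
    boxes = estimate_regions_bottom_up(detector, web_image)

    rfr = ResidualFeatureRefinement(feat_dim=1024)
    roi_feats = torch.rand(boxes.shape[0], 1024)     # placeholder RoI features
    novel_classifier = nn.Linear(1024, 5)            # e.g. 5 novel classes
    attention_scores = torch.rand(boxes.shape[0])    # placeholder attention cues
    scores = classify_regions_top_down(rfr(roi_feats), attention_scores, novel_classifier)
    print(boxes.shape, scores.shape)
```
The residual design keeps the original RoI features intact and only has to learn a small domain correction on top of them, which is the usual rationale for residual refinement blocks.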
Related papers
- Exploiting Unlabeled Data with Multiple Expert Teachers for Open Vocabulary Aerial Object Detection and Its Orientation Adaptation [58.37525311718006]
We put forth a novel formulation of the aerial object detection problem, namely open-vocabulary aerial object detection (OVAD).
We propose CastDet, a CLIP-activated student-teacher detection framework that serves as the first OVAD detector specifically designed for the challenging aerial scenario.
Our framework integrates a robust localization teacher along with several box selection strategies to generate high-quality proposals for novel objects.
arXiv Detail & Related papers (2024-11-04T12:59:13Z)
- Few-shot Object Detection in Remote Sensing: Lifting the Curse of Incompletely Annotated Novel Objects [23.171410277239534]
We propose a self-training-based FSOD (ST-FSOD) approach to object detection.
Our proposed method outperforms the state-of-the-art in various FSOD settings by a large margin.
arXiv Detail & Related papers (2023-09-19T13:00:25Z)
- Improved Region Proposal Network for Enhanced Few-Shot Object Detection [23.871860648919593]
Few-shot object detection (FSOD) methods have emerged as a solution to the limitations of classic object detection approaches.
We develop a semi-supervised algorithm to detect and then utilize unlabeled novel objects as positive samples during the FSOD training stage.
Our improved hierarchical sampling strategy for the region proposal network (RPN) also boosts the perception of the object detection model for large objects.
arXiv Detail & Related papers (2023-08-15T02:35:59Z)
- Weakly-supervised Contrastive Learning for Unsupervised Object Discovery [52.696041556640516]
Unsupervised object discovery is promising due to its ability to discover objects in a generic manner.
We design a semantic-guided self-supervised learning model to extract high-level semantic features from images.
We introduce Principal Component Analysis (PCA) to localize object regions (see the sketch after this list).
arXiv Detail & Related papers (2023-07-07T04:03:48Z)
- Identification of Novel Classes for Improving Few-Shot Object Detection [12.013345715187285]
Few-shot object detection (FSOD) methods offer a remedy by realizing robust object detection using only a few training samples per class.
We develop a semi-supervised algorithm to detect and then utilize unlabeled novel objects as positive samples during training to improve FSOD performance.
Our experimental results indicate that our method is effective and outperforms the existing state-of-the-art (SOTA) FSOD methods.
arXiv Detail & Related papers (2023-03-18T14:12:52Z)
- Incremental-DETR: Incremental Few-Shot Object Detection via Self-Supervised Learning [60.64535309016623]
We propose the Incremental-DETR that does incremental few-shot object detection via fine-tuning and self-supervised learning on the DETR object detector.
To alleviate severe over-fitting with few novel class data, we first fine-tune the class-specific components of DETR with self-supervision.
We further introduce an incremental few-shot fine-tuning strategy with knowledge distillation on the class-specific components of DETR to encourage the network to detect novel classes without catastrophic forgetting.
arXiv Detail & Related papers (2022-05-09T05:08:08Z)
- Experience feedback using Representation Learning for Few-Shot Object Detection on Aerial Images [2.8560476609689185]
The performance of our method is assessed on DOTA, a large-scale remote sensing images dataset.
It highlights in particular some intrinsic weaknesses for the few-shot object detection task.
arXiv Detail & Related papers (2021-09-27T13:04:53Z)
- Dynamic Relevance Learning for Few-Shot Object Detection [6.550840743803705]
We propose a dynamic relevance learning model, which utilizes the relationship between all support images and Regions of Interest (RoIs) on the query images to construct a dynamic graph convolutional network (GCN).
The proposed model achieves the best overall performance, which shows its effectiveness of learning more generalized features.
arXiv Detail & Related papers (2021-08-04T18:29:42Z)
- Instance Localization for Self-supervised Detection Pretraining [68.24102560821623]
We propose a new self-supervised pretext task, called instance localization.
We show that integration of bounding boxes into pretraining promotes better task alignment and architecture alignment for transfer learning.
Experimental results demonstrate that our approach yields state-of-the-art transfer learning results for object detection.
arXiv Detail & Related papers (2021-02-16T17:58:57Z)
- StarNet: towards Weakly Supervised Few-Shot Object Detection [87.80771067891418]
We introduce StarNet - a few-shot model featuring an end-to-end differentiable non-parametric star-model detection and classification head.
Through this head, the backbone is meta-trained using only image-level labels to produce good features for jointly localizing and classifying previously unseen categories of few-shot test tasks.
Being a few-shot detector, StarNet does not require any bounding box annotations, either during pre-training or for novel-class adaptation.
arXiv Detail & Related papers (2020-03-15T11:35:28Z)
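As referenced in the unsupervised object discovery entry above, PCA can be used to localize object regions from deep features. The sketch below is a minimal, hypothetical illustration of that idea: the first principal component of the spatial feature vectors is thresholded to obtain a coarse foreground mask and a bounding box. The function name, the sign/threshold heuristic, and the box extraction are assumptions made for the example, not the paper's actual procedure.
```python
import numpy as np


def pca_localize(feature_map: np.ndarray, threshold: float = 0.0):
    """Hypothetical PCA-based localization.

    feature_map: (C, H, W) deep features for one image.
    Returns a boolean foreground mask of shape (H, W) and an (x0, y0, x1, y1)
    box in feature-map coordinates (or None if the mask is empty).
    """
    c, h, w = feature_map.shape
    feats = feature_map.reshape(c, h * w).T            # (H*W, C) spatial vectors
    feats = feats - feats.mean(axis=0, keepdims=True)  # center before PCA
    # First principal component via SVD of the centered feature matrix.
    _, _, vt = np.linalg.svd(feats, full_matrices=False)
    projection = feats @ vt[0]                         # projection onto PC1, shape (H*W,)
    mask = (projection > threshold).reshape(h, w)
    # Heuristic: assume the foreground is the smaller of the two regions.
    if mask.mean() > 0.5:
        mask = ~mask
    ys, xs = np.nonzero(mask)
    if len(xs) == 0:
        return mask, None
    return mask, (xs.min(), ys.min(), xs.max() + 1, ys.max() + 1)


if __name__ == "__main__":
    fake_features = np.random.rand(256, 14, 14).astype(np.float32)  # placeholder CNN features
    mask, box = pca_localize(fake_features)
    print(mask.shape, box)
```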
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.