DAP: Detection-Aware Pre-training with Weak Supervision
- URL: http://arxiv.org/abs/2103.16651v1
- Date: Tue, 30 Mar 2021 19:48:30 GMT
- Title: DAP: Detection-Aware Pre-training with Weak Supervision
- Authors: Yuanyi Zhong, Jianfeng Wang, Lijuan Wang, Jian Peng, Yu-Xiong Wang,
Lei Zhang
- Abstract summary: This paper presents a detection-aware pre-training (DAP) approach for object detection tasks.
We transform a classification dataset into a detection dataset through a weakly supervised object localization method based on Class Activation Maps.
We show that DAP can outperform the traditional classification pre-training in terms of both sample efficiency and convergence speed in downstream detection tasks including VOC and COCO.
- Score: 37.336674323981285
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper presents a detection-aware pre-training (DAP) approach, which
leverages only weakly-labeled classification-style datasets (e.g., ImageNet)
for pre-training, but is specifically tailored to benefit object detection
tasks. In contrast to the widely used image classification-based pre-training
(e.g., on ImageNet), which does not include any location-related training
tasks, we transform a classification dataset into a detection dataset through a
weakly supervised object localization method based on Class Activation Maps to
directly pre-train a detector, making the pre-trained model location-aware and
capable of predicting bounding boxes. We show that DAP can outperform the
traditional classification pre-training in terms of both sample efficiency and
convergence speed in downstream detection tasks including VOC and COCO. In
particular, DAP boosts the detection accuracy by a large margin when the number
of examples in the downstream task is small.
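A minimal sketch of the CAM-based conversion from image-level labels to pseudo boxes is given below. It is not the authors' exact DAP pipeline: the torchvision ResNet-50 backbone, the max-min normalization, the 0.35 activation threshold, and the one-box-per-class heuristic are illustrative assumptions.

# Sketch: derive a pseudo bounding box from an image-level label via a
# Class Activation Map (CAM). Backbone, threshold, and box heuristic are
# assumptions for illustration, not the paper's exact recipe.
import torch
import torch.nn.functional as F
from torchvision import models

model = models.resnet50(weights=models.ResNet50_Weights.DEFAULT).eval()
backbone = torch.nn.Sequential(*list(model.children())[:-2])  # keep layers up to the last conv block

def class_activation_map(image, class_idx):
    """CAM for `class_idx`; `image` is a normalized (3, H, W) tensor."""
    with torch.no_grad():
        feats = backbone(image.unsqueeze(0))          # (1, C, h, w) conv features
        weights = model.fc.weight[class_idx]          # (C,) classifier weights for the class
        cam = torch.einsum("c,chw->hw", weights, feats[0])
        cam = F.interpolate(cam[None, None], size=image.shape[-2:],
                            mode="bilinear", align_corners=False)[0, 0]
        return (cam - cam.min()) / (cam.max() - cam.min() + 1e-8)  # (H, W), scaled to [0, 1]

def cam_to_pseudo_box(cam, threshold=0.35):
    """Threshold the CAM and return the tight box (x1, y1, x2, y2) around it."""
    ys, xs = torch.nonzero(cam >= threshold, as_tuple=True)
    if len(xs) == 0:                                  # no activation: fall back to the full image
        return 0, 0, cam.shape[1] - 1, cam.shape[0] - 1
    return xs.min().item(), ys.min().item(), xs.max().item(), ys.max().item()

Boxes produced this way for each image-level label can then serve as pseudo ground truth, so that a standard detector, rather than an image classifier, is pre-trained on the classification data before fine-tuning on VOC or COCO.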
Related papers
- Aligned Unsupervised Pretraining of Object Detectors with Self-training [41.03780087924593]
Unsupervised pretraining of object detectors has recently become a key component of object detector training.
We propose a framework that mitigates this issue and consists of three simple yet key ingredients.
We show that our strategy is also capable of pretraining from scratch (including the backbone) and works on complex images like COCO.
arXiv Detail & Related papers (2023-07-28T17:46:00Z)
- AlignDet: Aligning Pre-training and Fine-tuning in Object Detection [38.256555424079664]
AlignDet is a unified pre-training framework that can be adapted to various existing detectors to alleviate the discrepancies between pre-training and fine-tuning.
It can achieve significant improvements across diverse protocols, such as detection algorithm, model backbone, data setting, and training schedule.
arXiv Detail & Related papers (2023-07-20T17:55:14Z)
- Label-Efficient Object Detection via Region Proposal Network Pre-Training [58.50615557874024]
We propose a simple pretext task that provides effective pre-training for the region proposal network (RPN).
Compared with multi-stage detectors without RPN pre-training, our approach consistently improves downstream task performance.
arXiv Detail & Related papers (2022-11-16T16:28:18Z)
- Self-supervised Pretraining with Classification Labels for Temporal Activity Detection [54.366236719520565]
Temporal Activity Detection aims to predict activity classes per frame.
Due to the expensive frame-level annotations required for detection, the scale of detection datasets is limited.
This work proposes a novel self-supervised pretraining method for detection leveraging classification labels.
arXiv Detail & Related papers (2021-11-26T18:59:28Z)
- DETReg: Unsupervised Pretraining with Region Priors for Object Detection [103.93533951746612]
DETReg is a new self-supervised method that pretrains the entire object detection network.
During pretraining, DETReg predicts object localizations to match the localizations from an unsupervised region proposal generator.
It simultaneously aligns the corresponding feature embeddings with embeddings from a self-supervised image encoder (a simplified sketch of this objective appears after this list).
arXiv Detail & Related papers (2021-06-08T17:39:14Z)
- Aligning Pretraining for Detection via Object-Level Contrastive Learning [57.845286545603415]
Image-level contrastive representation learning has proven to be highly effective as a generic model for transfer learning.
We argue that this could be sub-optimal and thus advocate a design principle which encourages alignment between the self-supervised pretext task and the downstream task.
Our method, called Selective Object COntrastive learning (SoCo), achieves state-of-the-art results for transfer performance on COCO detection.
arXiv Detail & Related papers (2021-06-04T17:59:52Z)
- Instance Localization for Self-supervised Detection Pretraining [68.24102560821623]
We propose a new self-supervised pretext task, called instance localization.
We show that integration of bounding boxes into pretraining promotes better task alignment and architecture alignment for transfer learning.
Experimental results demonstrate that our approach yields state-of-the-art transfer learning results for object detection.
arXiv Detail & Related papers (2021-02-16T17:58:57Z)
- Cheaper Pre-training Lunch: An Efficient Paradigm for Object Detection [86.0580214485104]
We propose a general and efficient pre-training paradigm, Montage pre-training, for object detection.
Montage pre-training needs only the target detection dataset while using only 1/4 of the computational resources of the widely adopted ImageNet pre-training.
The efficiency and effectiveness of Montage pre-training are validated by extensive experiments on the MS-COCO dataset.
arXiv Detail & Related papers (2020-04-25T16:09:46Z)
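The DETReg entry above pre-trains a detector by regressing toward boxes from an unsupervised proposal generator while aligning object embeddings with a frozen self-supervised encoder. The sketch below shows one way such an objective could be written; the fixed one-to-one pairing, the L1 box term, the cosine embedding term, and the loss weights are simplifying assumptions, whereas DETReg itself uses DETR-style heads with bipartite matching.

# Sketch of a DETReg-style pretraining objective (simplified: fixed pairing
# instead of Hungarian matching, and hand-picked loss weights).
import torch
import torch.nn.functional as F

def detreg_style_loss(pred_boxes, pred_embeds, proposal_boxes, target_embeds,
                      box_weight=1.0, embed_weight=1.0):
    """pred_boxes     (N, 4): boxes predicted by the detector being pre-trained
       pred_embeds    (N, D): per-object embeddings from the detector
       proposal_boxes (N, 4): boxes from an unsupervised proposal generator
       target_embeds  (N, D): embeddings of the proposal crops from a frozen
                              self-supervised image encoder."""
    box_loss = F.l1_loss(pred_boxes, proposal_boxes)            # localize like the proposals
    embed_loss = 1.0 - F.cosine_similarity(pred_embeds, target_embeds, dim=-1).mean()
    return box_weight * box_loss + embed_weight * embed_loss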
This list is automatically generated from the titles and abstracts of the papers on this site.