AlignDet: Aligning Pre-training and Fine-tuning in Object Detection
- URL: http://arxiv.org/abs/2307.11077v2
- Date: Sun, 13 Aug 2023 15:23:43 GMT
- Title: AlignDet: Aligning Pre-training and Fine-tuning in Object Detection
- Authors: Ming Li, Jie Wu, Xionghui Wang, Chen Chen, Jie Qin, Xuefeng Xiao, Rui
Wang, Min Zheng, Xin Pan
- Abstract summary: AlignDet is a unified pre-training framework that can be adapted to various existing detectors to alleviate the discrepancies.
It can achieve significant improvements across diverse protocols, such as detection algorithm, model backbone, data setting, and training schedule.
- Score: 38.256555424079664
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The paradigm of large-scale pre-training followed by downstream fine-tuning
has been widely employed in various object detection algorithms. In this paper,
we reveal discrepancies in data, model, and task between the pre-training and
fine-tuning procedure in existing practices, which implicitly limit the
detector's performance, generalization ability, and convergence speed. To this
end, we propose AlignDet, a unified pre-training framework that can be adapted
to various existing detectors to alleviate the discrepancies. AlignDet
decouples the pre-training process into two stages, i.e., image-domain and
box-domain pre-training. The image-domain pre-training optimizes the detection
backbone to capture holistic visual abstraction, and box-domain pre-training
learns instance-level semantics and task-aware concepts to initialize the parts
out of the backbone. By incorporating the self-supervised pre-trained
backbones, we can pre-train all modules for various detectors in an
unsupervised paradigm. As depicted in Figure 1, extensive experiments
demonstrate that AlignDet can achieve significant improvements across diverse
protocols, such as detection algorithm, model backbone, data setting, and
training schedule. For example, AlignDet improves FCOS by 5.3 mAP, RetinaNet by
2.1 mAP, Faster R-CNN by 3.3 mAP, and DETR by 2.3 mAP with fewer epochs.
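The decoupled two-stage scheme described above can be sketched schematically. The classes and functions below are illustrative toys, not the AlignDet codebase: stage 1 updates only the backbone, stage 2 freezes it and pre-trains the remaining detector modules.

```python
# Schematic sketch of AlignDet-style decoupled pre-training.
# All names here are hypothetical; a real detector would be built on a
# framework such as MMDetection rather than these toy classes.

class Module:
    """A toy parameterized module with a trainable flag."""
    def __init__(self, n_params):
        self.params = [0.0] * n_params
        self.trainable = True

    def step(self, grad=0.1):
        # Apply a dummy gradient step only if the module is trainable.
        if self.trainable:
            self.params = [p - grad for p in self.params]


def image_domain_pretrain(backbone, steps=3):
    """Stage 1: optimize the backbone for holistic visual abstraction
    (e.g. with a self-supervised contrastive objective)."""
    for _ in range(steps):
        backbone.step()


def box_domain_pretrain(backbone, heads, steps=3):
    """Stage 2: freeze the backbone and pre-train the parts *outside*
    it (necks, heads) on instance-level, task-aware objectives."""
    backbone.trainable = False
    for _ in range(steps):
        backbone.step()   # no-op: the backbone is frozen
        for h in heads:
            h.step()


backbone = Module(4)
heads = [Module(2), Module(2)]

image_domain_pretrain(backbone)
frozen_snapshot = list(backbone.params)
box_domain_pretrain(backbone, heads)

# The backbone is untouched during box-domain pre-training,
# while the heads have been initialized by stage 2.
assert backbone.params == frozen_snapshot
assert all(h.params != [0.0, 0.0] for h in heads)
```

Because both stages can use self-supervised objectives, every module of the detector ends up pre-trained without labels, which is the alignment the paper targets.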
Related papers
- Robust and Explainable Fine-Grained Visual Classification with Transfer Learning: A Dual-Carriageway Framework [0.799543372823325]
We present the Dual-Carriageway Framework (DCF), an automatic framework that searches for the best-suited training strategy.
We validated DCF's effectiveness through experiments with three convolutional neural networks (ResNet18, ResNet34 and Inception-v3)
Results showed fine-tuning pathways outperformed training-from-scratch ones by up to 2.13% and 1.23% on the pre-existing and new datasets, respectively.
arXiv Detail & Related papers (2024-05-09T15:41:10Z)
- Efficient Transferability Assessment for Selection of Pre-trained Detectors [63.21514888618542]
This paper studies the efficient transferability assessment of pre-trained object detectors.
We build up a detector transferability benchmark which contains a large and diverse zoo of pre-trained detectors.
Experimental results demonstrate that our method outperforms other state-of-the-art approaches in assessing transferability.
arXiv Detail & Related papers (2024-03-14T14:23:23Z)
- Aligned Unsupervised Pretraining of Object Detectors with Self-training [41.03780087924593]
Unsupervised pretraining of object detectors has recently become a key component of object detector training.
We propose a framework that mitigates the misalignment between pretraining and downstream detection, consisting of three simple yet key ingredients.
We show that our strategy is also capable of pretraining from scratch (including the backbone) and works on complex images like COCO.
arXiv Detail & Related papers (2023-07-28T17:46:00Z)
- Towards All-in-one Pre-training via Maximizing Multi-modal Mutual Information [77.80071279597665]
We propose an all-in-one single-stage pre-training approach, named Maximizing Multi-modal Mutual Information Pre-training (M3I Pre-training)
Our approach achieves better performance than previous pre-training methods on various vision benchmarks, including ImageNet classification, object detection, LVIS long-tailed object detection, and ADE20k semantic segmentation.
arXiv Detail & Related papers (2022-11-17T18:59:49Z)
- Label-Efficient Object Detection via Region Proposal Network Pre-Training [58.50615557874024]
We propose a simple pretext task that provides an effective pre-training for the region proposal network (RPN)
In comparison with multi-stage detectors without RPN pre-training, our approach is able to consistently improve downstream task performance.
arXiv Detail & Related papers (2022-11-16T16:28:18Z)
- Activation to Saliency: Forming High-Quality Labels for Unsupervised Salient Object Detection [54.92703325989853]
We propose a two-stage Activation-to-Saliency (A2S) framework that effectively generates high-quality saliency cues.
No human annotations are involved in our framework during the whole training process.
Our framework reports significant performance compared with existing USOD methods.
arXiv Detail & Related papers (2021-12-07T11:54:06Z)
- Aligning Pretraining for Detection via Object-Level Contrastive Learning [57.845286545603415]
Image-level contrastive representation learning has proven to be highly effective as a generic model for transfer learning.
We argue that this could be sub-optimal and thus advocate a design principle which encourages alignment between the self-supervised pretext task and the downstream task.
Our method, called Selective Object COntrastive learning (SoCo), achieves state-of-the-art results for transfer performance on COCO detection.
arXiv Detail & Related papers (2021-06-04T17:59:52Z)
- DAP: Detection-Aware Pre-training with Weak Supervision [37.336674323981285]
This paper presents a detection-aware pre-training (DAP) approach for object detection tasks.
We transform a classification dataset into a detection dataset through a weakly supervised object localization method based on Class Activation Maps.
We show that DAP can outperform the traditional classification pre-training in terms of both sample efficiency and convergence speed in downstream detection tasks including VOC and COCO.
arXiv Detail & Related papers (2021-03-30T19:48:30Z)
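DAP's weakly supervised localization step can be sketched as follows. The thresholding and box extraction below are a generic CAM-to-box heuristic, not DAP's exact procedure, and all names are illustrative.

```python
# Turn a class activation map (CAM) into a pseudo bounding box by
# thresholding the activations and taking the extent of the hot region.
# This is a generic illustration, not DAP's exact recipe.

def cam_to_box(cam, threshold=0.5):
    """Return (x_min, y_min, x_max, y_max) covering all cells whose
    activation exceeds `threshold`, or None if nothing fires."""
    hot = [(x, y)
           for y, row in enumerate(cam)
           for x, v in enumerate(row)
           if v > threshold]
    if not hot:
        return None
    xs = [x for x, _ in hot]
    ys = [y for _, y in hot]
    return (min(xs), min(ys), max(xs), max(ys))


# A 4x5 toy CAM: the object's evidence sits in the middle columns.
cam = [
    [0.1, 0.2, 0.1,  0.0, 0.0],
    [0.1, 0.8, 0.9,  0.2, 0.0],
    [0.0, 0.7, 0.95, 0.3, 0.1],
    [0.0, 0.1, 0.2,  0.1, 0.0],
]
box = cam_to_box(cam)  # -> (1, 1, 2, 2)
```

Each such pseudo-box, paired with the image's class label, converts a classification example into a weak detection training example, which is how a classification dataset can be repurposed for detection-aware pre-training.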
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this list (including all information) and is not responsible for any consequences of its use.