Bridging Cross-task Protocol Inconsistency for Distillation in Dense
Object Detection
- URL: http://arxiv.org/abs/2308.14286v2
- Date: Tue, 12 Mar 2024 09:29:51 GMT
- Title: Bridging Cross-task Protocol Inconsistency for Distillation in Dense
Object Detection
- Authors: Longrong Yang, Xianpan Zhou, Xuewei Li, Liang Qiao, Zheyang Li, Ziwei
Yang, Gaoang Wang, Xi Li
- Abstract summary: We propose a novel distillation method with cross-task consistent protocols, tailored for dense object detection.
For classification distillation, we formulate the classification logit maps in both the teacher and student models as multiple binary-classification maps and apply a binary-classification distillation loss to each map.
Our proposed method is simple but effective, and experimental results demonstrate its superiority over existing methods.
- Score: 19.07452370081663
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Knowledge distillation (KD) has shown potential for learning compact models
in dense object detection. However, the commonly used softmax-based
distillation ignores the absolute classification scores for individual
categories. Thus, the optimum of the distillation loss does not necessarily
lead to the optimal student classification scores for dense object detectors.
This cross-task protocol inconsistency is critical, especially for dense object
detectors, since the foreground categories are extremely imbalanced. To address
the issue of protocol differences between distillation and classification, we
propose a novel distillation method with cross-task consistent protocols,
tailored for dense object detection. For classification distillation, we
address the cross-task protocol inconsistency problem by formulating the
classification logit maps in both teacher and student models as multiple
binary-classification maps and applying a binary-classification distillation
loss to each map. For localization distillation, we design an IoU-based
Localization Distillation Loss that is free from specific network structures
and is comparable with existing localization distillation losses. Our
proposed method is simple but effective, and experimental results demonstrate
its superiority over existing methods. Code is available at
https://github.com/TinyTigerPan/BCKD.
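A minimal PyTorch sketch of the two losses described in the abstract (not the authors' released code from the repository above; tensor shapes, the averaging scheme, and helper names such as `pairwise_iou` are illustrative assumptions):

```python
import torch
import torch.nn.functional as F

def binary_classification_distillation_loss(student_logits, teacher_logits):
    """Treat each class channel of the dense classification map as an
    independent binary-classification problem (sigmoid instead of softmax)
    and distill each binary map with a cross-entropy against the teacher's
    per-class probabilities.

    student_logits, teacher_logits: (num_locations, num_classes) raw logits.
    """
    teacher_probs = torch.sigmoid(teacher_logits).detach()  # soft binary targets
    loss = F.binary_cross_entropy_with_logits(
        student_logits, teacher_probs, reduction="none"
    )
    return loss.mean()

def pairwise_iou(boxes_a, boxes_b, eps=1e-7):
    """Element-wise IoU for matched (x1, y1, x2, y2) box pairs."""
    lt = torch.max(boxes_a[:, :2], boxes_b[:, :2])
    rb = torch.min(boxes_a[:, 2:], boxes_b[:, 2:])
    wh = (rb - lt).clamp(min=0)
    inter = wh[:, 0] * wh[:, 1]
    area_a = (boxes_a[:, 2] - boxes_a[:, 0]) * (boxes_a[:, 3] - boxes_a[:, 1])
    area_b = (boxes_b[:, 2] - boxes_b[:, 0]) * (boxes_b[:, 3] - boxes_b[:, 1])
    return inter / (area_a + area_b - inter + eps)

def iou_localization_distillation_loss(student_boxes, teacher_boxes, weights=None):
    """IoU-based localization distillation: penalise low IoU between boxes
    decoded from the student and teacher heads, independent of the
    box-regression parameterisation.

    student_boxes, teacher_boxes: (num_locations, 4) decoded boxes.
    weights: optional per-location weights, e.g. teacher foreground scores.
    """
    loss = 1.0 - pairwise_iou(student_boxes, teacher_boxes.detach())
    if weights is not None:
        loss = loss * weights
    return loss.mean()

# Usage sketch: logits and decoded boxes would come from the dense heads of
# the teacher and student detectors at the same feature-map locations.
s_logits, t_logits = torch.randn(1000, 80, requires_grad=True), torch.randn(1000, 80)
print(binary_classification_distillation_loss(s_logits, t_logits))
```

Using a per-class sigmoid keeps the distillation protocol aligned with the binary-cross-entropy style classification objective that dense detectors optimise, which is the cross-task consistency the abstract refers to; the exact weighting of foreground versus background locations is left out here.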
Related papers
- Collaborative Feature-Logits Contrastive Learning for Open-Set Semi-Supervised Object Detection [75.02249869573994]
In open-set scenarios, the unlabeled dataset contains both in-distribution (ID) classes and out-of-distribution (OOD) classes.
Applying semi-supervised detectors in such settings can lead to misclassifying OOD classes as ID classes.
We propose a simple yet effective method, termed Collaborative Feature-Logits Detector (CFL-Detector)
arXiv Detail & Related papers (2024-11-20T02:57:35Z)
- Ambiguity-Resistant Semi-Supervised Learning for Dense Object Detection [98.66771688028426]
We propose an Ambiguity-Resistant Semi-supervised Learning (ARSL) method for one-stage detectors.
Joint-Confidence Estimation (JCE) is proposed to quantify the classification and localization quality of pseudo labels.
ARSL effectively mitigates the ambiguities and achieves state-of-the-art SSOD performance on MS COCO and PASCAL VOC.
arXiv Detail & Related papers (2023-03-27T07:46:58Z)
- Knowledge Distillation from Single to Multi Labels: an Empirical Study [14.12487391004319]
We introduce a novel distillation method based on Class Activation Maps (CAMs)
Our findings indicate that the logit-based method is not well-suited for multi-label classification.
We propose that a suitable dark knowledge should incorporate class-wise information and be highly correlated with the final classification results.
arXiv Detail & Related papers (2023-03-15T04:39:01Z)
- Task-Balanced Distillation for Object Detection [18.939830805129787]
RetinaNet with ResNet-50 achieves 41.0 mAP on the COCO benchmark, outperforming the recent FGD.
A novel Task-decoupled Feature Distillation (TFD) is proposed by flexibly balancing the contributions of classification and regression tasks.
arXiv Detail & Related papers (2022-08-05T06:43:40Z)
- Localization Distillation for Object Detection [134.12664548771534]
Previous knowledge distillation (KD) methods for object detection mostly focus on feature imitation instead of mimicking the classification logits.
We present a novel localization distillation (LD) method which can efficiently transfer the localization knowledge from the teacher to the student.
We show that logit mimicking can outperform feature imitation, and that the absence of localization distillation is a critical reason why logit mimicking has underperformed for years.
arXiv Detail & Related papers (2022-04-12T17:14:34Z)
- Label Assignment Distillation for Object Detection [0.0]
We come up with a simple but effective knowledge distillation approach focusing on label assignment in object detection.
Our method shows encouraging results on the MSCOCO 2017 benchmark.
arXiv Detail & Related papers (2021-09-16T10:11:58Z)
- Rethinking Pseudo Labels for Semi-Supervised Object Detection [84.697097472401]
We introduce certainty-aware pseudo labels tailored for object detection.
We dynamically adjust the thresholds used to generate pseudo labels and reweight loss functions for each category to alleviate the class imbalance problem.
Our approach improves supervised baselines by up to 10% AP using only 1-10% labeled data from COCO.
arXiv Detail & Related papers (2021-06-01T01:32:03Z)
- Distilling Object Detectors via Decoupled Features [69.62967325617632]
We present a novel distillation algorithm via decoupled features (DeFeat) for learning a better student detector.
Experiments on various detectors with different backbones show that the proposed DeFeat is able to surpass the state-of-the-art distillation methods for object detection.
arXiv Detail & Related papers (2021-03-26T13:58:49Z)
- Deep Semi-supervised Knowledge Distillation for Overlapping Cervical Cell Instance Segmentation [54.49894381464853]
We propose to leverage both labeled and unlabeled data for instance segmentation with improved accuracy by knowledge distillation.
We propose a novel Mask-guided Mean Teacher framework with Perturbation-sensitive Sample Mining.
Experiments show that the proposed method improves the performance significantly compared with the supervised method learned from labeled data only.
arXiv Detail & Related papers (2020-07-21T13:27:09Z)