Gradient-Guided Knowledge Distillation for Object Detectors
- URL: http://arxiv.org/abs/2303.04240v1
- Date: Tue, 7 Mar 2023 21:09:09 GMT
- Title: Gradient-Guided Knowledge Distillation for Object Detectors
- Authors: Qizhen Lan and Qing Tian
- Abstract summary: We propose a novel approach for knowledge distillation in object detection, named Gradient-guided Knowledge Distillation (GKD).
Our GKD uses gradient information to identify and assign more weights to features that significantly impact the detection loss, allowing the student to learn the most relevant features from the teacher.
Experiments on the KITTI and COCO-Traffic datasets demonstrate our method's efficacy in knowledge distillation for object detection.
- Score: 3.236217153362305
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deep learning models have demonstrated remarkable success in object
detection, yet their complexity and computational intensity pose a barrier to
deploying them in real-world applications (e.g., self-driving perception).
Knowledge Distillation (KD) is an effective way to derive efficient models.
However, only a small number of KD methods tackle object detection. Also, most
of them focus on mimicking the plain features of the teacher model but rarely
consider how the features contribute to the final detection. In this paper, we
propose a novel approach for knowledge distillation in object detection, named
Gradient-guided Knowledge Distillation (GKD). Our GKD uses gradient information
to identify and assign more weights to features that significantly impact the
detection loss, allowing the student to learn the most relevant features from
the teacher. Furthermore, we present bounding-box-aware multi-grained feature
imitation (BMFI) to further improve the KD performance. Experiments on the
KITTI and COCO-Traffic datasets demonstrate our method's efficacy in knowledge
distillation for object detection. On one-stage and two-stage detectors, our
GKD-BMFI leads to an average of 5.1% and 3.8% mAP improvement, respectively,
beating various state-of-the-art KD methods.
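Read literally, the abstract suggests a simple recipe: weight a feature-imitation loss by how strongly each feature location influences the detection loss. Below is a minimal PyTorch sketch of that idea; the per-location gradient-magnitude weighting and the function name are illustrative assumptions, not the paper's exact formulation, and BMFI's bounding-box weighting is omitted for brevity.

```python
# Hedged sketch of gradient-guided feature distillation (GKD-style).
# The saliency definition below (channel-averaged absolute gradient)
# is an assumption for illustration; the paper's weighting may differ.
import torch

def gradient_guided_kd_loss(teacher_feat, student_feat, detection_loss):
    """Weight feature imitation by each location's impact on the detection loss."""
    # Gradient of the detection loss w.r.t. the teacher's feature map.
    grads = torch.autograd.grad(detection_loss, teacher_feat,
                                retain_graph=True)[0]
    # Saliency map: absolute gradient averaged over channels -> (B, 1, H, W).
    weight = grads.abs().mean(dim=1, keepdim=True)
    weight = weight / (weight.sum(dim=(2, 3), keepdim=True) + 1e-8)
    # Weighted L2 imitation: the student mimics high-impact teacher features.
    return (weight * (student_feat - teacher_feat.detach()) ** 2).sum()

# Toy usage with random tensors and a dummy stand-in for the detection loss.
t = torch.randn(2, 64, 32, 32, requires_grad=True)
s = torch.randn(2, 64, 32, 32, requires_grad=True)
det_loss = (t ** 2).mean()
gradient_guided_kd_loss(t, s, det_loss).backward()
```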
Related papers
- Domain-invariant Progressive Knowledge Distillation for UAV-based Object Detection [13.255646312416532]
We propose a novel knowledge distillation framework for UAV-OD.
Specifically, a progressive distillation approach is designed to alleviate the feature gap between teacher and student models.
A new feature alignment method is provided to extract object-related features, improving the efficiency with which the student model absorbs the teacher's knowledge.
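The summary is high-level, but a stage-wise schedule is easy to picture. The sketch below is one hedged reading: a 1x1-conv adapter aligns student features to the teacher's channels, and distillation progresses from feature mimicking to prediction mimicking. The stage split, adapter, and loss mix are assumptions, not the paper's design.

```python
# Hypothetical two-stage "progressive" distillation step.
import torch
import torch.nn as nn
import torch.nn.functional as F

adapter = nn.Conv2d(128, 256, kernel_size=1)  # assumed channel alignment module

def progressive_kd_loss(t_feat, s_feat, t_logits, s_logits,
                        epoch, switch_epoch=10):
    # Stage 1: mimic aligned features to close the teacher-student feature gap.
    feat_loss = F.mse_loss(adapter(s_feat), t_feat.detach())
    if epoch < switch_epoch:
        return feat_loss
    # Stage 2: additionally mimic the teacher's soft predictions.
    pred_loss = F.kl_div(F.log_softmax(s_logits, dim=-1),
                         F.softmax(t_logits.detach(), dim=-1),
                         reduction="batchmean")
    return feat_loss + pred_loss
```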
arXiv Detail & Related papers (2024-08-21T08:05:03Z)
- Relative Difficulty Distillation for Semantic Segmentation [54.76143187709987]
We propose a pixel-level KD paradigm for semantic segmentation named Relative Difficulty Distillation (RDD).
RDD allows the teacher network to provide effective guidance on learning focus without additional optimization goals.
Our research showcases that RDD can integrate with existing KD methods to improve their upper performance bound.
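One plausible reading of "relative difficulty" is a per-pixel comparison of how hard each pixel is for the student versus the teacher, used to weight a pixel-wise KD loss. The sketch below encodes that reading; defining difficulty as per-pixel cross-entropy and the weight as their clamped difference are assumptions, not the paper's formulation.

```python
# Hedged sketch of pixel-level relative-difficulty weighting for segmentation KD.
import torch
import torch.nn.functional as F

def rdd_loss(t_logits, s_logits, labels, temperature=2.0):
    # Per-pixel difficulty maps via unreduced cross-entropy -> (B, H, W).
    t_ce = F.cross_entropy(t_logits.detach(), labels, reduction="none")
    s_ce = F.cross_entropy(s_logits, labels, reduction="none")
    # Pixels relatively harder for the student get more distillation weight.
    weight = torch.clamp(s_ce - t_ce, min=0).detach()
    kd = F.kl_div(F.log_softmax(s_logits / temperature, dim=1),
                  F.softmax(t_logits.detach() / temperature, dim=1),
                  reduction="none").sum(dim=1)          # (B, H, W)
    return (weight * kd).mean() * temperature ** 2
```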
arXiv Detail & Related papers (2024-07-04T08:08:25Z)
- Teaching with Uncertainty: Unleashing the Potential of Knowledge Distillation in Object Detection [47.0507287491627]
We propose a novel feature-based distillation paradigm with knowledge uncertainty for object detection.
By leveraging the Monte Carlo dropout technique, we introduce knowledge uncertainty into the training process of the student model.
Our method performs effectively during the KD process without requiring intricate structures or extensive computational resources.
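Monte Carlo dropout itself is standard: sample the teacher several times with dropout active and treat the variance of its predictions as uncertainty. The sketch below shows that sampling step; using the sample count of 8 and inverse-variance weighting of the KD loss are assumptions for illustration.

```python
# Sketch of estimating teacher knowledge uncertainty via Monte Carlo dropout.
import torch

@torch.no_grad()
def mc_dropout_stats(teacher, x, k=8):
    teacher.train()            # keep dropout layers active while sampling
    preds = torch.stack([teacher(x) for _ in range(k)])   # (K, B, ...)
    teacher.eval()
    return preds.mean(dim=0), preds.var(dim=0)

# Possible use (hypothetical): trust confident teacher outputs more.
#   mean, var = mc_dropout_stats(teacher, images)
#   weight = 1.0 / (1.0 + var)
#   kd_loss = (weight * (student(images) - mean) ** 2).mean()
```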
arXiv Detail & Related papers (2024-06-11T06:51:02Z)
- Efficient Object Detection in Optical Remote Sensing Imagery via Attention-based Feature Distillation [29.821082433621868]
We propose Attention-based Feature Distillation (AFD) for object detection.
We introduce a multi-instance attention mechanism that effectively distinguishes between background and foreground elements.
AFD matches the performance of other state-of-the-art models while remaining efficient.
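In the spirit of attention-based feature distillation, a spatial attention map derived from the teacher's activations can decide which (foreground-like) locations dominate the imitation loss. The softmax-temperature attention below is an assumption, not AFD's actual multi-instance attention module.

```python
# Hedged sketch of attention-masked feature imitation.
import torch
import torch.nn.functional as F

def attention_feature_loss(t_feat, s_feat, tau=0.5):
    b, _, h, w = t_feat.shape
    # Spatial attention: mean absolute teacher activation over channels.
    att = t_feat.detach().abs().mean(dim=1).reshape(b, -1)       # (B, H*W)
    att = F.softmax(att / tau, dim=1).reshape(b, 1, h, w) * h * w
    # High-attention (foreground-like) locations dominate the imitation loss.
    return (att * (s_feat - t_feat.detach()) ** 2).mean()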
arXiv Detail & Related papers (2023-10-28T11:15:37Z)
- CrossKD: Cross-Head Knowledge Distillation for Object Detection [69.16346256926842]
Knowledge Distillation (KD) has been validated as an effective model compression technique for learning compact object detectors.
We present a prediction-mimicking distillation scheme, called CrossKD, which delivers the intermediate features of the student's detection head to the teacher's detection head.
Our CrossKD boosts the average precision of GFL ResNet-50 with 1x training schedule from 40.2 to 43.7, outperforming all existing KD methods.
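The cross-head idea can be sketched with each head split into early ("stem") and late ("tail") layers: the student's intermediate features are finished by the teacher's tail, and the resulting cross-head predictions mimic the teacher's. The single-conv heads, the split point, and feeding both heads the same backbone feature are simplifying assumptions, not CrossKD's actual architecture.

```python
# Minimal sketch of cross-head prediction mimicking.
import torch
import torch.nn as nn
import torch.nn.functional as F

s_stem = nn.Conv2d(256, 256, 3, padding=1)   # student head, early layers (assumed)
t_stem = nn.Conv2d(256, 256, 3, padding=1)   # teacher head, early layers (assumed)
t_tail = nn.Conv2d(256, 80, 3, padding=1)    # teacher head, late layers (assumed)
for p in list(t_stem.parameters()) + list(t_tail.parameters()):
    p.requires_grad_(False)                  # teacher stays frozen

def crosskd_loss(feat):
    # Student's intermediate features are completed by the *teacher's* tail...
    cross_pred = t_tail(s_stem(feat))
    with torch.no_grad():                    # ...and must match the teacher's output.
        t_pred = t_tail(t_stem(feat))
    return F.mse_loss(cross_pred, t_pred)

loss = crosskd_loss(torch.randn(2, 256, 32, 32))
```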
arXiv Detail & Related papers (2023-06-20T08:19:51Z)
- Exploring Inconsistent Knowledge Distillation for Object Detection with Data Augmentation [66.25738680429463]
Knowledge Distillation (KD) for object detection aims to train a compact detector by transferring knowledge from a teacher model.
We propose inconsistent knowledge distillation (IKD) which aims to distill knowledge inherent in the teacher model's counter-intuitive perceptions.
Our method outperforms state-of-the-art KD baselines on one-stage, two-stage and anchor-free object detectors.
arXiv Detail & Related papers (2022-09-20T16:36:28Z)
- Knowledge Distillation for Object Detection via Rank Mimicking and Prediction-guided Feature Imitation [34.441349114336994]
We propose Rank Mimicking (RM) and Prediction-guided Feature Imitation (PFI) for distilling one-stage detectors.
RM takes the rank of candidate boxes from teachers as a new form of knowledge to distill.
PFI attempts to correlate feature differences with prediction differences, making feature imitation directly help to improve the student's accuracy.
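Rank Mimicking can be pictured as distilling the teacher's ordering of the candidate boxes for one ground-truth instance. Treating that ordering as a softmax distribution over candidate scores and matching it with KL divergence, as below, is an assumption for illustration; PFI's prediction-guided weighting of feature imitation is omitted.

```python
# Hedged sketch of Rank Mimicking over one instance's candidate boxes.
import torch
import torch.nn.functional as F

def rank_mimicking_loss(t_scores, s_scores, temperature=1.0):
    # t_scores, s_scores: (num_candidates,) classification scores for the
    # candidate boxes assigned to a single ground-truth object.
    t_rank = F.softmax(t_scores.detach() / temperature, dim=0)
    s_rank = F.log_softmax(s_scores / temperature, dim=0)
    return F.kl_div(s_rank, t_rank, reduction="sum") * temperature ** 2
```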
arXiv Detail & Related papers (2021-12-09T11:19:15Z)
- Distilling Image Classifiers in Object Detectors [81.63849985128527]
We study the case of object detection and, instead of following the standard detector-to-detector distillation approach, introduce a classifier-to-detector knowledge transfer framework.
In particular, we propose strategies to exploit the classification teacher to improve both the detector's recognition accuracy and localization performance.
arXiv Detail & Related papers (2021-06-09T16:50:10Z)
- Heterogeneous Knowledge Distillation using Information Flow Modeling [82.83891707250926]
We propose a novel KD method that works by modeling the information flow through the various layers of the teacher model.
The proposed method is capable of overcoming the aforementioned limitations by using an appropriate supervision scheme during the different phases of the training process.
arXiv Detail & Related papers (2020-05-02T06:56:56Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.