Adaptive Instance Distillation for Object Detection in Autonomous
Driving
- URL: http://arxiv.org/abs/2201.11097v2
- Date: Wed, 22 Mar 2023 17:02:59 GMT
- Title: Adaptive Instance Distillation for Object Detection in Autonomous
Driving
- Authors: Qizhen Lan and Qing Tian
- Abstract summary: We propose Adaptive Instance Distillation (AID) to selectively impart the teacher's knowledge to the student to improve the performance of knowledge distillation.
Our AID is also shown to be useful for self-distillation to improve the teacher model's performance.
- Score: 3.236217153362305
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In recent years, knowledge distillation (KD) has been widely used to derive
efficient models. Through imitating a large teacher model, a lightweight
student model can achieve comparable performance with more efficiency. However,
most existing knowledge distillation methods are focused on classification
tasks. Only a limited number of studies have applied knowledge distillation to
object detection, especially in time-sensitive autonomous driving scenarios. In
this paper, we propose Adaptive Instance Distillation (AID) to selectively
impart the teacher's knowledge to the student to improve the performance of
knowledge distillation. Unlike previous KD methods that treat all instances
equally, our AID can attentively adjust the distillation weights of instances
based on the teacher model's prediction loss. We verified the effectiveness of
our AID method through experiments on the KITTI and the COCO traffic datasets.
The results show that our method improves the performance of state-of-the-art
attention-guided and non-local distillation methods and achieves better
distillation results on both single-stage and two-stage detectors. Compared to
the baseline, our AID led to an average of 2.7% and 2.1% mAP increases for
single-stage and two-stage detectors, respectively. Furthermore, our AID is
also shown to be useful for self-distillation to improve the teacher model's
performance.
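To make the instance-weighting idea concrete, the following is a minimal PyTorch sketch of instance-adaptive distillation weights derived from the teacher's per-instance prediction loss. It is an illustration under assumptions, not the paper's implementation: the softmax-over-negative-loss weighting, the choice to down-weight instances the teacher itself predicts poorly, the temperatures, and the function names are all hypothetical.

```python
import torch
import torch.nn.functional as F

def adaptive_instance_weights(teacher_losses: torch.Tensor, temperature: float = 1.0) -> torch.Tensor:
    """Map per-instance teacher losses to distillation weights.

    Assumption for illustration: instances the teacher handles well (low loss)
    get larger weights, while instances the teacher itself mispredicts are
    down-weighted. Weights are rescaled so their mean is roughly 1.
    """
    return F.softmax(-teacher_losses / temperature, dim=0) * teacher_losses.numel()

def weighted_distillation_loss(student_logits, teacher_logits, teacher_losses, tau: float = 2.0):
    """Instance-weighted KL distillation loss (one row per detected instance)."""
    w = adaptive_instance_weights(teacher_losses)                 # shape (N,)
    kl = F.kl_div(
        F.log_softmax(student_logits / tau, dim=1),
        F.softmax(teacher_logits / tau, dim=1),
        reduction="none",
    ).sum(dim=1) * (tau ** 2)                                     # shape (N,)
    return (w * kl).mean()

# Toy usage: 4 instances, 3 classes.
student_logits = torch.randn(4, 3)
teacher_logits = torch.randn(4, 3)
teacher_losses = torch.tensor([0.1, 0.3, 2.5, 0.2])              # teacher's per-instance loss
print(weighted_distillation_loss(student_logits, teacher_logits, teacher_losses))
```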
Related papers
- Distillation-Free One-Step Diffusion for Real-World Image Super-Resolution [81.81748032199813]
We propose a Distillation-Free One-Step Diffusion model.
Specifically, we propose a noise-aware discriminator (NAD) to participate in adversarial training.
We improve the perceptual loss with edge-aware DISTS (EA-DISTS) to enhance the model's ability to generate fine details.
arXiv Detail & Related papers (2024-10-05T16:41:36Z)
- Domain-invariant Progressive Knowledge Distillation for UAV-based Object Detection [13.255646312416532]
We propose a novel knowledge distillation framework for UAV-OD.
Specifically, a progressive distillation approach is designed to alleviate the feature gap between teacher and student models.
A new feature alignment method is provided to extract object-related features, enhancing the student model's knowledge reception efficiency.
arXiv Detail & Related papers (2024-08-21T08:05:03Z)
- Dual Knowledge Distillation for Efficient Sound Event Detection [20.236008919003083]
Sound event detection (SED) is essential for recognizing specific sounds and their temporal locations within acoustic signals.
We introduce a novel framework referred to as dual knowledge distillation for developing efficient SED systems.
arXiv Detail & Related papers (2024-02-05T07:30:32Z)
- Gradient-Guided Knowledge Distillation for Object Detectors [3.236217153362305]
We propose a novel approach for knowledge distillation in object detection, named Gradient-guided Knowledge Distillation (GKD).
Our GKD uses gradient information to identify and assign more weights to features that significantly impact the detection loss, allowing the student to learn the most relevant features from the teacher.
Experiments on the KITTI and COCO-Traffic datasets demonstrate our method's efficacy in knowledge distillation for object detection (a rough sketch of this gradient-weighting idea appears after this list).
arXiv Detail & Related papers (2023-03-07T21:09:09Z)
- Unbiased Knowledge Distillation for Recommendation [66.82575287129728]
Knowledge distillation (KD) has been applied in recommender systems (RS) to reduce inference latency.
Traditional solutions first train a full teacher model from the training data, and then transfer its knowledge to supervise the learning of a compact student model.
We find that such a standard distillation paradigm incurs a serious bias issue: popular items are more heavily recommended after distillation.
arXiv Detail & Related papers (2022-11-27T05:14:03Z)
- Exploring Inconsistent Knowledge Distillation for Object Detection with Data Augmentation [66.25738680429463]
Knowledge Distillation (KD) for object detection aims to train a compact detector by transferring knowledge from a teacher model.
We propose inconsistent knowledge distillation (IKD) which aims to distill knowledge inherent in the teacher model's counter-intuitive perceptions.
Our method outperforms state-of-the-art KD baselines on one-stage, two-stage and anchor-free object detectors.
arXiv Detail & Related papers (2022-09-20T16:36:28Z)
- ERNIE-Search: Bridging Cross-Encoder with Dual-Encoder via Self On-the-fly Distillation for Dense Passage Retrieval [54.54667085792404]
We propose a novel distillation method that significantly advances cross-architecture distillation for dual-encoders.
Our method 1) introduces a self on-the-fly distillation method that can effectively distill late interaction (i.e., ColBERT) into a vanilla dual-encoder, and 2) incorporates a cascade distillation process to further improve performance with a cross-encoder teacher.
arXiv Detail & Related papers (2022-05-18T18:05:13Z)
- Localization Distillation for Object Detection [134.12664548771534]
Previous knowledge distillation (KD) methods for object detection mostly focus on feature imitation instead of mimicking the classification logits.
We present a novel localization distillation (LD) method which can efficiently transfer the localization knowledge from the teacher to the student.
We show that logit mimicking can outperform feature imitation, and that the absence of localization distillation is a critical reason why logit mimicking has underperformed for years.
arXiv Detail & Related papers (2022-04-12T17:14:34Z)
- On the benefits of knowledge distillation for adversarial robustness [53.41196727255314]
We show that knowledge distillation can be used directly to boost the performance of state-of-the-art models in adversarial robustness.
We present Adversarial Knowledge Distillation (AKD), a new framework to improve a model's robust performance.
arXiv Detail & Related papers (2022-03-14T15:02:13Z)
- Dual Correction Strategy for Ranking Distillation in Top-N Recommender System [22.37864671297929]
This paper presents a Dual Correction strategy for Knowledge Distillation (DCD).
DCD transfers the ranking information from the teacher model to the student model in a more efficient manner.
Our experiments show that the proposed method outperforms the state-of-the-art baselines.
arXiv Detail & Related papers (2021-09-08T07:00:45Z)
- Prime-Aware Adaptive Distillation [27.66963552145635]
Knowledge distillation aims to improve the performance of a student network by mimicking the knowledge from a powerful teacher network.
Previous effective hard mining methods are not appropriate for distillation.
Prime-Aware Adaptive Distillation (PAD) perceives the prime samples in distillation and then emphasizes their effect adaptively.
arXiv Detail & Related papers (2020-08-04T10:53:12Z)
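For the Gradient-Guided Knowledge Distillation (GKD) entry above, the following is a rough sketch of the gradient-weighting idea: it assumes the gradient of a detection loss with respect to the teacher's feature map is used to weight a feature-imitation term. The normalization, the choice of loss, and the function name are assumptions, not the paper's method.

```python
import torch

def gradient_guided_feature_loss(student_feat, teacher_feat, detection_loss):
    """Weight feature imitation by the gradient of the detection loss
    w.r.t. the teacher feature map (a hypothetical reading of GKD).

    Features with larger gradient magnitude are treated as more relevant
    and receive larger imitation weights.
    """
    # Gradient of the (scalar) detection loss w.r.t. the teacher features.
    grad = torch.autograd.grad(detection_loss, teacher_feat, retain_graph=True)[0]
    weights = grad.abs()
    weights = weights / (weights.sum() + 1e-8) * weights.numel()  # mean weight ~ 1
    return (weights.detach() * (student_feat - teacher_feat.detach()) ** 2).mean()

# Toy usage with dummy tensors standing in for detector features and loss.
teacher_feat = torch.randn(1, 8, 16, 16, requires_grad=True)
student_feat = torch.randn(1, 8, 16, 16, requires_grad=True)
detection_loss = (teacher_feat ** 2).mean()   # placeholder for a real detection loss
print(gradient_guided_feature_loss(student_feat, teacher_feat, detection_loss))
```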
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of this information and is not responsible for any consequences of its use.