Dual Relation Knowledge Distillation for Object Detection
- URL: http://arxiv.org/abs/2302.05637v2
- Date: Thu, 1 Jun 2023 15:08:40 GMT
- Title: Dual Relation Knowledge Distillation for Object Detection
- Authors: Zhenliang Ni, Fukui Yang, Shengzhao Wen, Gang Zhang
- Abstract summary: The pixel-wise relation distillation embeds pixel-wise features in the graph space and applies graph convolution to capture the global pixel relation.
Instance-wise relation distillation calculates the similarity between different instances to obtain a relation matrix.
The method achieves state-of-the-art performance, improving ResNet50-based Faster R-CNN from 38.4% to 41.6% mAP.
- Score: 7.027174952925931
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Knowledge distillation is an effective method for model compression. However,
it is still a challenging topic to apply knowledge distillation to detection
tasks. There are two key points resulting in poor distillation performance for
detection tasks. One is the serious imbalance between foreground and background
features; the other is that small objects lack sufficient feature representation.
To solve the above issues, we propose a new distillation method named dual
relation knowledge distillation (DRKD), including pixel-wise relation
distillation and instance-wise relation distillation. The pixel-wise relation
distillation embeds pixel-wise features in the graph space and applies graph
convolution to capture the global pixel relation. By distilling the global
pixel relation, the student detector can learn the relation between foreground
and background features, avoiding the difficulty of distilling the imbalanced
features directly. In addition, we find that instance-wise
relation supplements valuable knowledge beyond independent features for small
objects. Thus, instance-wise relation distillation is designed, which
calculates the similarity between different instances to obtain a relation matrix.
More importantly, a relation filter module is designed to highlight valuable
instance relations. The proposed dual relation knowledge distillation is
general and can be easily applied to both one-stage and two-stage detectors.
Our method achieves state-of-the-art performance, improving ResNet50-based
Faster R-CNN from 38.4% to 41.6% mAP and ResNet50-based RetinaNet from
37.4% to 40.3% mAP on COCO 2017.
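To make the two relation terms above concrete, here is a minimal PyTorch-style sketch of how a pixel-wise and an instance-wise relation matrix could be built and matched between teacher and student. This is an illustration under stated assumptions, not the authors' implementation: the function names and shapes are hypothetical, the graph-space embedding and graph convolution of DRKD are omitted in favor of plain cosine-similarity relations, and the `top_k` argument is only a crude stand-in for the relation filter module.

```python
import torch
import torch.nn.functional as F


def pixel_relation_loss(feat_s, feat_t):
    """Match pixel-pixel affinity matrices of student and teacher features.

    feat_s, feat_t: (B, C, H, W) feature maps from student and teacher.
    DRKD additionally embeds features in a graph space and applies graph
    convolution before building the relation; that step is omitted here.
    """
    s = F.normalize(feat_s.flatten(2), dim=1)   # (B, C, HW); each pixel feature unit-normalized
    t = F.normalize(feat_t.flatten(2), dim=1)
    rel_s = torch.bmm(s.transpose(1, 2), s)     # (B, HW, HW) pixel relations
    rel_t = torch.bmm(t.transpose(1, 2), t)
    return F.mse_loss(rel_s, rel_t)


def instance_relation_loss(inst_s, inst_t, top_k=None):
    """Match N x N instance-similarity matrices of student and teacher.

    inst_s, inst_t: (N, D) pooled instance features for the same N proposals.
    top_k is a crude stand-in for the relation filter module: only the
    strongest teacher relations per instance contribute to the loss.
    """
    s = F.normalize(inst_s, dim=1)
    t = F.normalize(inst_t, dim=1)
    rel_s = s @ s.t()                           # (N, N) student relations
    rel_t = t @ t.t()                           # (N, N) teacher relations
    if top_k is not None:
        idx = rel_t.topk(min(top_k, rel_t.size(1)), dim=1).indices
        mask = torch.zeros_like(rel_t).scatter_(1, idx, 1.0)
        return F.mse_loss(rel_s * mask, rel_t * mask)
    return F.mse_loss(rel_s, rel_t)
```

In a distillation pipeline these terms would typically be computed on matched FPN feature maps and on RoI features pooled for the same proposals, then weighted and added to the ordinary detection loss; the exact weighting and the graph-convolution details would have to come from the paper itself.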
Related papers
- Knowledge Distillation with Refined Logits [31.205248790623703]
We introduce Refined Logit Distillation (RLD) to address the limitations of current logit distillation methods.
Our approach is motivated by the observation that even high-performing teacher models can make incorrect predictions.
Our method can effectively eliminate misleading information from the teacher while preserving crucial class correlations.
arXiv Detail & Related papers (2024-08-14T17:59:32Z) - Graph Relation Distillation for Efficient Biomedical Instance Segmentation [80.51124447333493]
We propose a graph relation distillation approach for efficient biomedical instance segmentation.
We introduce two graph distillation schemes deployed at both the intra-image level and the inter-image level.
Experimental results on a number of biomedical datasets validate the effectiveness of our approach.
arXiv Detail & Related papers (2024-01-12T04:41:23Z) - Object-centric Cross-modal Feature Distillation for Event-based Object Detection [87.50272918262361]
RGB detectors still outperform event-based detectors due to the sparsity of event data and missing visual details.
We develop a novel knowledge distillation approach to shrink the performance gap between these two modalities.
We show that object-centric distillation significantly improves the performance of the event-based student object detector.
arXiv Detail & Related papers (2023-11-09T16:33:08Z) - Learning Lightweight Object Detectors via Multi-Teacher Progressive Distillation [56.053397775016755]
We propose a sequential approach to knowledge distillation that progressively transfers the knowledge of a set of teacher detectors to a given lightweight student.
To the best of our knowledge, we are the first to successfully distill knowledge from Transformer-based teacher detectors to convolution-based students.
arXiv Detail & Related papers (2023-08-17T17:17:08Z) - Pixel Distillation: A New Knowledge Distillation Scheme for Low-Resolution Image Recognition [124.80263629921498]
We propose Pixel Distillation that extends knowledge distillation into the input level while simultaneously breaking architecture constraints.
Such a scheme can achieve flexible cost control for deployment, as it allows the system to adjust both network architecture and image quality according to the overall requirement of resources.
arXiv Detail & Related papers (2021-12-17T14:31:40Z) - Focal and Global Knowledge Distillation for Detectors [23.315649744061982]
We propose Focal and Global Distillation (FGD) for object detection.
FGD separates the foreground and background, forcing the student to focus on the teacher's critical pixels and channels.
As our method only needs to calculate the loss on the feature map, FGD can be applied to various detectors.
arXiv Detail & Related papers (2021-11-23T13:04:40Z) - Distilling Object Detectors via Decoupled Features [69.62967325617632]
We present a novel distillation algorithm via decoupled features (DeFeat) for learning a better student detector.
Experiments on various detectors with different backbones show that the proposed DeFeat is able to surpass the state-of-the-art distillation methods for object detection.
arXiv Detail & Related papers (2021-03-26T13:58:49Z) - General Instance Distillation for Object Detection [12.720908566642812]
RetinaNet with ResNet-50 achieves 39.1% mAP with GID on COCO, surpassing the 36.2% baseline by 2.9% and even outperforming the ResNet-101-based teacher model at 38.1% AP.
arXiv Detail & Related papers (2021-03-03T11:41:26Z) - Why distillation helps: a statistical perspective [69.90148901064747]
Knowledge distillation is a technique for improving the performance of a simple "student" model.
While this simple approach has proven widely effective, a basic question remains unresolved: why does distillation help?
We show how distillation complements existing negative mining techniques for extreme multiclass retrieval.
arXiv Detail & Related papers (2020-05-21T01:49:51Z)