Distilling Object Detectors via Decoupled Features
- URL: http://arxiv.org/abs/2103.14475v1
- Date: Fri, 26 Mar 2021 13:58:49 GMT
- Title: Distilling Object Detectors via Decoupled Features
- Authors: Jianyuan Guo, Kai Han, Yunhe Wang, Han Wu, Xinghao Chen, Chunjing Xu
and Chang Xu
- Abstract summary: We present a novel distillation algorithm via decoupled features (DeFeat) for learning a better student detector.
Experiments on various detectors with different backbones show that the proposed DeFeat is able to surpass the state-of-the-art distillation methods for object detection.
- Score: 69.62967325617632
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Knowledge distillation is a widely used paradigm for inheriting information
from a complicated teacher network to a compact student network and maintaining
the strong performance. Different from image classification, object detectors
are much more sophisticated with multiple loss functions in which features that
semantic information rely on are tangled. In this paper, we point out that the
information of features derived from regions excluding objects are also
essential for distilling the student detector, which is usually ignored in
existing approaches. In addition, we elucidate that features from different
regions should be assigned with different importance during distillation. To
this end, we present a novel distillation algorithm via decoupled features
(DeFeat) for learning a better student detector. Specifically, two levels of
decoupled features are processed to embed useful information into the
student: decoupled features from the neck and decoupled proposals from the
classification head. Extensive experiments on various detectors with different
backbones show that the proposed DeFeat is able to surpass the state-of-the-art
distillation methods for object detection. For example, DeFeat improves
ResNet50-based Faster R-CNN from 37.4% to 40.9% mAP and ResNet50-based
RetinaNet from 36.5% to 39.7% mAP on the COCO benchmark. Our implementation
is available at https://github.com/ggjy/DeFeat.pytorch.
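To make the core idea concrete, here is a minimal NumPy sketch of a decoupled feature-imitation loss in the spirit of DeFeat: the squared error between teacher and student feature maps is computed separately on object (foreground) and background regions, each normalized by its own pixel count and given its own weight. The function name, the mask convention, and the weight values are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def decoupled_feature_distill(t_feat, s_feat, fg_mask, w_fg=2.0, w_bg=1.0):
    """Sketch of a decoupled imitation loss.

    t_feat, s_feat: (C, H, W) teacher/student feature maps.
    fg_mask: (H, W) binary mask marking object regions (assumed given
    by ground-truth boxes projected onto the feature map).
    """
    fg = fg_mask.astype(bool)          # (H, W) boolean object mask
    diff2 = (t_feat - s_feat) ** 2     # (C, H, W) squared error
    n_fg = max(int(fg.sum()), 1)       # avoid division by zero
    n_bg = max(int((~fg).sum()), 1)
    loss_fg = diff2[:, fg].sum() / n_fg    # per-pixel mean on objects
    loss_bg = diff2[:, ~fg].sum() / n_bg   # per-pixel mean on background
    return w_fg * loss_fg + w_bg * loss_bg
```

Normalizing each region by its own area keeps the (usually much larger) background from drowning out the object signal, which is the point of decoupling the two terms.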
Related papers
- Learning Lightweight Object Detectors via Multi-Teacher Progressive
Distillation [56.053397775016755]
We propose a sequential approach to knowledge distillation that progressively transfers the knowledge of a set of teacher detectors to a given lightweight student.
To the best of our knowledge, we are the first to successfully distill knowledge from Transformer-based teacher detectors to convolution-based students.
arXiv Detail & Related papers (2023-08-17T17:17:08Z)
- Exploring Inconsistent Knowledge Distillation for Object Detection with
Data Augmentation [66.25738680429463]
Knowledge Distillation (KD) for object detection aims to train a compact detector by transferring knowledge from a teacher model.
We propose inconsistent knowledge distillation (IKD) which aims to distill knowledge inherent in the teacher model's counter-intuitive perceptions.
Our method outperforms state-of-the-art KD baselines on one-stage, two-stage and anchor-free object detectors.
arXiv Detail & Related papers (2022-09-20T16:36:28Z)
- PKD: General Distillation Framework for Object Detectors via Pearson
Correlation Coefficient [18.782520279344553]
This paper empirically finds that better FPN features from a heterogeneous teacher detector can help the student.
We propose to imitate features with Pearson Correlation Coefficient to focus on the relational information from the teacher.
Our method consistently outperforms the existing detection KD methods and works for both homogeneous and heterogeneous student-teacher pairs.
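A small NumPy sketch of Pearson-style feature imitation as described above: each channel is standardized to zero mean and unit variance before the MSE, which makes the loss invariant to per-channel affine shifts between teacher and student and is equivalent (up to scale) to maximizing the Pearson correlation. The epsilon and the per-channel normalization axes are assumed details.

```python
import numpy as np

def pearson_imitation_loss(t_feat, s_feat, eps=1e-8):
    """MSE between per-channel standardized feature maps.

    t_feat, s_feat: (C, H, W). Standardizing removes each channel's
    mean and scale, so only the correlation pattern is imitated.
    """
    def standardize(f):
        mu = f.mean(axis=(1, 2), keepdims=True)
        sigma = f.std(axis=(1, 2), keepdims=True)
        return (f - mu) / (sigma + eps)
    return ((standardize(t_feat) - standardize(s_feat)) ** 2).mean()
```

Because the normalization discards magnitude, a student whose features are an affine transform of the teacher's incurs (nearly) zero loss, which is what makes this robust to heterogeneous teacher-student pairs.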
arXiv Detail & Related papers (2022-07-05T13:37:34Z)
- Focal and Global Knowledge Distillation for Detectors [23.315649744061982]
We propose Focal and Global Distillation (FGD) for object detection.
FGD separates the foreground and background, forcing the student to focus on the teacher's critical pixels and channels.
As our method only needs to calculate the loss on the feature map, FGD can be applied to various detectors.
arXiv Detail & Related papers (2021-11-23T13:04:40Z)
- Distilling Object Detectors with Feature Richness [13.187669828065554]
Large-scale deep models have achieved great success, but their huge computational complexity and massive storage requirements make them a great challenge to deploy on resource-limited devices.
As a model compression and acceleration method, knowledge distillation effectively improves the performance of small models by transferring the dark knowledge from the teacher detector.
We propose a novel Feature-Richness Score (FRS) method to choose important features that improve generalized detectability during distilling.
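A hedged NumPy sketch of richness-weighted distillation in the spirit of the summary above: a per-location score aggregated from the teacher's classification output weights the feature imitation loss, so locations the teacher considers informative dominate. Taking the max class probability as the aggregation, and the function and argument names, are illustrative assumptions rather than the paper's exact definition.

```python
import numpy as np

def richness_weighted_loss(t_feat, s_feat, cls_prob, eps=1e-6):
    """Feature imitation weighted by an aggregated teacher score.

    t_feat, s_feat: (C, H, W) feature maps.
    cls_prob: (K, H, W) assumed per-class probabilities from the
    teacher's classification head at each feature-map location.
    """
    frs = cls_prob.max(axis=0)                     # (H, W) richness score
    diff2 = ((t_feat - s_feat) ** 2).sum(axis=0)   # (H, W) per-location error
    # Normalize by total richness so the loss scale is independent
    # of how many locations the teacher deems informative.
    return float((frs * diff2).sum() / (frs.sum() + eps))
```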
arXiv Detail & Related papers (2021-11-01T03:16:06Z)
- Instance-Conditional Knowledge Distillation for Object Detection [59.56780046291835]
We propose an instance-conditional distillation framework to find desired knowledge.
We use observed instances as condition information and formulate the retrieval process as an instance-conditional decoding process.
arXiv Detail & Related papers (2021-10-25T08:23:29Z)
- Deep Structured Instance Graph for Distilling Object Detectors [82.16270736573176]
We present a simple knowledge structure to exploit and encode information inside the detection system to facilitate detector knowledge distillation.
We achieve new state-of-the-art results on the challenging COCO object detection task with diverse student-teacher pairs on both one- and two-stage detectors.
arXiv Detail & Related papers (2021-09-27T08:26:00Z)
- G-DetKD: Towards General Distillation Framework for Object Detectors via
Contrastive and Semantic-guided Feature Imitation [49.421099172544196]
We propose a novel semantic-guided feature imitation technique, which automatically performs soft matching between feature pairs across all pyramid levels.
We also introduce contrastive distillation to effectively capture the information encoded in the relationship between different feature regions.
Our method consistently outperforms existing detection KD techniques, and works both when the components of the framework are used separately and when they are used in conjunction.
arXiv Detail & Related papers (2021-08-17T07:44:27Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences of its use.