Hands-on Guidance for Distilling Object Detectors
- URL: http://arxiv.org/abs/2103.14337v1
- Date: Fri, 26 Mar 2021 09:00:23 GMT
- Title: Hands-on Guidance for Distilling Object Detectors
- Authors: Yangyang Qin, Hefei Ling, Zhenghai He, Yuxuan Shi, Lei Wu
- Abstract summary: Our method, called Hands-on Guidance Distillation, distills the latent knowledge of all stage features to impose more comprehensive supervision.
We conduct extensive evaluations with different distillation configurations on the VOC and COCO datasets, which show better accuracy-speed trade-offs.
- Score: 11.856477599768773
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Knowledge distillation can yield deployment-friendly networks that sidestep the computational complexity plaguing modern detectors, but previous methods neglect the feature hierarchy in detectors. Motivated by this, we propose a general framework for detection distillation. Our method, called Hands-on Guidance Distillation, distills the latent knowledge of all stage features to impose more comprehensive supervision, while simultaneously focusing on the essential content to promote more intense knowledge absorption. Specifically, a series of novel mechanisms is carefully designed, including correspondence establishment for consistency, a hands-on imitation loss measure, and re-weighted optimization from both micro and macro perspectives. We conduct extensive evaluations with different distillation configurations on the VOC and COCO datasets, which show better accuracy-speed trade-offs. Meanwhile, feasibility experiments on networks with different structures further demonstrate the robustness of our HGD.
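To make the multi-stage supervision concrete, below is a minimal PyTorch sketch of stage-wise feature imitation with learnable re-weighting, in the spirit of the abstract's description. The 1x1 adaptation convolutions, the MSE imitation loss, and the softmax stage weights are illustrative assumptions, not the paper's exact HGD formulation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MultiStageDistillLoss(nn.Module):
    """Distills every feature-pyramid stage of a teacher into a student (sketch)."""

    def __init__(self, student_channels, teacher_channels):
        super().__init__()
        # 1x1 convs establish channel correspondence between student and
        # teacher features at each stage (assumed adaptation mechanism).
        self.adapters = nn.ModuleList(
            nn.Conv2d(s, t, kernel_size=1)
            for s, t in zip(student_channels, teacher_channels)
        )
        # Learnable logits re-weight the per-stage losses ("macro" re-weighting).
        self.stage_logits = nn.Parameter(torch.zeros(len(student_channels)))

    def forward(self, student_feats, teacher_feats):
        stage_losses = []
        for adapter, fs, ft in zip(self.adapters, student_feats, teacher_feats):
            fs = adapter(fs)
            if fs.shape[-2:] != ft.shape[-2:]:
                # Align spatial sizes if the two pyramids disagree.
                fs = F.interpolate(fs, size=ft.shape[-2:], mode="bilinear",
                                   align_corners=False)
            stage_losses.append(F.mse_loss(fs, ft.detach()))
        weights = torch.softmax(self.stage_logits, dim=0)
        return (weights * torch.stack(stage_losses)).sum()
```

In training, a term like this would be added to the standard detection losses, e.g. `total = det_loss + alpha * distill_loss`, with the coefficient `alpha` tuned per dataset.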
Related papers
- Leave It to the Experts: Detecting Knowledge Distillation via MoE Expert Signatures [57.98221536489363]
Knowledge Distillation (KD) accelerates training of large language models (LLMs) but poses intellectual property protection and diversity risks. We present a KD detection framework effective in both white-box and black-box settings by exploiting an overlooked signal: the transfer of MoE "structural habits". Our approach analyzes how different experts specialize and collaborate across various inputs, creating distinctive fingerprints that persist through the distillation process.
arXiv Detail & Related papers (2025-10-19T19:15:08Z)
- HKD4VLM: A Progressive Hybrid Knowledge Distillation Framework for Robust Multimodal Hallucination and Factuality Detection in VLMs [11.40571767579383]
We present the solution for the two tracks of the Responsible AI challenge. We propose a progressive hybrid knowledge distillation framework termed HKD4VLM. Specifically, the framework can be decomposed into Pyramid-like Progressive Online Distillation and Ternary-Coupled Refinement Distillation.
arXiv Detail & Related papers (2025-06-16T02:03:41Z)
- SAMKD: Spatial-aware Adaptive Masking Knowledge Distillation for Object Detection [4.33169417430713]
We propose a spatial-aware Adaptive Masking Knowledge Distillation framework for accurate object detection.
Our method improves the student network from 35.3% to 38.8% mAP, outperforming state-of-the-art distillation methods.
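For intuition, here is a hedged sketch of spatially masked feature distillation, assuming the mask is derived from the teacher's channel-mean activation magnitude; SAMKD's actual adaptive masking strategy may differ.

```python
import torch.nn.functional as F

def masked_feature_distill(fs, ft, keep_ratio=0.5):
    """fs, ft: (N, C, H, W) student/teacher features of matching shape."""
    # Saliency proxy: channel-mean magnitude of the teacher feature map.
    attn = ft.detach().abs().mean(dim=1, keepdim=True)         # (N, 1, H, W)
    k = max(1, int(keep_ratio * attn.shape[-1] * attn.shape[-2]))
    # Per-sample threshold at the k-th most salient spatial location.
    thresh = attn.flatten(1).topk(k, dim=1).values[:, -1]
    mask = (attn >= thresh.view(-1, 1, 1, 1)).float()
    per_elem = F.mse_loss(fs, ft.detach(), reduction="none")
    denom = (mask.sum() * fs.shape[1]).clamp(min=1.0)          # masked cells x C
    return (per_elem * mask).sum() / denom
```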
arXiv Detail & Related papers (2025-01-13T07:26:37Z)
- Leveraging Mixture of Experts for Improved Speech Deepfake Detection [53.69740463004446]
Speech deepfakes pose a significant threat to personal security and content authenticity.
We introduce a novel approach for enhancing speech deepfake detection performance using a Mixture of Experts architecture.
arXiv Detail & Related papers (2024-09-24T13:24:03Z)
- Distilling Aggregated Knowledge for Weakly-Supervised Video Anomaly Detection [11.250490586786878]
Video anomaly detection aims to develop automated models capable of identifying abnormal events in surveillance videos.
We show that distilling knowledge from aggregated representations into a relatively simple model achieves state-of-the-art performance.
arXiv Detail & Related papers (2024-06-05T00:44:42Z)
- Correlation-Decoupled Knowledge Distillation for Multimodal Sentiment Analysis with Incomplete Modalities [16.69453837626083]
We propose a Correlation-decoupled Knowledge Distillation (CorrKD) framework for the Multimodal Sentiment Analysis (MSA) task under uncertain missing modalities.
We present a sample-level contrastive distillation mechanism that transfers comprehensive knowledge containing cross-sample correlations to reconstruct missing semantics.
We design a response-disentangled consistency distillation strategy to optimize the sentiment decision boundaries of the student network.
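As a rough illustration of sample-level contrastive distillation, the sketch below pulls each student representation toward the teacher representation of the same sample and pushes it away from other samples in the batch (an InfoNCE-style objective); CorrKD's precise loss is not reproduced here.

```python
import torch
import torch.nn.functional as F

def contrastive_distill_loss(student_feats, teacher_feats, temperature=0.07):
    """student_feats, teacher_feats: (B, D) per-sample representations."""
    s = F.normalize(student_feats, dim=1)
    t = F.normalize(teacher_feats.detach(), dim=1)
    logits = s @ t.t() / temperature           # (B, B) similarity matrix
    labels = torch.arange(s.size(0), device=s.device)
    return F.cross_entropy(logits, labels)     # match each sample to its teacher
```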
arXiv Detail & Related papers (2024-04-25T09:35:09Z)
- Bit-mask Robust Contrastive Knowledge Distillation for Unsupervised Semantic Hashing [71.47723696190184]
We propose an innovative Bit-mask Robust Contrastive knowledge Distillation (BRCD) method for semantic hashing.
BRCD is specifically devised for the distillation of semantic hashing models.
arXiv Detail & Related papers (2024-03-10T03:33:59Z)
- SKDF: A Simple Knowledge Distillation Framework for Distilling Open-Vocabulary Knowledge to Open-world Object Detector [8.956773268679811]
We specialize the VLM for OWOD tasks by distilling its open-world knowledge into a language-agnostic detector.
We observe that the combination of a simple knowledge distillation approach and the automatic pseudo-labeling mechanism in OWOD can achieve better performance for unknown object detection.
We propose two benchmarks for evaluating the ability of the open-world detector to detect unknown objects in the open world.
arXiv Detail & Related papers (2023-12-14T04:47:20Z)
- Diffusion-based Visual Counterfactual Explanations -- Towards Systematic Quantitative Evaluation [64.0476282000118]
Latest methods for visual counterfactual explanations (VCE) harness the power of deep generative models to synthesize new examples of high-dimensional images of impressive quality.
It is currently difficult to compare the performance of these VCE methods as the evaluation procedures largely vary and often boil down to visual inspection of individual examples and small scale user studies.
We propose a framework for systematic, quantitative evaluation of the VCE methods and a minimal set of metrics to be used.
arXiv Detail & Related papers (2023-08-11T12:22:37Z)
- Supervision Complexity and its Role in Knowledge Distillation [65.07910515406209]
We study the generalization behavior of a distilled student.
The framework highlights a delicate interplay among the teacher's accuracy, the student's margin with respect to the teacher predictions, and the complexity of the teacher predictions.
We demonstrate the efficacy of online distillation and validate the theoretical findings on a range of image classification benchmarks and model architectures.
arXiv Detail & Related papers (2023-01-28T16:34:47Z)
- DETRDistill: A Universal Knowledge Distillation Framework for DETR-families [11.9748352746424]
Transformer-based detectors (DETRs) have attracted great attention due to their sparse training paradigm and the removal of post-processing operations.
Knowledge distillation (KD) can be employed to compress the huge model by constructing a universal teacher-student learning framework.
arXiv Detail & Related papers (2022-11-17T13:35:11Z)
- KD-DETR: Knowledge Distillation for Detection Transformer with Consistent Distillation Points Sampling [52.11242317111469]
We focus on the compression of DETR with knowledge distillation. The main challenge in DETR distillation is the lack of consistent distillation points. We propose the first general knowledge distillation paradigm for DETR with consistent distillation points sampling.
arXiv Detail & Related papers (2022-11-15T11:52:30Z)
- Self-Knowledge Distillation via Dropout [0.7883397954991659]
We propose a simple and effective self-knowledge distillation method using dropout (SD-Dropout).
Our method does not require any additional trainable modules, does not rely on data, and requires only simple operations.
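The mechanism is simple enough to sketch: two dropout-perturbed forward passes of the same network are encouraged to agree via a symmetric KL divergence. The temperature and the exact loss form below are assumptions about a typical instantiation, not necessarily SD-Dropout's precise objective.

```python
import torch.nn.functional as F

def sd_dropout_loss(model, x, temperature=1.0):
    model.train()                        # keep dropout active during both passes
    logits_a = model(x) / temperature    # two stochastic views of the same input
    logits_b = model(x) / temperature
    p_a = F.log_softmax(logits_a, dim=-1)
    p_b = F.log_softmax(logits_b, dim=-1)
    kl_ab = F.kl_div(p_a, p_b.exp(), reduction="batchmean")
    kl_ba = F.kl_div(p_b, p_a.exp(), reduction="batchmean")
    return 0.5 * (kl_ab + kl_ba)         # symmetric agreement term
```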
arXiv Detail & Related papers (2022-08-11T05:08:55Z)
- ERNIE-Search: Bridging Cross-Encoder with Dual-Encoder via Self On-the-fly Distillation for Dense Passage Retrieval [54.54667085792404]
We propose a novel distillation method that significantly advances cross-architecture distillation for dual-encoders.
Our method 1) introduces a self on-the-fly distillation method that can effectively distill late interaction (i.e., ColBERT) to vanilla dual-encoder, and 2) incorporates a cascade distillation process to further improve the performance with a cross-encoder teacher.
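As a rough sketch of cross-architecture score distillation, a dual-encoder student can match the teacher's soft relevance distribution over in-batch passages; the in-batch setup and KL form below are illustrative assumptions, not ERNIE-Search's exact objective.

```python
import torch.nn.functional as F

def score_distill_loss(student_q, student_p, teacher_scores, temperature=1.0):
    """student_q: (B, D) query embeddings; student_p: (B, D) passage embeddings;
    teacher_scores: (B, B) teacher relevance scores for all query-passage pairs."""
    student_scores = student_q @ student_p.t()                # (B, B) dot products
    s = F.log_softmax(student_scores / temperature, dim=1)
    t = F.softmax(teacher_scores.detach() / temperature, dim=1)
    return F.kl_div(s, t, reduction="batchmean")
```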
arXiv Detail & Related papers (2022-05-18T18:05:13Z)
- Response-based Distillation for Incremental Object Detection [2.337183337110597]
Traditional object detectors are ill-equipped for incremental learning.
Fine-tuning a well-trained detection model directly on only new data leads to catastrophic forgetting.
We propose a fully response-based incremental distillation method focusing on learning response from detection bounding boxes and classification predictions.
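A minimal sketch of response-based distillation for detection follows: the previous detector's classification logits and box regressions serve as soft targets for the updated model. The KL-plus-smooth-L1 mix and the confidence filter are assumptions about a typical instantiation, not the paper's exact losses.

```python
import torch.nn.functional as F

def response_distill_loss(new_cls, new_box, old_cls, old_box,
                          temperature=2.0, conf_thresh=0.5):
    """new_cls/old_cls: (N, K) per-anchor class logits; new_box/old_box: (N, 4)."""
    old_cls = old_cls.detach()
    old_box = old_box.detach()
    # Only distill anchors where the old detector was reasonably confident.
    keep = old_cls.softmax(-1).amax(-1) > conf_thresh
    if not keep.any():
        return new_cls.new_zeros(())
    cls_loss = F.kl_div(
        F.log_softmax(new_cls[keep] / temperature, dim=-1),
        F.softmax(old_cls[keep] / temperature, dim=-1),
        reduction="batchmean") * temperature ** 2
    box_loss = F.smooth_l1_loss(new_box[keep], old_box[keep])
    return cls_loss + box_loss
```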
arXiv Detail & Related papers (2021-10-26T08:07:55Z)
- Distilling Image Classifiers in Object Detectors [81.63849985128527]
We study the case of object detection and, instead of following the standard detector-to-detector distillation approach, introduce a classifier-to-detector knowledge transfer framework.
In particular, we propose strategies to exploit the classification teacher to improve both the detector's recognition accuracy and localization performance.
arXiv Detail & Related papers (2021-06-09T16:50:10Z)
This list is automatically generated from the titles and abstracts of the papers on this site.