Dual-Stream Spectral Decoupling Distillation for Remote Sensing Object Detection
- URL: http://arxiv.org/abs/2512.04413v1
- Date: Thu, 04 Dec 2025 03:18:42 GMT
- Title: Dual-Stream Spectral Decoupling Distillation for Remote Sensing Object Detection
- Authors: Xiangyi Gao, Danpei Zhao, Bo Yuan, Wentao Li
- Abstract summary: We propose an architecture-agnostic distillation method named Dual-Stream Spectral Decoupling Distillation (DS2D2) for universal remote sensing object detection tasks. Specifically, DS2D2 integrates explicit and implicit distillation grounded in spectral decomposition. We reveal implicit knowledge hidden in subtle student-teacher feature discrepancies, which significantly influences predictions when activated by detection heads.
- Score: 13.198666977664109
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Knowledge distillation is an effective and hardware-friendly technique that plays a key role in building lightweight remote sensing object detectors. However, existing distillation methods often encounter mixed features in remote sensing images (RSIs) and neglect the discrepancies caused by subtle feature variations, leading to entangled knowledge confusion. To address these challenges, we propose an architecture-agnostic distillation method named Dual-Stream Spectral Decoupling Distillation (DS2D2) for universal remote sensing object detection tasks. Specifically, DS2D2 integrates explicit and implicit distillation grounded in spectral decomposition. First, a first-order wavelet transform is applied for spectral decomposition to preserve the critical spatial characteristics of RSIs. Leveraging this spatial preservation, a Density-Independent Scale Weight (DISW) is designed to address the dense and small object detection challenges common in RSIs. Second, we reveal implicit knowledge hidden in subtle student-teacher feature discrepancies, which significantly influences predictions when activated by detection heads. This implicit knowledge is extracted via full-frequency and high-frequency amplifiers, which map feature differences to prediction deviations. Extensive experiments on the DIOR and DOTA datasets validate the effectiveness of the proposed method. On the DIOR dataset, DS2D2 improves AP50 by 4.2% for RetinaNet and 3.8% for Faster R-CNN, outperforming existing distillation approaches. The source code will be available at https://github.com/PolarAid/DS2D2.
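The explicit-distillation stream described above rests on a first-order wavelet decomposition of feature maps into low- and high-frequency sub-bands, with distillation losses weighted per band. The paper does not publish its loss here, so the following is only a minimal sketch: a single-level 2-D Haar transform plus a per-band MSE distillation loss, where the band weights stand in for (and are not) the actual Density-Independent Scale Weight.

```python
import numpy as np

def haar_dwt2(x):
    """Single-level 2-D Haar wavelet transform of one feature map.

    Splits x of shape (H, W) into a low-frequency sub-band (LL) and
    three high-frequency sub-bands (LH, HL, HH), each (H//2, W//2).
    """
    a = x[0::2, 0::2]  # top-left of each 2x2 block
    b = x[0::2, 1::2]  # top-right
    c = x[1::2, 0::2]  # bottom-left
    d = x[1::2, 1::2]  # bottom-right
    ll = (a + b + c + d) / 2.0
    lh = (a + b - c - d) / 2.0
    hl = (a - b + c - d) / 2.0
    hh = (a - b - c + d) / 2.0
    return ll, lh, hl, hh

def spectral_distill_loss(f_student, f_teacher, w_low=1.0, w_high=1.0):
    """Per-sub-band MSE between student and teacher features.

    w_low / w_high are illustrative scalar weights, not the paper's
    Density-Independent Scale Weight, whose form is not given here.
    """
    s_bands = haar_dwt2(f_student)
    t_bands = haar_dwt2(f_teacher)
    weights = [w_low, w_high, w_high, w_high]
    return sum(w * np.mean((s - t) ** 2)
               for w, s, t in zip(weights, s_bands, t_bands))
```

With identical student and teacher features the loss is exactly zero, and raising `w_high` emphasizes the high-frequency sub-bands where edges and small objects live.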
Related papers
- DenoDet V2: Phase-Amplitude Cross Denoising for SAR Object Detection [49.9059941674531]
We propose DenoDet V2, which exploits the complementary nature of amplitude and phase information through a band-wise mutual modulation mechanism. DenoDet V2 achieves a significant 0.8% improvement on the SARDet-100K dataset compared to DenoDet V1, while reducing the model complexity by half.
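The summary above hinges on separating a signal into spectral amplitude and phase before the two are mutually modulated. The modulation mechanism itself is not specified here, so this sketch shows only the assumed underlying decomposition: an FFT-based amplitude/phase split and its lossless inverse.

```python
import numpy as np

def amplitude_phase(x):
    """Split a real 2-D signal into spectral amplitude and phase via the FFT."""
    spec = np.fft.fft2(x)
    return np.abs(spec), np.angle(spec)

def recombine(amplitude, phase):
    """Rebuild the spatial signal from (possibly modified) amplitude and phase."""
    return np.real(np.fft.ifft2(amplitude * np.exp(1j * phase)))
```

Round-tripping an unmodified amplitude/phase pair reconstructs the input exactly (up to floating-point error); any denoising scheme of this kind edits one or both components before recombining.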
arXiv Detail & Related papers (2025-08-12T23:24:20Z) - Renormalized Connection for Scale-preferred Object Detection in Satellite Imagery [51.83786195178233]
We design a Knowledge Discovery Network (KDN) to implement the renormalization group theory in terms of efficient feature extraction.
Renormalized connection (RC) on the KDN enables "synergistic focusing" of multi-scale features.
RCs extend the multi-level feature "divide-and-conquer" mechanism of FPN-based detectors to a wide range of scale-preferred tasks.
arXiv Detail & Related papers (2024-09-09T13:56:22Z) - MutDet: Mutually Optimizing Pre-training for Remote Sensing Object Detection [36.478530086163744]
We propose a novel Mutually optimizing pre-training framework for remote sensing object Detection, dubbed as MutDet.
MutDet fuses the object embeddings and detector features bidirectionally in the last encoder layer, enhancing their information interaction.
Experiments on various settings show new state-of-the-art transfer performance.
arXiv Detail & Related papers (2024-07-13T15:28:15Z) - Simplifying Two-Stage Detectors for On-Device Inference in Remote Sensing [0.7305342793164903]
We propose a model simplification method for two-stage object detectors.
Our method reduces computation cost by up to 61.2% with an accuracy loss within 2.1% on the DOTAv1.5 dataset.
arXiv Detail & Related papers (2024-04-11T00:45:10Z) - DMSSN: Distilled Mixed Spectral-Spatial Network for Hyperspectral Salient Object Detection [12.823338405434244]
Hyperspectral salient object detection (HSOD) has exhibited remarkable promise across various applications.
Previous methods insufficiently harness the inherent distinctive attributes of hyperspectral images (HSIs) during the feature extraction process.
We propose the Distilled Mixed Spectral-Spatial Network (DMSSN), whose core component is a Mixed Spectral-Spatial Transformer (MSST).
We have created a large-scale HSOD dataset, HSOD-BIT, to tackle the issue of data scarcity in this field.
arXiv Detail & Related papers (2024-03-31T14:04:57Z) - Semi-supervised Open-World Object Detection [74.95267079505145]
We introduce a more realistic formulation, named semi-supervised open-world detection (SS-OWOD).
We demonstrate that the performance of the state-of-the-art OWOD detector dramatically deteriorates in the proposed SS-OWOD setting.
Our experiments on 4 datasets including MS COCO, PASCAL, Objects365 and DOTA demonstrate the effectiveness of our approach.
arXiv Detail & Related papers (2024-02-25T07:12:51Z) - Object-centric Cross-modal Feature Distillation for Event-based Object Detection [87.50272918262361]
RGB detectors still outperform event-based detectors due to the sparsity of event data and its missing visual details.
We develop a novel knowledge distillation approach to shrink the performance gap between these two modalities.
We show that object-centric distillation significantly improves the performance of the event-based student object detector.
arXiv Detail & Related papers (2023-11-09T16:33:08Z) - Efficient Object Detection in Optical Remote Sensing Imagery via Attention-based Feature Distillation [29.821082433621868]
We propose Attention-based Feature Distillation (AFD) for object detection.
We introduce a multi-instance attention mechanism that effectively distinguishes between background and foreground elements.
AFD attains the performance of other state-of-the-art models while being efficient.
arXiv Detail & Related papers (2023-10-28T11:15:37Z) - Boosting 3D Object Detection by Simulating Multimodality on Point Clouds [51.87740119160152]
This paper presents a new approach that boosts a single-modality (LiDAR) 3D object detector by teaching it to simulate the features and responses of a multi-modality (LiDAR-image) detector.
The approach needs LiDAR-image data only when training the single-modality detector, and once well-trained, it only needs LiDAR data at inference.
Experimental results on the nuScenes dataset show that our approach outperforms all SOTA LiDAR-only 3D detectors.
arXiv Detail & Related papers (2022-06-30T01:44:30Z) - The KFIoU Loss for Rotated Object Detection [115.334070064346]
In this paper, we argue that one effective alternative is to devise an approximate loss that can achieve trend-level alignment with the SkewIoU loss.
Specifically, we model objects as Gaussian distributions and adopt a Kalman filter to inherently mimic the mechanism of SkewIoU.
The resulting new loss, called KFIoU, is easier to implement and works better than the exact SkewIoU.
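The two steps named in this summary, modeling a rotated box as a 2-D Gaussian and fusing two such Gaussians with a Kalman filter, can be sketched as below. This is a simplified reading of the construction, not the paper's exact loss; normalization constants and the loss wrapper differ across formulations.

```python
import numpy as np

def rbox_to_gaussian(cx, cy, w, h, theta):
    """Model a rotated box (centre, width, height, angle in radians)
    as a 2-D Gaussian: mean at the centre, covariance from size/angle."""
    R = np.array([[np.cos(theta), -np.sin(theta)],
                  [np.sin(theta),  np.cos(theta)]])
    S = np.diag([w / 2.0, h / 2.0])
    return np.array([cx, cy]), R @ S @ S @ R.T

def kf_fused_cov(sigma1, sigma2):
    """Covariance of the Kalman-filter fusion of two Gaussians
    (centres taken as aligned), used to define the overlap term."""
    return sigma1 @ np.linalg.inv(sigma1 + sigma2) @ sigma2

def kfiou(sigma1, sigma2):
    """IoU-like overlap from Gaussian 'volumes' ~ sqrt(det(sigma))."""
    v1 = np.sqrt(np.linalg.det(sigma1))
    v2 = np.sqrt(np.linalg.det(sigma2))
    v3 = np.sqrt(np.linalg.det(kf_fused_cov(sigma1, sigma2)))
    return v3 / (v1 + v2 - v3)
```

A useful sanity check: for two identical, perfectly aligned boxes this overlap evaluates to 1/3 rather than 1, which is why it is described as a trend-level surrogate for SkewIoU rather than a drop-in replacement.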
arXiv Detail & Related papers (2022-01-29T10:54:57Z) - GEM: Glare or Gloom, I Can Still See You -- End-to-End Multimodal Object Detector [11.161639542268015]
We propose sensor-aware multi-modal fusion strategies for 2D object detection in harsh-lighting conditions.
Our network learns to estimate the measurement reliability of each sensor modality in the form of scalar weights and masks.
We show that the proposed strategies outperform the existing state-of-the-art methods on the FLIR-Thermal dataset.
arXiv Detail & Related papers (2021-02-24T14:56:37Z)
This list is automatically generated from the titles and abstracts of the papers on this site.