Breaking Self-Attention Failure: Rethinking Query Initialization for Infrared Small Target Detection
- URL: http://arxiv.org/abs/2601.02837v1
- Date: Tue, 06 Jan 2026 09:14:01 GMT
- Title: Breaking Self-Attention Failure: Rethinking Query Initialization for Infrared Small Target Detection
- Authors: Yuteng Liu, Duanni Meng, Maoxun Yuan, Xingxing Wei,
- Abstract summary: Infrared small target detection (IRSTD) faces significant challenges due to the low signal-to-noise ratio (SNR), small target size, and complex cluttered backgrounds.<n>Recent DETR-based detectors exhibit notable performance degradation on IRSTD.
- Score: 22.128797773091403
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Infrared small target detection (IRSTD) faces significant challenges due to the low signal-to-noise ratio (SNR), small target size, and complex cluttered backgrounds. Although recent DETR-based detectors benefit from global context modeling, they exhibit notable performance degradation on IRSTD. We revisit this phenomenon and reveal that the target-relevant embeddings of IRST are inevitably overwhelmed by dominant background features due to the self-attention mechanism, leading to unreliable query initialization and inaccurate target localization. To address this issue, we propose SEF-DETR, a novel framework that refines query initialization for IRSTD. Specifically, SEF-DETR consists of three components: Frequency-guided Patch Screening (FPS), Dynamic Embedding Enhancement (DEE), and Reliability-Consistency-aware Fusion (RCF). The FPS module leverages the Fourier spectrum of local patches to construct a target-relevant density map, suppressing background-dominated features. DEE strengthens multi-scale representations in a target-aware manner, while RCF further refines object queries by enforcing spatial-frequency consistency and reliability. Extensive experiments on three public IRSTD datasets demonstrate that SEF-DETR achieves superior detection performance compared to state-of-the-art methods, delivering a robust and efficient solution for infrared small target detection task.
Related papers
- SPIRIT: Adapting Vision Foundation Models for Unified Single- and Multi-Frame Infrared Small Target Detection [18.86422994684341]
Infrared small target detection (IRSTD) is crucial for surveillance and early-warning, with deployments spanning both single-frame analysis and video-mode tracking.<n>We propose SPIRIT, a unified and VFM-compatible framework that adapts VFMs to IRSTD via lightweight physics-informed plug-ins.
arXiv Detail & Related papers (2026-02-02T09:15:29Z) - Source-Free Object Detection with Detection Transformer [59.33653163035064]
Source-Free Object Detection (SFOD) enables knowledge transfer from a source domain to an unsupervised target domain for object detection without access to source data.<n>Most existing SFOD approaches are either confined to conventional object detection (OD) models like Faster R-CNN or designed as general solutions without tailored adaptations for novel OD architectures, especially Detection Transformer (DETR)<n>In this paper, we introduce Feature Reweighting ANd Contrastive Learning NetworK (FRANCK), a novel SFOD framework specifically designed to perform query-centric feature enhancement for DETRs.
arXiv Detail & Related papers (2025-10-13T07:35:04Z) - Rethinking Evaluation of Infrared Small Target Detection [105.59753496831739]
This paper introduces a hybrid-level metric incorporating pixel- and target-level performance, proposing a systematic error analysis method, and emphasizing the importance of cross-dataset evaluation.<n>An open-source toolkit has be released to facilitate standardized benchmarking.
arXiv Detail & Related papers (2025-09-21T02:45:07Z) - NS-FPN: Improving Infrared Small Target Detection and Segmentation from Noise Suppression Perspective [12.765772056631198]
Infrared small target detection and segmentation (IRS TDS) is a critical yet challenging task in defense and civilian applications.<n>Recent CNN-based methods have achieved promising target perception results, but they only focus on enhancing feature representation to offset the impact of noise.<n>We propose a novel noise-suppression feature pyramid network (NS-FPN), which integrates a low-frequency guided feature purification (LFP) module and a spiral-aware feature sampling (SFS) module.
arXiv Detail & Related papers (2025-08-09T08:17:37Z) - High-Frequency Semantics and Geometric Priors for End-to-End Detection Transformers in Challenging UAV Imagery [6.902247657565531]
We introduce HEDS-DETR, a holistically enhanced real-time Detection Transformer tailored for aerial scenes.<n>First, we propose a novel High-Frequency Enhanced Semantics Network (HFESNet) backbone, which yields highly discriminative features.<n>Second, our Efficient Small Object Pyramid (ESOP) counteracts information loss by efficiently fusing high-resolution features.<n>Third, we enhance decoder stability and localization precision with two synergistic components.
arXiv Detail & Related papers (2025-07-01T14:56:56Z) - Probing Deep into Temporal Profile Makes the Infrared Small Target Detector Much Better [63.567886330598945]
Infrared small target (IRST) detection is challenging in simultaneously achieving precise, universal, robust and efficient performance.<n>Current learning-based methods attempt to leverage more" information from both the spatial and the short-term temporal domains.<n>We propose an efficient deep temporal probe network (DeepPro) that only performs calculations in the time dimension for IRST detection.
arXiv Detail & Related papers (2025-06-15T08:19:32Z) - It's Not the Target, It's the Background: Rethinking Infrared Small Target Detection via Deep Patch-Free Low-Rank Representations [5.326302374594885]
In this paper, we propose a novel end-to-end IRSTD framework, termed LRRNet.<n>Inspired by the physical compressibility of cluttered scenes, our approach adopts a compression-reconstruction-subtraction paradigm.<n>Experiments on multiple public datasets demonstrate that LRRNet outperforms 38 state-of-the-art methods in terms of detection accuracy, robustness, and computational efficiency.
arXiv Detail & Related papers (2025-06-12T07:24:45Z) - DISTA-Net: Dynamic Closely-Spaced Infrared Small Target Unmixing [55.366556355538954]
We propose the Dynamic Iterative Shrinkage Thresholding Network (DISTA-Net), which reconceptualizes traditional sparse reconstruction within a dynamic framework.<n>DISTA-Net is the first deep learning model designed specifically for the unmixing of closely-spaced infrared small targets.<n>We have established the first open-source ecosystem to foster further research in this field.
arXiv Detail & Related papers (2025-05-25T13:52:00Z) - STARS: Sparse Learning Correlation Filter with Spatio-temporal Regularization and Super-resolution Reconstruction for Thermal Infrared Target Tracking [8.52497147463548]
Low resolution of temporal images, along with tracking interference, limits perfor-mance of TIR trackers.<n>We introduce a novel sparse learning-based tracker that incorporates superresolution reconstruction.<n>To the best of our knowledge, STARS is the first to integrate super-resolution methods within a sparse learning-based framework.
arXiv Detail & Related papers (2025-04-20T04:49:52Z) - SIRST-5K: Exploring Massive Negatives Synthesis with Self-supervised
Learning for Robust Infrared Small Target Detection [53.19618419772467]
Single-frame infrared small target (SIRST) detection aims to recognize small targets from clutter backgrounds.
With the development of Transformer, the scale of SIRST models is constantly increasing.
With a rich diversity of infrared small target data, our algorithm significantly improves the model performance and convergence speed.
arXiv Detail & Related papers (2024-03-08T16:14:54Z) - Enhancing Infrared Small Target Detection Robustness with Bi-Level
Adversarial Framework [61.34862133870934]
We propose a bi-level adversarial framework to promote the robustness of detection in the presence of distinct corruptions.
Our scheme remarkably improves 21.96% IOU across a wide array of corruptions and notably promotes 4.97% IOU on the general benchmark.
arXiv Detail & Related papers (2023-09-03T06:35:07Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.