Rethinking Generalizable Infrared Small Target Detection: A Real-scene Benchmark and Cross-view Representation Learning
- URL: http://arxiv.org/abs/2504.16487v1
- Date: Wed, 23 Apr 2025 07:58:15 GMT
- Title: Rethinking Generalizable Infrared Small Target Detection: A Real-scene Benchmark and Cross-view Representation Learning
- Authors: Yahao Lu, Yuehui Li, Xingyuan Guo, Shuai Yuan, Yukai Shi, Liang Lin
- Abstract summary: Infrared small target detection (ISTD) is highly sensitive to sensor type, observation conditions, and the intrinsic properties of the target. This paper introduces an ISTD framework enhanced by domain adaptation. We also develop a dedicated infrared small target dataset, RealScene-ISTD.
- Score: 46.60262602072635
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Infrared small target detection (ISTD) is highly sensitive to sensor type, observation conditions, and the intrinsic properties of the target. These factors can introduce substantial variations in the distribution of acquired infrared image data, a phenomenon known as domain shift. Such distribution discrepancies significantly hinder the generalization capability of ISTD models across diverse scenarios. To tackle this challenge, this paper introduces an ISTD framework enhanced by domain adaptation. To alleviate distribution shift between datasets and achieve cross-sample alignment, we introduce Cross-view Channel Alignment (CCA). Additionally, we propose the Cross-view Top-K Fusion strategy, which integrates target information with diverse background features, enhancing the model's ability to extract critical data characteristics. To further mitigate the impact of noise on ISTD, we develop a Noise-guided Representation learning strategy. This approach enables the model to learn more noise-resistant feature representations, improving its generalization capability across diverse noisy domains. Finally, we develop a dedicated infrared small target dataset, RealScene-ISTD. Compared to state-of-the-art methods, our approach demonstrates superior performance in terms of detection probability (Pd), false alarm rate (Fa), and intersection over union (IoU). The code is available at: https://github.com/luy0222/RealScene-ISTD.
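The abstract names three evaluation metrics (Pd, Fa, IoU) without defining them. The sketch below shows one common way such metrics are computed on binary masks; it is not taken from the paper or its repository. The function names `label_components` and `istd_metrics` are hypothetical, and two simplifications are assumed: a ground-truth target counts as detected if any predicted pixel overlaps it (many ISTD works use a centroid-distance threshold instead), and Fa is the fraction of image pixels predicted as target outside the ground truth.

```python
import numpy as np


def label_components(mask):
    """Label 8-connected foreground regions with an iterative flood fill."""
    mask = np.asarray(mask, dtype=bool)
    labels = np.zeros(mask.shape, dtype=int)
    height, width = mask.shape
    count = 0
    for r, c in zip(*np.nonzero(mask)):
        if labels[r, c]:
            continue  # pixel already belongs to a labelled region
        count += 1
        labels[r, c] = count
        stack = [(r, c)]
        while stack:
            y, x = stack.pop()
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    ny, nx = y + dy, x + dx
                    inside = 0 <= ny < height and 0 <= nx < width
                    if inside and mask[ny, nx] and not labels[ny, nx]:
                        labels[ny, nx] = count
                        stack.append((ny, nx))
    return labels, count


def istd_metrics(pred, gt):
    """Return pixel-level IoU, target-level Pd, and pixel-level Fa."""
    pred = np.asarray(pred, dtype=bool)
    gt = np.asarray(gt, dtype=bool)

    # IoU: overlap of predicted and true target pixels over their union.
    inter = np.logical_and(pred, gt).sum()
    union = np.logical_or(pred, gt).sum()
    iou = inter / union if union else 1.0

    # Pd (assumed overlap criterion): share of true targets hit by the prediction.
    gt_labels, n_targets = label_components(gt)
    detected = sum(
        1 for k in range(1, n_targets + 1) if pred[gt_labels == k].any()
    )
    pd = detected / n_targets if n_targets else 1.0

    # Fa (assumed definition): predicted pixels outside the ground truth,
    # divided by the total number of pixels in the image.
    fa = np.logical_and(pred, ~gt).sum() / pred.size

    return {"IoU": float(iou), "Pd": float(pd), "Fa": float(fa)}


if __name__ == "__main__":
    gt = np.zeros((8, 8), dtype=bool)
    gt[1:3, 1:3] = True   # first target
    gt[5:7, 5:7] = True   # second target
    pred = np.zeros((8, 8), dtype=bool)
    pred[1:3, 1:3] = True  # first target found, second one missed
    pred[0, 7] = True      # one false-alarm pixel
    print(istd_metrics(pred, gt))
```

In practice these quantities are usually accumulated over a whole test set before dividing, rather than averaged per image; the per-image form here is only meant to make the definitions concrete.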
Related papers
- SC3EF: A Joint Self-Correlation and Cross-Correspondence Estimation Framework for Visible and Thermal Image Registration [3.4668188256000576]
Accurate visible and thermal (RGB-T) image registration poses a significant challenge. We present a novel joint Self-Correlation and Cross-Correspondence Estimation Framework (SC3EF). We show the effectiveness of our proposed method, outperforming the current state-of-the-art (SOTA) methods on representative RGB-T datasets.
arXiv Detail & Related papers (2025-04-17T11:54:12Z) - MSCA-Net: Multi-Scale Context Aggregation Network for Infrared Small Target Detection [0.0]
This paper proposes a novel network architecture named MSCA-Net, which integrates three key components. MSEDA employs a multi-scale feature fusion attention mechanism to adaptively aggregate information across different scales. PCBAM captures the correlation between global and local features through a correlation matrix-based strategy.
arXiv Detail & Related papers (2025-03-21T14:42:31Z) - Spectrum-oriented Point-supervised Saliency Detector for Hyperspectral Images [13.79887292039637]
We introduce point supervision into hyperspectral salient object detection (HSOD).
We incorporate Spectral Saliency, derived from conventional HSOD methods, as a pivotal spectral representation within the framework.
We propose a novel pipeline, specifically designed for HSIs, to generate pseudo-labels, effectively mitigating the performance decline associated with point supervision strategy.
arXiv Detail & Related papers (2024-12-24T02:52:43Z) - Oriented Tiny Object Detection: A Dataset, Benchmark, and Dynamic Unbiased Learning [51.170479006249195]
We introduce a new dataset, benchmark, and a dynamic coarse-to-fine learning scheme in this study. Our proposed dataset, AI-TOD-R, features the smallest object sizes among all oriented object detection datasets. We present a benchmark spanning a broad range of detection paradigms, including both fully-supervised and label-efficient approaches.
arXiv Detail & Related papers (2024-12-16T09:14:32Z) - SIRST-5K: Exploring Massive Negatives Synthesis with Self-supervised Learning for Robust Infrared Small Target Detection [53.19618419772467]
Single-frame infrared small target (SIRST) detection aims to recognize small targets from clutter backgrounds.
With the development of Transformer, the scale of SIRST models is constantly increasing.
With a rich diversity of infrared small target data, our algorithm significantly improves the model performance and convergence speed.
arXiv Detail & Related papers (2024-03-08T16:14:54Z) - Innovative Horizons in Aerial Imagery: LSKNet Meets DiffusionDet for Advanced Object Detection [55.2480439325792]
We present an in-depth evaluation of an object detection model that integrates the LSKNet backbone with the DiffusionDet head.
The proposed model achieves a mean average precision (MAP) of approximately 45.7%, which is a significant improvement.
This advancement underscores the effectiveness of the proposed modifications and sets a new benchmark in aerial image analysis.
arXiv Detail & Related papers (2023-11-21T19:49:13Z) - SatDM: Synthesizing Realistic Satellite Image with Semantic Layout Conditioning using Diffusion Models [0.0]
Denoising Diffusion Probabilistic Models (DDPMs) have demonstrated significant promise in synthesizing realistic images from semantic layouts.
In this paper, a conditional DDPM model capable of taking a semantic map and generating high-quality, diverse, and correspondingly accurate satellite images is implemented.
The effectiveness of our proposed model is validated using a meticulously labeled dataset introduced within the context of this study.
arXiv Detail & Related papers (2023-09-28T19:39:13Z) - Hierarchical Disentanglement-Alignment Network for Robust SAR Vehicle Recognition [18.38295403066007]
HDANet integrates feature disentanglement and alignment into a unified framework.
The proposed method demonstrates impressive robustness across nine operating conditions in the MSTAR dataset.
arXiv Detail & Related papers (2023-04-07T09:11:29Z) - DDPM-CD: Denoising Diffusion Probabilistic Models as Feature Extractors for Change Detection [31.125812018296127]
We introduce a novel approach for change detection by pre-training a Denoising Diffusion Probabilistic Model (DDPM).
DDPM learns the training data distribution by gradually converting training images into a Gaussian distribution using a Markov chain.
During inference (i.e., sampling), they can generate a diverse set of samples closer to the training distribution.
Experiments conducted on the LEVIR-CD, WHU-CD, DSIFN-CD, and CDD datasets demonstrate that the proposed DDPM-CD method significantly outperforms the existing change detection methods in terms of F1 score, I
arXiv Detail & Related papers (2022-06-23T17:58:29Z) - Learning Selective Mutual Attention and Contrast for RGB-D Saliency Detection [145.4919781325014]
How to effectively fuse cross-modal information is the key problem for RGB-D salient object detection.
Many models use the feature fusion strategy but are limited by the low-order point-to-point fusion methods.
We propose a novel mutual attention model by fusing attention and contexts from different modalities.
arXiv Detail & Related papers (2020-10-12T08:50:10Z) - Stance Detection Benchmark: How Robust Is Your Stance Detection? [65.91772010586605]
Stance Detection (StD) aims to detect an author's stance towards a certain topic or claim.
We introduce a StD benchmark that learns from ten StD datasets of various domains in a multi-dataset learning setting.
Within this benchmark setup, we are able to present new state-of-the-art results on five of the datasets.
arXiv Detail & Related papers (2020-01-06T13:37:51Z)
This list is automatically generated from the titles and abstracts of the papers on this site.