Related papers: Infrared Small Target Detection in Satellite Videos: A New Dataset and A Novel Recurrent Feature Refinement Framework

URL: http://arxiv.org/abs/2409.12448v3
Date: Thu, 20 Feb 2025 08:14:34 GMT
Title: Infrared Small Target Detection in Satellite Videos: A New Dataset and A Novel Recurrent Feature Refinement Framework
Authors: Xinyi Ying, Li Liu, Zaipin Lin, Yangsi Shi, Yingqian Wang, Ruojing Li, Xu Cao, Boyang Li, Shilin Zhou, Wei An
Abstract summary: IRSatVideo-LEO is a semi-simulated dataset with synthesized satellite motion, target appearance, trajectory and intensity. RFR is proposed to be equipped with existing powerful CNN-based methods for long-term temporal dependency exploitation.
Score: 28.777999462705516
License: http://creativecommons.org/licenses/by-nc-nd/4.0/
Abstract: Multi-frame infrared small target (MIRST) detection in satellite videos is a long-standing, fundamental yet challenging task for decades, and the challenges can be summarized as: First, extremely small target size, highly complex clutters & noises, various satellite motions result in limited feature representation, high false alarms, and difficult motion analyses. Second, the lack of large-scale public available MIRST dataset in satellite videos greatly hinders the algorithm development. To address the aforementioned challenges, in this paper, we first build a large-scale dataset for MIRST detection in satellite videos (namely IRSatVideo-LEO), and then develop a recurrent feature refinement (RFR) framework as the baseline method. Specifically, IRSatVideo-LEO is a semi-simulated dataset with synthesized satellite motion, target appearance, trajectory and intensity, which can provide a standard toolbox for satellite video generation and a reliable evaluation platform to facilitate the algorithm development. For baseline method, RFR is proposed to be equipped with existing powerful CNN-based methods for long-term temporal dependency exploitation and integrated motion compensation & MIRST detection. Specifically, a pyramid deformable alignment (PDA) module and a temporal-spatial-frequency modulation (TSFM) module are proposed to achieve effective and efficient feature alignment, propagation, aggregation and refinement. Extensive experiments have been conducted to demonstrate the effectiveness and superiority of our scheme. The comparative results show that ResUNet equipped with RFR outperforms the state-of-the-art MIRST detection methods. Dataset and code are released at https://github.com/XinyiYing/RFR.

Related papers

STAR: A Benchmark for Astronomical Star Fields Super-Resolution [51.79340280382437]
We propose STAR, a large-scale astronomical SR dataset containing 54,738 flux-consistent star field image pairs. We propose a Flux-Invariant Super Resolution (FISR) model that could accurately infer the flux-consistent high-resolution images from input photometry.
arXiv Detail & Related papers (2025-07-22T09:28:28Z)
SeqCSIST: Sequential Closely-Spaced Infrared Small Target Unmixing [7.09321729956876]
We propose a novel task, Sequential CSIST Unmixing, for detecting all targets in the form of sub-pixel localization from a highly dense CSIST group. We contribute an open-source ecosystem, including SeqCSIST, a sequential benchmark dataset, and a toolkit that provides objective evaluation metrics for this special task. Our method outperforms the state-of-the-art approaches with mean Average Precision (mAP) metric improved by 5.3%.
arXiv Detail & Related papers (2025-07-13T09:59:48Z)
Efficient SAR Vessel Detection for FPGA-Based On-Satellite Sensing [0.0]
We develop and deploy a new efficient and highly performant SAR vessel detection model, using a customised YOLOv8 architecture specifically optimized for FPGA-based processing within common satellite power constraints (10W). Our model has detection and classification performance only 2% and 3% lower than values from state-of-the-art GPU-based models, despite being two to three orders of magnitude smaller in size. This work demonstrates small yet highly performant ML models for time-critical SAR analysis, paving the way for more autonomous, responsive, and scalable Earth observation systems.
arXiv Detail & Related papers (2025-07-07T10:03:31Z)
Metadata, Wavelet, and Time Aware Diffusion Models for Satellite Image Super Resolution [4.307648859471193]
MWT-Diff is an innovative framework for satellite image super-resolution (SR). At the core of the framework is a novel metadata-, wavelet-, and time-aware encoder (MWT-Encoder). The embedded feature representations steer the hierarchical diffusion dynamics, through which the model progressively reconstructs high-resolution satellite imagery from low-resolution inputs.
arXiv Detail & Related papers (2025-06-30T07:19:50Z)
Probing Deep into Temporal Profile Makes the Infrared Small Target Detector Much Better [63.567886330598945]
Infrared small target (IRST) detection is challenging in simultaneously achieving precise, universal, robust and efficient performance. Current learning-based methods attempt to leverage "more" information from both the spatial and the short-term temporal domains. We propose an efficient deep temporal probe network (DeepPro) that only performs calculations in the time dimension for IRST detection.
arXiv Detail & Related papers (2025-06-15T08:19:32Z)
YOLO-MST: Multiscale deep learning method for infrared small target detection based on super-resolution and YOLO [0.18641315013048293]
This paper proposes a deep-learning infrared small target detection method that combines image super-resolution technology with multi-scale observation. The mAP@0.5 detection rates of this method on two public datasets, SIRST and IRIS, reached 96.4% and 99.5% respectively.
arXiv Detail & Related papers (2024-12-27T18:43:56Z)
Rapid Distributed Fine-tuning of a Segmentation Model Onboard Satellites [13.235981880457125]
This study presents a proof-of-concept using MobileSAM, a lightweight, pre-trained segmentation model, onboard Unibap iX10-100 satellite hardware. Our research investigates the potential of fine-tuning MobileSAM in a decentralised way onboard multiple satellites in rapid response to a disaster.
arXiv Detail & Related papers (2024-11-26T19:11:36Z)
Enhancing Maritime Situational Awareness through End-to-End Onboard Raw Data Analysis [4.441792803766689]
This research presents a framework addressing the strict bandwidth, energy, and latency constraints of small satellites. It investigates the application of deep learning techniques for direct ship detection and classification from raw satellite imagery. By simplifying the onboard processing chain, our approach facilitates direct analyses without requiring computationally intensive steps such as calibration and ortho-rectification.
arXiv Detail & Related papers (2024-11-05T18:38:42Z)
Single-Point Supervised High-Resolution Dynamic Network for Infrared Small Target Detection [7.0456782736205685]
We propose a single-point supervised high-resolution dynamic network (SSHD-Net). It achieves state-of-the-art (SOTA) detection performance using only single-point supervision. Experiments on the publicly available datasets NUDT-SIRST and IRSTD-1k demonstrate the effectiveness of our method.
arXiv Detail & Related papers (2024-08-04T09:44:47Z)
IRSAM: Advancing Segment Anything Model for Infrared Small Target Detection [55.554484379021524]
Directly applying the pretrained Segment Anything Model (SAM) to the Infrared Small Target Detection (IRSTD) task falls short of satisfying performance due to a notable domain gap between natural and infrared images. We propose the IRSAM model for IRSTD, which improves SAM's encoder-decoder architecture to learn better feature representation of infrared small objects.
arXiv Detail & Related papers (2024-07-10T10:17:57Z)
FlightScope: A Deep Comprehensive Review of Aircraft Detection Algorithms in Satellite Imagery [2.9687381456164004]
This paper critically evaluates and compares a suite of advanced object detection algorithms customized for the task of identifying aircraft within satellite imagery. This research encompasses an array of methodologies including YOLO versions 5 and 8, Faster RCNN, CenterNet, RetinaNet, RTMDet, and DETR, all trained from scratch. YOLOv5 emerges as a robust solution for aerial object detection, underlining its importance through superior mean average precision, Recall, and Intersection over Union scores.
arXiv Detail & Related papers (2024-04-03T17:24:27Z)
SARDet-100K: Towards Open-Source Benchmark and ToolKit for Large-Scale SAR Object Detection [79.23689506129733]
We establish a new benchmark dataset and an open-source method for large-scale SAR object detection. Our dataset, SARDet-100K, is a result of intense surveying, collecting, and standardizing 10 existing SAR detection datasets. To the best of our knowledge, SARDet-100K is the first COCO-level large-scale multi-class SAR object detection dataset ever created.
arXiv Detail & Related papers (2024-03-11T09:20:40Z)
SIRST-5K: Exploring Massive Negatives Synthesis with Self-supervised Learning for Robust Infrared Small Target Detection [53.19618419772467]
Single-frame infrared small target (SIRST) detection aims to recognize small targets from clutter backgrounds. With the development of Transformer, the scale of SIRST models is constantly increasing. With a rich diversity of infrared small target data, our algorithm significantly improves the model performance and convergence speed.
arXiv Detail & Related papers (2024-03-08T16:14:54Z)
Diffusion Models for Interferometric Satellite Aperture Radar [73.01013149014865]
Probabilistic Diffusion Models (PDMs) have recently emerged as a very promising class of generative models. Here, we leverage PDMs to generate several radar-based satellite image datasets. We show that PDMs succeed in generating images with complex and realistic structures, but that sampling time remains an issue.
arXiv Detail & Related papers (2023-08-31T16:26:17Z)
STIP: A SpatioTemporal Information-Preserving and Perception-Augmented Model for High-Resolution Video Prediction [78.129039340528]
We propose a SpatioTemporal Information-Preserving and Perception-Augmented Model (STIP) to solve the above two problems. The proposed model aims to preserve the spatiotemporal information for videos during the feature extraction and the state transitions. Experimental results show that the proposed STIP can predict videos with more satisfactory visual quality compared with a variety of state-of-the-art methods.
arXiv Detail & Related papers (2022-06-09T09:49:04Z)
LiDARCap: Long-range Marker-less 3D Human Motion Capture with LiDAR Point Clouds [58.402752909624716]
Existing motion capture datasets are largely short-range and cannot yet fit the need of long-range applications. We propose LiDARHuman26M, a new human motion capture dataset captured by LiDAR at a much longer range to overcome this limitation. Our dataset also includes the ground truth human motions acquired by the IMU system and the synchronous RGB images.
arXiv Detail & Related papers (2022-03-28T12:52:45Z)
Wireless Sensing With Deep Spectrogram Network and Primitive Based Autoregressive Hybrid Channel Model [20.670058030653458]
Human motion recognition (HMR) based on wireless sensing is a low-cost technique for scene understanding. Current HMR systems adopt support vector machines (SVMs) and convolutional neural networks (CNNs) to classify radar signals. This paper proposes a deep spectrogram network (DSN) by leveraging the residual mapping technique to enhance the HMR performance.
arXiv Detail & Related papers (2021-04-21T06:33:01Z)

This list is automatically generated from the titles and abstracts of the papers on this site.