Learning Dynamic Local Context Representations for Infrared Small Target Detection
- URL: http://arxiv.org/abs/2412.17401v1
- Date: Mon, 23 Dec 2024 09:06:27 GMT
- Title: Learning Dynamic Local Context Representations for Infrared Small Target Detection
- Authors: Guoyi Zhang, Guangsheng Xu, Han Wang, Siyang Chen, Yunxiao Shan, Xiaohu Zhang,
- Abstract summary: Infrared small target detection (ISTD) is challenging due to complex backgrounds, low signal-to-clutter ratios, and varying target sizes and shapes.
We propose LCRNet, a novel method that learns dynamic local context representations for ISTD.
With only 1.65M parameters, LCRNet achieves state-of-the-art (SOTA) performance.
- Score: 5.897465234102489
- License:
- Abstract: Infrared small target detection (ISTD) is challenging due to complex backgrounds, low signal-to-clutter ratios, and varying target sizes and shapes. Effective detection relies on capturing local contextual information at the appropriate scale. However, small-kernel CNNs have limited receptive fields, leading to false alarms, while transformer models, with global receptive fields, often treat small targets as noise, resulting in miss-detections. Hybrid models struggle to bridge the semantic gap between CNNs and transformers, causing high complexity.To address these challenges, we propose LCRNet, a novel method that learns dynamic local context representations for ISTD. The model consists of three components: (1) C2FBlock, inspired by PDE solvers, for efficient small target information capture; (2) DLC-Attention, a large-kernel attention mechanism that dynamically builds context and reduces feature redundancy; and (3) HLKConv, a hierarchical convolution operator based on large-kernel decomposition that preserves sparsity and mitigates the drawbacks of dilated convolutions. Despite its simplicity, with only 1.65M parameters, LCRNet achieves state-of-the-art (SOTA) performance.Experiments on multiple datasets, comparing LCRNet with 33 SOTA methods, demonstrate its superior performance and efficiency.
Related papers
- LR-Net: A Lightweight and Robust Network for Infrared Small Target Detection [2.6617665093172445]
We propose an innovative lightweight and robust network (LR-Net)
LR-Net abandons the complex structure and achieves an effective balance between detection accuracy and resource consumption.
We achieve 3rd place in the "ICPR 2024 Resource-Limited Infrared Small Target Detection Challenge Track 2: Lightweight Infrared Small Target Detection"
arXiv Detail & Related papers (2024-08-05T18:57:33Z) - ELGC-Net: Efficient Local-Global Context Aggregation for Remote Sensing Change Detection [65.59969454655996]
We propose an efficient change detection framework, ELGC-Net, which leverages rich contextual information to precisely estimate change regions.
Our proposed ELGC-Net sets a new state-of-the-art performance in remote sensing change detection benchmarks.
We also introduce ELGC-Net-LW, a lighter variant with significantly reduced computational complexity, suitable for resource-constrained settings.
arXiv Detail & Related papers (2024-03-26T17:46:25Z) - SIRST-5K: Exploring Massive Negatives Synthesis with Self-supervised
Learning for Robust Infrared Small Target Detection [53.19618419772467]
Single-frame infrared small target (SIRST) detection aims to recognize small targets from clutter backgrounds.
With the development of Transformer, the scale of SIRST models is constantly increasing.
With a rich diversity of infrared small target data, our algorithm significantly improves the model performance and convergence speed.
arXiv Detail & Related papers (2024-03-08T16:14:54Z) - Small Object Detection via Coarse-to-fine Proposal Generation and
Imitation Learning [52.06176253457522]
We propose a two-stage framework tailored for small object detection based on the Coarse-to-fine pipeline and Feature Imitation learning.
CFINet achieves state-of-the-art performance on the large-scale small object detection benchmarks, SODA-D and SODA-A.
arXiv Detail & Related papers (2023-08-18T13:13:09Z) - Class Anchor Margin Loss for Content-Based Image Retrieval [97.81742911657497]
We propose a novel repeller-attractor loss that falls in the metric learning paradigm, yet directly optimize for the L2 metric without the need of generating pairs.
We evaluate the proposed objective in the context of few-shot and full-set training on the CBIR task, by using both convolutional and transformer architectures.
arXiv Detail & Related papers (2023-06-01T12:53:10Z) - Adaptive Sparse Convolutional Networks with Global Context Enhancement
for Faster Object Detection on Drone Images [26.51970603200391]
This paper investigates optimizing the detection head based on the sparse convolution.
It suffers from inadequate integration of contextual information of tiny objects.
We propose a novel global context-enhanced adaptive sparse convolutional network.
arXiv Detail & Related papers (2023-03-25T14:42:50Z) - ABC: Attention with Bilinear Correlation for Infrared Small Target
Detection [4.7379300868029395]
CNN based deep learning methods are not effective at segmenting infrared small target (IRST)
We propose a new model called attention with bilinear correlation (ABC)
ABC is based on the transformer architecture and includes a convolution linear fusion transformer (CLFT) module with a novel attention mechanism for feature extraction and fusion.
arXiv Detail & Related papers (2023-03-18T03:47:06Z) - a novel attention-based network for fast salient object detection [14.246237737452105]
In the current salient object detection network, the most popular method is using U-shape structure.
We propose a new deep convolution network architecture with three contributions.
Results demonstrate that the proposed method can compress the model to 1/3 of the original size nearly without losing the accuracy.
arXiv Detail & Related papers (2021-12-20T12:30:20Z) - Progressive Coordinate Transforms for Monocular 3D Object Detection [52.00071336733109]
We propose a novel and lightweight approach, dubbed em Progressive Coordinate Transforms (PCT) to facilitate learning coordinate representations.
In this paper, we propose a novel and lightweight approach, dubbed em Progressive Coordinate Transforms (PCT) to facilitate learning coordinate representations.
arXiv Detail & Related papers (2021-08-12T15:22:33Z) - Depthwise Non-local Module for Fast Salient Object Detection Using a
Single Thread [136.2224792151324]
We propose a new deep learning algorithm for fast salient object detection.
The proposed algorithm achieves competitive accuracy and high inference efficiency simultaneously with a single CPU thread.
arXiv Detail & Related papers (2020-01-22T15:23:48Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.