Infrared Small-Dim Target Detection with Transformer under Complex
Backgrounds
- URL: http://arxiv.org/abs/2109.14379v1
- Date: Wed, 29 Sep 2021 12:23:41 GMT
- Title: Infrared Small-Dim Target Detection with Transformer under Complex
Backgrounds
- Authors: Fangcen Liu, Chenqiang Gao, Fang Chen, Deyu Meng, Wangmeng Zuo, Xinbo
Gao
- Abstract summary: We propose a new infrared small-dim target detection method with the transformer.
We adopt the self-attention mechanism of the transformer to learn the interaction information of image features in a larger range.
We also design a feature enhancement module to learn more features of small-dim targets.
- Score: 155.388487263872
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Infrared small-dim target detection is one of the key techniques in
infrared search and tracking systems. Since local regions that resemble
infrared small-dim targets are spread over the whole background, exploring the
interaction information among image features over large-range dependencies to
mine the difference between the target and the background is crucial for robust
detection. However, existing deep learning-based methods are limited by the
locality of convolutional neural networks, which impairs the ability to capture
large-range dependencies. To this end, we propose a new infrared small-dim
target detection method with the transformer. We adopt the self-attention
mechanism of the transformer to learn the interaction information of image
features in a larger range. Additionally, we design a feature enhancement
module to learn more features of small-dim targets. After that, we adopt a
decoder with the U-Net-like skip connection operation to get the detection
result. Extensive experiments on two public datasets show the obvious
superiority of the proposed method over state-of-the-art methods.
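The abstract's central idea, using self-attention so that every image region can interact with every other region regardless of distance, can be illustrated with a minimal sketch. This is not the authors' architecture: it is plain single-head scaled dot-product attention, and it assumes identity query/key/value projections (a real transformer learns these). The function name `self_attention` and the toy patch features are hypothetical, chosen only for illustration.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(tokens):
    """Single-head scaled dot-product self-attention.

    Assumption: queries, keys, and values are the tokens themselves
    (identity projections), to keep the sketch dependency-free.

    tokens: list of N feature vectors (one per image patch), each of length d.
    Returns (outputs, weights); weights[i][j] is how strongly patch i
    attends to patch j, independent of their spatial distance.
    """
    d = len(tokens[0])
    scale = 1.0 / math.sqrt(d)
    weights = []
    for q in tokens:
        # Similarity of this patch to every patch in the image.
        scores = [scale * sum(a * b for a, b in zip(q, k)) for k in tokens]
        weights.append(softmax(scores))
    # Each output is a weighted average of all patch features.
    outputs = [
        [sum(w * v[c] for w, v in zip(row, tokens)) for c in range(d)]
        for row in weights
    ]
    return outputs, weights


# Toy input: four patches in raster order. Patches 0 and 3 are far apart
# in the sequence but have identical features.
patches = [[1.0, 0.0], [0.0, 1.0], [0.0, 1.0], [1.0, 0.0]]
outputs, weights = self_attention(patches)
```

In this toy input, patch 0 attends to the distant patch 3 as strongly as to itself, and more strongly than to its immediate neighbor, patch 1. A convolution with a small kernel could not relate patches 0 and 3 in a single layer, which is the locality limitation the abstract refers to.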
Related papers
- Multi-Scale Direction-Aware Network for Infrared Small Target Detection [2.661766509317245]
Infrared small target detection faces the problem that it is difficult to effectively separate the background and the target.
We propose a multi-scale direction-aware network (MSDA-Net) to integrate the high-frequency directional features of infrared small targets.
MSDA-Net achieves state-of-the-art (SOTA) results on the public NUDT-SIRST, SIRST and IRSTD-1k datasets.
arXiv Detail & Related papers (2024-06-04T07:23:09Z)
- SIRST-5K: Exploring Massive Negatives Synthesis with Self-supervised Learning for Robust Infrared Small Target Detection [53.19618419772467]
Single-frame infrared small target (SIRST) detection aims to recognize small targets from clutter backgrounds.
With the development of Transformer, the scale of SIRST models is constantly increasing.
With a rich diversity of infrared small target data, our algorithm significantly improves the model performance and convergence speed.
arXiv Detail & Related papers (2024-03-08T16:14:54Z)
- EFLNet: Enhancing Feature Learning for Infrared Small Target Detection [20.546186772828555]
Single-frame infrared small target detection is considered to be a challenging task.
Due to the extreme imbalance between target and background, bounding box regression is extremely sensitive to infrared small targets.
We propose an enhancing feature learning network (EFLNet) to address these problems.
arXiv Detail & Related papers (2023-07-27T09:23:22Z)
- ABC: Attention with Bilinear Correlation for Infrared Small Target Detection [4.7379300868029395]
CNN-based deep learning methods are not effective at segmenting infrared small targets (IRST).
We propose a new model called attention with bilinear correlation (ABC).
ABC is based on the transformer architecture and includes a convolution linear fusion transformer (CLFT) module with a novel attention mechanism for feature extraction and fusion.
arXiv Detail & Related papers (2023-03-18T03:47:06Z)
- Target-aware Dual Adversarial Learning and a Multi-scenario Multi-Modality Benchmark to Fuse Infrared and Visible for Object Detection [65.30079184700755]
This study addresses the issue of fusing infrared and visible images that appear differently for object detection.
Previous approaches discover commons underlying the two modalities and fuse upon the common space either by iterative optimization or deep networks.
This paper proposes a bilevel optimization formulation for the joint problem of fusion and detection, and then unrolls to a target-aware Dual Adversarial Learning (TarDAL) network for fusion and a commonly used detection network.
arXiv Detail & Related papers (2022-03-30T11:44:56Z)
- Local Motion and Contrast Priors Driven Deep Network for Infrared Small Target Super-Resolution [24.131639832686083]
Infrared small target super-resolution (SR) aims to recover a reliable, high-resolution image with high-contrast targets from its low-resolution counterpart.
We propose the first infrared small target SR method, named local motion and contrast prior deep network (MoCoPnet).
MoCoPnet integrates domain knowledge of infrared small targets into a deep network, which can mitigate the feature scarcity of small infrared targets.
arXiv Detail & Related papers (2022-01-04T07:20:46Z)
- Perception-aware Multi-sensor Fusion for 3D LiDAR Semantic Segmentation [59.42262859654698]
3D semantic segmentation is important in scene understanding for many applications, such as auto-driving and robotics.
Existing fusion-based methods may not achieve promising performance due to the vast difference between the two modalities.
In this work, we investigate a collaborative fusion scheme called perception-aware multi-sensor fusion (PMF) to exploit perceptual information from two modalities.
arXiv Detail & Related papers (2021-06-21T10:47:26Z)
- Anchor-free Small-scale Multispectral Pedestrian Detection [88.7497134369344]
We propose a method for effective and efficient multispectral fusion of the two modalities in an adapted single-stage anchor-free base architecture.
We aim at learning pedestrian representations based on object center and scale rather than direct bounding box predictions.
Results show our method's effectiveness in detecting small-scaled pedestrians.
arXiv Detail & Related papers (2020-08-19T13:13:01Z)
- Depthwise Non-local Module for Fast Salient Object Detection Using a Single Thread [136.2224792151324]
We propose a new deep learning algorithm for fast salient object detection.
The proposed algorithm achieves competitive accuracy and high inference efficiency simultaneously with a single CPU thread.
arXiv Detail & Related papers (2020-01-22T15:23:48Z)
- TBC-Net: A real-time detector for infrared small target detection using semantic constraint [18.24737906712967]
Deep learning is rarely used in infrared small target detection due to the difficulty in learning small target features.
We propose a novel lightweight convolutional neural network TBC-Net for infrared small target detection.
arXiv Detail & Related papers (2019-12-27T05:25:39Z)
This list is automatically generated from the titles and abstracts of the papers in this site.