Related papers: SFA-UNet: More Attention to Multi-Scale Contrast and Contextual Information in Infrared Small Object Segmentation

URL: http://arxiv.org/abs/2410.22881v2
Date: Sat, 16 Nov 2024 14:10:33 GMT
Title: SFA-UNet: More Attention to Multi-Scale Contrast and Contextual Information in Infrared Small Object Segmentation
Authors: Imad Ali Shah, Fahad Mumtaz Malik, Muhammad Waqas Ashraf
Abstract summary: Infrared Small Object Segmentation (ISOS) remains a major focus due to several challenges. We propose a modified U-Net architecture, named SFA-UNet, that combines Scharr Convolution (SC) and Fast Fourier Convolution (FFC) with vertical and horizontal Attention Gates (AGs) in UNet. SC helps to learn foreground-to-background contrast information, whereas FFC provides multi-scale contextual information while mitigating the small-object vanishing problem.
Score: 0.0
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Computer vision researchers have worked extensively on fundamental infrared visual recognition for the past few decades. Among various approaches, deep learning has emerged as the most promising candidate. However, Infrared Small Object Segmentation (ISOS) remains a major focus due to several challenges, including: 1) the lack of effective utilization of local contrast and global contextual information; 2) the potential loss of small objects in deep models; and 3) the struggle to capture fine-grained details while ignoring noise. To address these challenges, we propose a modified U-Net architecture, named SFA-UNet, that combines Scharr Convolution (SC) and Fast Fourier Convolution (FFC) with vertical and horizontal Attention Gates (AGs) in UNet. SFA-UNet uses double convolution layers with the addition of SC and FFC in its encoder and decoder layers. SC helps to learn foreground-to-background contrast information, whereas FFC provides multi-scale contextual information while mitigating the small-object vanishing problem. Additionally, the introduction of vertical AGs in the encoder layers enhances the model's focus on the targeted object by ignoring irrelevant regions. We evaluated the proposed approach on the publicly available SIRST and IRSTD datasets and achieved superior performance, by an average of 0.75% with a variance of 0.025 across all combined metrics in multiple runs, compared to existing state-of-the-art methods.
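Illustration (not from the paper): the abstract credits Scharr Convolution with capturing foreground-to-background contrast. The sketch below applies the standard fixed 3x3 Scharr kernels to a synthetic frame with NumPy, only to show why such a filter responds to a small bright target and stays silent on a flat background. How SFA-UNet embeds SC inside its layers is not described here, so the helper names (`filter2d_valid`, `scharr_magnitude`) and the toy frame are assumptions made for illustration.

```python
import numpy as np

# Standard 3x3 Scharr kernels for horizontal and vertical derivatives.
SCHARR_X = np.array([[-3.0, 0.0, 3.0],
                     [-10.0, 0.0, 10.0],
                     [-3.0, 0.0, 3.0]])
SCHARR_Y = SCHARR_X.T


def filter2d_valid(image, kernel):
    """Cross-correlate a 2-D image with a small kernel, without padding."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    # Accumulate one shifted, weighted copy of the image per kernel tap.
    for i in range(kh):
        for j in range(kw):
            out += kernel[i, j] * image[i:i + out_h, j:j + out_w]
    return out


def scharr_magnitude(image):
    """Gradient magnitude from the two Scharr responses."""
    gx = filter2d_valid(image, SCHARR_X)
    gy = filter2d_valid(image, SCHARR_Y)
    return np.hypot(gx, gy)


# Hypothetical toy frame: flat background with a single bright pixel.
frame = np.full((7, 7), 0.1)
frame[3, 3] = 1.0
response = scharr_magnitude(frame)
print(np.round(response, 2))
```

Because each Scharr kernel sums to zero, the uniform background produces no response, and the output is nonzero only in the ring of positions whose 3x3 window touches the bright pixel.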

Related papers

DISTA-Net: Dynamic Closely-Spaced Infrared Small Target Unmixing [55.366556355538954]
We propose the Dynamic Iterative Shrinkage Thresholding Network (DISTA-Net), which reconceptualizes traditional sparse reconstruction within a dynamic framework. DISTA-Net is the first deep learning model designed specifically for the unmixing of closely-spaced infrared small targets. We have established the first open-source ecosystem to foster further research in this field.
arXiv Detail & Related papers (2025-05-25T13:52:00Z)
ARFC-WAHNet: Adaptive Receptive Field Convolution and Wavelet-Attentive Hierarchical Network for Infrared Small Target Detection [2.643590634429843]
ARFC-WAHNet is an adaptive receptive field convolution and wavelet-attentive hierarchical network for infrared small target detection. ARFC-WAHNet outperforms recent state-of-the-art methods in both detection accuracy and robustness.
arXiv Detail & Related papers (2025-05-15T09:44:23Z)
Infrared and Visible Image Fusion: From Data Compatibility to Task Adaption [65.06388526722186]
Infrared-visible image fusion is a critical task in computer vision. There is a lack of recent comprehensive surveys that address this rapidly expanding domain. We introduce a multi-dimensional framework to elucidate common learning-based IVIF methods.
arXiv Detail & Related papers (2025-01-18T13:17:34Z)
Efficient Prompt Tuning of Large Vision-Language Model for Fine-Grained Ship Classification [62.425462136772666]
Fine-grained ship classification in remote sensing (RS-FGSC) poses a significant challenge due to the high similarity between classes and the limited availability of labeled data. Recent advancements in large pre-trained Vision-Language Models (VLMs) have demonstrated impressive capabilities in few-shot or zero-shot learning. This study delves into harnessing the potential of VLMs to enhance classification accuracy for unseen ship categories.
arXiv Detail & Related papers (2024-03-13T05:48:58Z)
SIRST-5K: Exploring Massive Negatives Synthesis with Self-supervised Learning for Robust Infrared Small Target Detection [53.19618419772467]
Single-frame infrared small target (SIRST) detection aims to recognize small targets from clutter backgrounds. With the development of Transformer, the scale of SIRST models is constantly increasing. With a rich diversity of infrared small target data, our algorithm significantly improves the model performance and convergence speed.
arXiv Detail & Related papers (2024-03-08T16:14:54Z)
SCTransNet: Spatial-channel Cross Transformer Network for Infrared Small Target Detection [46.049401912285134]
Infrared small target detection (IRSTD) has recently benefitted greatly from U-shaped neural models. Existing techniques struggle when the target has high similarities with the background. We present a Spatial-channel Cross Transformer Network (SCTransNet) that leverages spatial-channel cross transformer blocks.
arXiv Detail & Related papers (2024-01-28T06:41:15Z)
ILNet: Low-level Matters for Salient Infrared Small Target Detection [5.248337726304453]
Infrared small target detection is a technique for finding small targets from infrared clutter background. Due to the dearth of high-level semantic information, small infrared target features are weakened in the deep layers of the CNN. We propose an infrared low-level network (ILNet) that considers infrared small targets as salient areas with little semantic information.
arXiv Detail & Related papers (2023-09-24T14:09:37Z)
ABC: Attention with Bilinear Correlation for Infrared Small Target Detection [4.7379300868029395]
CNN-based deep learning methods are not effective at segmenting infrared small targets (IRST). We propose a new model called attention with bilinear correlation (ABC). ABC is based on the transformer architecture and includes a convolution linear fusion transformer (CLFT) module with a novel attention mechanism for feature extraction and fusion.
arXiv Detail & Related papers (2023-03-18T03:47:06Z)
Local Contrast and Global Contextual Information Make Infrared Small Object Salient Again [5.324958606516871]
Infrared small object detection (ISOS) aims to segment small objects covering only several pixels from cluttered backgrounds in infrared images. It is highly challenging because: 1) small objects lack sufficient intensity, shape and texture information; 2) small objects are easily lost when detection models, such as deep neural networks, obtain high-level semantic features and image-level receptive fields through successive downsampling. This paper proposes a reliable detection model for ISOS, dubbed UCFNet, which handles both issues well. Experiments on several public datasets demonstrate that our method significantly outperforms the state-of-the
arXiv Detail & Related papers (2023-01-28T05:18:13Z)
AF$_2$: Adaptive Focus Framework for Aerial Imagery Segmentation [86.44683367028914]
Aerial imagery segmentation has some unique challenges, the most critical of which is foreground-background imbalance. We propose the Adaptive Focus Framework (AF$_2$), which adopts a hierarchical segmentation procedure and focuses on adaptively utilizing multi-scale representations. AF$_2$ significantly improves accuracy on three widely used aerial benchmarks while running as fast as the mainstream method.
arXiv Detail & Related papers (2022-02-18T10:14:45Z)
FSCE: Few-Shot Object Detection via Contrastive Proposal Encoding [14.896822373116729]
We present Few-Shot object detection via Contrastive proposals (FSCE). FSCE is a simple yet effective approach to learning contrastive-aware object encodings that facilitate the classification of detected objects. Our design outperforms current state-of-the-art works in any shot and all data, with up to +8.8% on standard benchmark PASCAL VOC and +2.7% on challenging benchmark.
arXiv Detail & Related papers (2021-03-10T09:15:05Z)
Suppress and Balance: A Simple Gated Network for Salient Object Detection [89.88222217065858]
We propose a simple gated network (GateNet) to solve both issues at once. With the help of multilevel gate units, the valuable context information from the encoder can be optimally transmitted to the decoder. In addition, we adopt the atrous spatial pyramid pooling based on the proposed "Fold" operation (Fold-ASPP) to accurately localize salient objects of various scales.
arXiv Detail & Related papers (2020-07-16T02:00:53Z)
Searching Central Difference Convolutional Networks for Face Anti-Spoofing [68.77468465774267]
Face anti-spoofing (FAS) plays a vital role in face recognition systems. Most state-of-the-art FAS methods rely on stacked convolutions and expert-designed networks. Here we propose a novel frame-level FAS method based on Central Difference Convolution (CDC).
arXiv Detail & Related papers (2020-03-09T12:48:37Z)
Cross-layer Feature Pyramid Network for Salient Object Detection [102.20031050972429]
We propose a novel Cross-layer Feature Pyramid Network to improve the progressive fusion in salient object detection. The distributed features per layer own both semantics and salient details from all other layers simultaneously, and suffer reduced loss of important information.
arXiv Detail & Related papers (2020-02-25T14:06:27Z)

This list is automatically generated from the titles and abstracts of the papers on this site.