Related papers: Spatial-Frequency Dual Progressive Attention Network For Medical Image Segmentation

URL: http://arxiv.org/abs/2406.07952v2
Date: Mon, 19 Aug 2024 14:56:05 GMT
Title: Spatial-Frequency Dual Progressive Attention Network For Medical Image Segmentation
Authors: Zhenhuan Zhou, Along He, Yanlin Wu, Rui Yao, Xueshuo Xie, Tao Li
Abstract summary: In medical images, various types of lesions often manifest significant differences in their shape and texture. Accurate medical image segmentation demands deep learning models with robust capabilities in multi-scale and boundary feature learning. We introduce SF-UNet, a spatial-frequency dual-domain attention network. It comprises two main components: the Multi-scale Progressive Channel Attention (MPCA) block, which progressively extracts multi-scale features across adjacent encoder layers, and the lightweight Frequency-Spatial Attention (FSA) block, with only 0.05M parameters.
Score: 11.60636221012585
License: http://creativecommons.org/licenses/by-nc-sa/4.0/
Abstract: In medical images, various types of lesions often manifest significant differences in their shape and texture. Accurate medical image segmentation demands deep learning models with robust capabilities in multi-scale and boundary feature learning. However, previous networks still have limitations in addressing the above issues. Firstly, previous networks simultaneously fuse multi-level features or employ deep supervision to enhance multi-scale learning. However, this may lead to feature redundancy and excessive computational overhead, which is not conducive to network training and clinical deployment. Secondly, the majority of medical image segmentation networks exclusively learn features in the spatial domain, disregarding the abundant global information in the frequency domain. This results in a bias towards low-frequency components, neglecting crucial high-frequency information. To address these problems, we introduce SF-UNet, a spatial-frequency dual-domain attention network. It comprises two main components: the Multi-scale Progressive Channel Attention (MPCA) block, which progressively extracts multi-scale features across adjacent encoder layers, and the lightweight Frequency-Spatial Attention (FSA) block, with only 0.05M parameters, enabling concurrent learning of texture and boundary features from both spatial and frequency domains. We validate the effectiveness of the proposed SF-UNet on three public datasets. Experimental results show that, compared to previous state-of-the-art (SOTA) medical image segmentation networks, SF-UNet achieves the best performance, with up to 9.4% and 10.78% improvement in DSC and IOU. Code will be released at https://github.com/nkicsl/SF-UNet.
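
The abstract reports its gains in DSC (Dice similarity coefficient) and IOU (intersection over union). As a rough illustration of what these two overlap metrics measure, here is a minimal sketch. It is not the authors' evaluation code. It assumes binary masks given as nested lists of 0/1 values, and it assumes the convention that two empty masks score 1.0. The function names `overlap_counts`, `dice_score`, and `iou_score` are hypothetical.

```python
from typing import Sequence, Tuple

Mask = Sequence[Sequence[int]]  # assumed layout: rows of 0/1 pixel labels


def overlap_counts(pred: Mask, target: Mask) -> Tuple[int, int, int]:
    """Return (intersection, predicted area, target area) for two binary masks."""
    if len(pred) != len(target) or any(
        len(p_row) != len(t_row) for p_row, t_row in zip(pred, target)
    ):
        raise ValueError("masks must have the same shape")
    inter = pred_area = target_area = 0
    for p_row, t_row in zip(pred, target):
        for p, t in zip(p_row, t_row):
            p_on, t_on = bool(p), bool(t)
            inter += int(p_on and t_on)
            pred_area += int(p_on)
            target_area += int(t_on)
    return inter, pred_area, target_area


def dice_score(pred: Mask, target: Mask) -> float:
    """DSC = 2|A ∩ B| / (|A| + |B|)."""
    inter, pred_area, target_area = overlap_counts(pred, target)
    total = pred_area + target_area
    # Assumed convention: two empty masks count as a perfect match.
    return 1.0 if total == 0 else 2.0 * inter / total


def iou_score(pred: Mask, target: Mask) -> float:
    """IoU = |A ∩ B| / |A ∪ B|."""
    inter, pred_area, target_area = overlap_counts(pred, target)
    union = pred_area + target_area - inter
    # Same assumed convention for the empty case.
    return 1.0 if union == 0 else inter / union


if __name__ == "__main__":
    pred = [[1, 1, 0], [0, 1, 0]]
    target = [[1, 0, 0], [0, 1, 1]]
    print(f"DSC = {dice_score(pred, target):.4f}")
    print(f"IoU = {iou_score(pred, target):.4f}")
```

For a single pair of masks the two metrics are linked by DSC = 2·IoU / (1 + IoU), so DSC is never lower than IoU. This may help when reading the two improvement figures side by side.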

Related papers

Wavelet-Guided Dual-Frequency Encoding for Remote Sensing Change Detection [67.84730634802204]
Change detection in remote sensing imagery plays a vital role in various engineering applications, such as natural disaster monitoring, urban expansion tracking, and infrastructure management. Most existing methods still rely on spatial-domain modeling, where the limited diversity of feature representations hinders the detection of subtle change regions. We observe that frequency-domain feature modeling, particularly in the wavelet domain, amplifies fine-grained differences in frequency components, enhancing the perception of edge changes that are challenging to capture in the spatial domain.
arXiv Detail & Related papers (2025-08-07T11:14:16Z)
CENet: Context Enhancement Network for Medical Image Segmentation [3.4690322157094573]
We propose the Context Enhancement Network (CENet), a novel segmentation framework featuring two key innovations. First, the Dual Selective Enhancement Block (DSEB) integrated into skip connections enhances boundary details and improves the detection of smaller organs in a context-aware manner. Second, the Context Feature Attention Module (CFAM) in the decoder employs a multi-scale design to maintain spatial integrity, reduce feature redundancy, and mitigate overly enhanced representations.
arXiv Detail & Related papers (2025-05-23T23:22:18Z)
FreqU-FNet: Frequency-Aware U-Net for Imbalanced Medical Image Segmentation [0.0]
FreqU-FNet is a novel U-shaped segmentation architecture operating in the frequency domain. Our framework incorporates a Frequency that leverages Low-Pass Convolution and Daubechies wavelet-based downsampling. Experiments on multiple medical segmentation benchmarks demonstrate that FreqU-FNet consistently outperforms both CNN and Transformer baselines.
arXiv Detail & Related papers (2025-05-23T06:51:24Z)
FE-UNet: Frequency Domain Enhanced U-Net with Segment Anything Capability for Versatile Image Segmentation [50.9040167152168]
We experimentally quantify the contrast sensitivity function of CNNs and compare it with that of the human visual system. We propose the Wavelet-Guided Spectral Pooling Module (WSPM) to enhance and balance image features across the frequency domain. To further emulate the human visual system, we introduce the Frequency Domain Enhanced Receptive Field Block (FE-RFB). We develop FE-UNet, a model that utilizes SAM2 as its backbone and incorporates Hiera-Large as a pre-trained block.
arXiv Detail & Related papers (2025-02-06T07:24:34Z)
QTSeg: A Query Token-Based Dual-Mix Attention Framework with Multi-Level Feature Distribution for Medical Image Segmentation [13.359001333361272]
Medical image segmentation plays a crucial role in assisting healthcare professionals with accurate diagnoses and enabling automated diagnostic processes. Traditional convolutional neural networks (CNNs) often struggle with capturing long-range dependencies, while transformer-based architectures come with increased computational complexity. Recent efforts have focused on combining CNNs and transformers to balance performance and efficiency, but existing approaches still face challenges in achieving high segmentation accuracy while maintaining low computational costs. We propose QTSeg, a novel architecture for medical image segmentation that effectively integrates local and global information.
arXiv Detail & Related papers (2024-12-23T03:22:44Z)
LMBF-Net: A Lightweight Multipath Bidirectional Focal Attention Network for Multifeatures Segmentation [15.091476025563528]
Retinal diseases can cause irreversible vision loss in both eyes if not diagnosed and treated early. Current deep learning techniques for segmenting retinal images with many labels and attributes have poor detection accuracy and generalisability. This paper presents a multipath convolutional neural network for multifeature segmentation.
arXiv Detail & Related papers (2024-07-03T07:37:09Z)
ASPS: Augmented Segment Anything Model for Polyp Segmentation [77.25557224490075]
The Segment Anything Model (SAM) has introduced unprecedented potential for polyp segmentation. SAM's Transformer-based structure prioritizes global and low-frequency information. CFA integrates a trainable CNN encoder branch with a frozen ViT encoder, enabling the integration of domain-specific knowledge.
arXiv Detail & Related papers (2024-06-30T14:55:32Z)
Unlocking Fine-Grained Details with Wavelet-based High-Frequency Enhancement in Transformers [4.208461204572879]
Medical image segmentation is a critical task that plays a vital role in diagnosis, treatment planning, and disease monitoring. We address the local feature deficiency of the Transformer model by carefully re-designing the self-attention map. We propose a multi-scale context enhancement block within skip connections to adaptively model inter-scale dependencies.
arXiv Detail & Related papers (2023-08-25T15:42:19Z)
Mutual Information-driven Triple Interaction Network for Efficient Image Dehazing [54.168567276280505]
We propose a novel Mutual Information-driven Triple interaction Network (MITNet) for image dehazing. The first stage, named amplitude-guided haze removal, aims to recover the amplitude spectrum of the hazy images for haze removal. The second stage, named phase-guided structure refined, is devoted to learning the transformation and refinement of the phase spectrum.
arXiv Detail & Related papers (2023-08-14T08:23:58Z)
M$^{2}$SNet: Multi-scale in Multi-scale Subtraction Network for Medical Image Segmentation [73.10707675345253]
We propose a general multi-scale in multi-scale subtraction network (M$^{2}$SNet) to perform diverse segmentation tasks on medical images. Our method performs favorably against most state-of-the-art methods under different evaluation metrics on eleven datasets of four different medical image segmentation tasks.
arXiv Detail & Related papers (2023-03-20T06:26:49Z)
SF2Former: Amyotrophic Lateral Sclerosis Identification From Multi-center MRI Data Using Spatial and Frequency Fusion Transformer [3.408266725482757]
Amyotrophic Lateral Sclerosis (ALS) is a complex neurodegenerative disorder involving motor neuron degeneration. Deep learning has turned into a prominent class of machine learning programs in computer vision. This study introduces a framework named SF2Former that leverages vision transformer architecture's power to distinguish the ALS subjects from the control group.
arXiv Detail & Related papers (2023-02-21T18:16:20Z)
HistoSeg : Quick attention with multi-loss function for multi-structure segmentation in digital histology images [0.696194614504832]
Medical image segmentation assists in computer-aided diagnosis, surgeries, and treatment. We proposed a generalization-Decoder Network, a Quick Attention Module, and a Multi Loss Function. We evaluate the capability of our proposed network on two publicly available datasets for medical image segmentation, MoNuSeg and GlaS.
arXiv Detail & Related papers (2022-09-01T21:10:00Z)
Preservation of High Frequency Content for Deep Learning-Based Medical Image Classification [74.84221280249876]
An efficient analysis of large amounts of chest radiographs can aid physicians and radiologists. We propose a novel Discrete Wavelet Transform (DWT)-based method for the efficient identification and encoding of visual information.
arXiv Detail & Related papers (2022-05-08T15:29:54Z)
Deep Frequency Filtering for Domain Generalization [55.66498461438285]
Deep Neural Networks (DNNs) have preferences for some frequency components in the learning process. We propose Deep Frequency Filtering (DFF) for learning domain-generalizable features. We show that applying our proposed DFF on a plain baseline outperforms the state-of-the-art methods on different domain generalization tasks.
arXiv Detail & Related papers (2022-03-23T05:19:06Z)
Frequency-aware Discriminative Feature Learning Supervised by Single-Center Loss for Face Forgery Detection [89.43987367139724]
Face forgery detection is raising ever-increasing interest in computer vision. Recent works have reached sound achievements, but there are still unignorable problems. A novel frequency-aware discriminative feature learning framework is proposed in this paper.
arXiv Detail & Related papers (2021-03-16T14:17:17Z)
Boundary-aware Context Neural Network for Medical Image Segmentation [15.585851505721433]
Medical image segmentation can provide a reliable basis for further clinical analysis and disease diagnosis. Most existing CNN-based methods produce unsatisfactory segmentation masks without accurate object boundaries. In this paper, we formulate a boundary-aware context neural network (BA-Net) for 2D medical image segmentation.
arXiv Detail & Related papers (2020-05-03T02:35:49Z)

This list is automatically generated from the titles and abstracts of the papers on this site.