ScaleFusionNet: Transformer-Guided Multi-Scale Feature Fusion for Skin Lesion Segmentation
- URL: http://arxiv.org/abs/2503.03327v3
- Date: Wed, 02 Jul 2025 14:47:33 GMT
- Title: ScaleFusionNet: Transformer-Guided Multi-Scale Feature Fusion for Skin Lesion Segmentation
- Authors: Saqib Qamar, Syed Furqan Qadri, Roobaea Alroobaea, Goram Mufarah M Alshmrani, Richard Jiang,
- Abstract summary: Melanoma is a malignant tumor that originates from skin cell lesions. Accurate and efficient segmentation of skin lesions is essential for quantitative analysis. We propose ScaleFusionNet to enhance feature extraction and fusion by capturing both local and global features.
- Score: 1.6361082730202214
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Melanoma is a malignant tumor that originates from skin cell lesions. Accurate and efficient segmentation of skin lesions is essential for quantitative analysis but remains a challenge due to blurred lesion boundaries, gradual color changes, and irregular shapes. To address this, we propose ScaleFusionNet, a hybrid model that integrates a Cross-Attention Transformer Module (CATM) and an Adaptive Fusion Block (AFB) to enhance feature extraction and fusion by capturing both local and global features. We introduce CATM, which utilizes Swin transformer blocks and Cross Attention Fusion (CAF) to adaptively refine feature fusion and reduce semantic gaps in the encoder-decoder to improve segmentation accuracy. Additionally, the AFB uses Swin Transformer-based attention and deformable convolution-based adaptive feature extraction to help the model gather local and global contextual information through parallel pathways. This enhancement refines the lesion boundaries and preserves fine-grained details. ScaleFusionNet achieves Dice scores of 92.94% and 91.80% on the ISIC-2016 and ISIC-2018 datasets, respectively, demonstrating its effectiveness in skin lesion analysis. Additionally, independent validation experiments were conducted on the PH$^2$ dataset using the pretrained model weights. The results show that ScaleFusionNet achieves significant performance improvements compared with other state-of-the-art methods. Our code implementation is publicly available on GitHub.
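The abstract describes CATM's Cross Attention Fusion, in which decoder features query encoder features to close the semantic gap between the two paths. As a rough illustration of that idea only (the function names, single-head setup, and residual-fusion choice are our assumptions, not the authors' implementation), a minimal NumPy sketch of cross-attention fusion over token-flattened feature maps might look like:

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention_fuse(encoder_feats, decoder_feats):
    """Single-head cross-attention fusion (illustrative sketch).

    encoder_feats: (N_enc, C) tokens acting as keys/values.
    decoder_feats: (N_dec, C) tokens acting as queries.
    Returns decoder tokens refined by attended encoder context.
    """
    d_k = encoder_feats.shape[-1]
    # Scaled dot-product attention: queries from decoder, keys from encoder.
    scores = decoder_feats @ encoder_feats.T / np.sqrt(d_k)  # (N_dec, N_enc)
    weights = softmax(scores, axis=-1)
    attended = weights @ encoder_feats                       # (N_dec, C)
    # Residual fusion: keep decoder features, add attended encoder context.
    return decoder_feats + attended
```

In a real model the queries, keys, and values would pass through learned projections (and here, Swin transformer blocks); this sketch only shows the attention-weighted mixing that lets decoder tokens pull in matching encoder context.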
Related papers
- CrossFusion: A Multi-Scale Cross-Attention Convolutional Fusion Model for Cancer Survival Prediction [1.8720735308601646]
Cancer survival prediction from whole slide images (WSIs) is a challenging task in computational pathology. We propose CrossFusion, a novel multi-scale feature integration framework. By effectively modeling both scale-specific patterns and their interactions, CrossFusion generates a rich feature set that enhances survival prediction accuracy.
arXiv Detail & Related papers (2025-03-03T21:34:52Z) - GLoG-CSUnet: Enhancing Vision Transformers with Adaptable Radiomic Features for Medical Image Segmentation [2.294915015129229]
Vision Transformers (ViTs) have shown promise in medical image semantic segmentation (MISS). We introduce the Gabor and Laplacian of Gaussian Convolutional Swin Network (GLoG-CSUnet), a novel architecture enhancing Transformer-based models by incorporating learnable radiomic features.
arXiv Detail & Related papers (2025-01-06T06:07:40Z) - FIAS: Feature Imbalance-Aware Medical Image Segmentation with Dynamic Fusion and Mixing Attention [11.385231493066312]
Hybrid architectures that combine convolutional neural networks (CNNs) and transformers demonstrate competitive ability in medical image segmentation.
We propose a Feature Imbalance-Aware (FIAS) network, which incorporates a dual-path encoder and a novel Mixing Attention (MixAtt) decoder.
arXiv Detail & Related papers (2024-11-16T20:30:44Z) - SkinMamba: A Precision Skin Lesion Segmentation Architecture with Cross-Scale Global State Modeling and Frequency Boundary Guidance [0.559239450391449]
Skin lesion segmentation is a crucial method for identifying early skin cancer.
We propose a hybrid architecture based on Mamba and CNN, called SkinMamba.
It maintains linear complexity while offering powerful long-range dependency modeling and local feature extraction capabilities.
arXiv Detail & Related papers (2024-09-17T05:02:38Z) - AFFSegNet: Adaptive Feature Fusion Segmentation Network for Microtumors and Multi-Organ Segmentation [31.97835089989928]
Medical image segmentation is a crucial task in computer vision, supporting clinicians in diagnosis, treatment planning, and disease monitoring. We propose the Adaptive Semantic Network (ASSNet), a transformer architecture that effectively integrates local and global features for precise medical image segmentation. Tests on diverse medical image segmentation tasks, including multi-organ, liver tumor, and bladder tumor segmentation, demonstrate that ASSNet achieves state-of-the-art results.
arXiv Detail & Related papers (2024-09-12T06:25:44Z) - Prototype Learning Guided Hybrid Network for Breast Tumor Segmentation in DCE-MRI [58.809276442508256]
We propose a hybrid network via the combination of convolution neural network (CNN) and transformer layers.
The experimental results on private and public DCE-MRI datasets demonstrate that the proposed hybrid network achieves superior performance to the state-of-the-art methods.
arXiv Detail & Related papers (2024-08-11T15:46:00Z) - Dual-scale Enhanced and Cross-generative Consistency Learning for Semi-supervised Medical Image Segmentation [49.57907601086494]
Medical image segmentation plays a crucial role in computer-aided diagnosis.
We propose a novel Dual-scale Enhanced and Cross-generative consistency learning framework for semi-supervised medical image segmentation (DEC-Seg).
arXiv Detail & Related papers (2023-12-26T12:56:31Z) - Inter-Scale Dependency Modeling for Skin Lesion Segmentation with Transformer-based Networks [0.0]
Melanoma is a dangerous form of skin cancer caused by the abnormal growth of skin cells.
FCN approaches, including the U-Net architecture, can automatically segment skin lesions to aid diagnosis.
The symmetrical U-Net model has shown outstanding results, but its use of a convolutional operation limits its ability to capture long-range dependencies.
arXiv Detail & Related papers (2023-10-20T16:20:25Z) - Skin Lesion Segmentation Improved by Transformer-based Networks with Inter-scale Dependency Modeling [0.0]
Melanoma is a dangerous type of skin cancer resulting from abnormal skin cell growth.
The symmetrical U-Net model's reliance on convolutional operations hinders its ability to capture long-range dependencies.
Several Transformer-based U-Net topologies have recently been created to overcome this limitation.
arXiv Detail & Related papers (2023-10-20T15:53:51Z) - Reliable Joint Segmentation of Retinal Edema Lesions in OCT Images [55.83984261827332]
In this paper, we propose a novel reliable multi-scale wavelet-enhanced transformer network.
We develop a novel segmentation backbone that integrates a wavelet-enhanced feature extractor network and a multi-scale transformer module.
Our proposed method achieves better segmentation accuracy with a high degree of reliability as compared to other state-of-the-art segmentation approaches.
arXiv Detail & Related papers (2022-12-01T07:32:56Z) - Cross-receptive Focused Inference Network for Lightweight Image Super-Resolution [64.25751738088015]
Transformer-based methods have shown impressive performance in single image super-resolution (SISR) tasks.
However, the need for Transformers to incorporate contextual information in order to extract features dynamically has been neglected.
We propose a lightweight Cross-receptive Focused Inference Network (CFIN) that consists of a cascade of CT Blocks mixed with CNN and Transformer.
arXiv Detail & Related papers (2022-07-06T16:32:29Z) - MISSU: 3D Medical Image Segmentation via Self-distilling TransUNet [55.16833099336073]
We propose to self-distill a Transformer-based UNet for medical image segmentation.
It simultaneously learns global semantic information and local spatial-detailed features.
Our MISSU achieves the best performance over previous state-of-the-art methods.
arXiv Detail & Related papers (2022-06-02T07:38:53Z) - nnFormer: Interleaved Transformer for Volumetric Segmentation [50.10441845967601]
We introduce nnFormer, a powerful segmentation model with an interleaved architecture based on empirical combination of self-attention and convolution.
nnFormer achieves tremendous improvements over previous transformer-based methods on two commonly used datasets Synapse and ACDC.
arXiv Detail & Related papers (2021-09-07T17:08:24Z) - Swin-Unet: Unet-like Pure Transformer for Medical Image Segmentation [63.46694853953092]
Swin-Unet is an Unet-like pure Transformer for medical image segmentation.
Tokenized image patches are fed into the Transformer-based U-shaped Encoder-Decoder architecture.
arXiv Detail & Related papers (2021-05-12T09:30:26Z) - TransUNet: Transformers Make Strong Encoders for Medical Image Segmentation [78.01570371790669]
Medical image segmentation is an essential prerequisite for developing healthcare systems.
On various medical image segmentation tasks, the u-shaped architecture, also known as U-Net, has become the de-facto standard.
We propose TransUNet, which merits both Transformers and U-Net, as a strong alternative for medical image segmentation.
arXiv Detail & Related papers (2021-02-08T16:10:50Z) - Lesion Net -- Skin Lesion Segmentation Using Coordinate Convolution and Deep Residual Units [18.908448254745473]
Accurately segmenting melanoma skin lesions is quite a challenging task due to limited training data, irregular shapes, unclear boundaries, and different skin colors.
Our proposed approach helps in improving the accuracy of skin lesion segmentation.
The results show that the proposed model either outperforms or is on par with the existing skin lesion segmentation methods.
arXiv Detail & Related papers (2020-12-28T14:43:04Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences.