FUTransUNet-GradCAM: A Hybrid Transformer-U-Net with Self-Attention and Explainable Visualizations for Foot Ulcer Segmentation
- URL: http://arxiv.org/abs/2508.03758v2
- Date: Tue, 12 Aug 2025 01:47:00 GMT
- Title: FUTransUNet-GradCAM: A Hybrid Transformer-U-Net with Self-Attention and Explainable Visualizations for Foot Ulcer Segmentation
- Authors: Akwasi Asare, Mary Sagoe, Justice Williams Asare,
- Abstract summary: Automated segmentation of diabetic foot ulcers (DFUs) plays a critical role in clinical diagnosis, therapeutic planning, and longitudinal wound monitoring. Traditional convolutional neural networks (CNNs) provide strong localization capabilities but struggle to model long-range spatial dependencies. We propose FUTransUNet, a hybrid architecture that integrates the global attention mechanism of Vision Transformers (ViTs) into the U-Net framework.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Automated segmentation of diabetic foot ulcers (DFUs) plays a critical role in clinical diagnosis, therapeutic planning, and longitudinal wound monitoring. However, this task remains challenging due to the heterogeneous appearance, irregular morphology, and complex backgrounds associated with ulcer regions in clinical photographs. Traditional convolutional neural networks (CNNs), such as U-Net, provide strong localization capabilities but struggle to model long-range spatial dependencies due to their inherently limited receptive fields. To address this, we propose FUTransUNet, a hybrid architecture that integrates the global attention mechanism of Vision Transformers (ViTs) into the U-Net framework. This combination allows the model to extract global contextual features while maintaining fine-grained spatial resolution through skip connections and an effective decoding pathway. We trained and validated FUTransUNet on the public Foot Ulcer Segmentation Challenge (FUSeg) dataset. FUTransUNet achieved a training Dice Coefficient of 0.8679, an IoU of 0.7672, and a training loss of 0.0053. On the validation set, the model achieved a Dice Coefficient of 0.8751, an IoU of 0.7780, and a validation loss of 0.009045. To ensure clinical transparency, we employed Grad-CAM visualizations, which highlighted model focus areas during prediction. These quantitative outcomes demonstrate that our hybrid approach successfully integrates global and local feature extraction paradigms, thereby offering a robust, accurate, explainable, interpretable, and clinically translatable solution for automated foot ulcer analysis. The approach offers a reliable, high-fidelity solution for DFU segmentation, with implications for improving real-world wound assessment and patient care.
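The Dice Coefficient and IoU figures reported above are standard overlap metrics for binary segmentation masks. The sketch below is a minimal, self-contained illustration of how they are commonly computed; it is not the authors' evaluation code, and the function names are our own.

```python
import numpy as np

def dice_coefficient(pred: np.ndarray, target: np.ndarray, eps: float = 1e-7) -> float:
    """Dice = 2|A ∩ B| / (|A| + |B|) for binary masks; eps guards empty masks."""
    pred, target = pred.astype(bool), target.astype(bool)
    intersection = np.logical_and(pred, target).sum()
    return (2.0 * intersection + eps) / (pred.sum() + target.sum() + eps)

def iou(pred: np.ndarray, target: np.ndarray, eps: float = 1e-7) -> float:
    """IoU (Jaccard) = |A ∩ B| / |A ∪ B| for binary masks."""
    pred, target = pred.astype(bool), target.astype(bool)
    intersection = np.logical_and(pred, target).sum()
    union = np.logical_or(pred, target).sum()
    return (intersection + eps) / (union + eps)
```

Note that Dice is always at least as large as IoU for the same prediction (Dice = 2·IoU / (1 + IoU)), which is consistent with the paper reporting a higher Dice (0.8751) than IoU (0.7780) on validation.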
Related papers
- A Unified Framework for Joint Detection of Lacunes and Enlarged Perivascular Spaces [3.9313804276175506]
Cerebral small vessel disease (CSVD) markers, specifically enlarged perivascular spaces (EPVS) and lacunes, present a unique challenge in medical image analysis. We propose a morphology-decoupled framework where Zero-Gated Cross-Task Attention exploits dense EPVS context to guide sparse lacune detection.
arXiv Detail & Related papers (2026-03-04T16:30:46Z) - Detection-Gated Glottal Segmentation with Zero-Shot Cross-Dataset Transfer and Clinical Feature Extraction [0.0]
We propose a detection-gated pipeline that integrates a YOLOv8-based detector with a U-Net segmenter. The model was trained on a limited subset of the GIRAFE dataset (600 frames) and evaluated via zero-shot transfer on the large-scale BAGLS dataset.
arXiv Detail & Related papers (2026-03-02T17:05:41Z) - Liver Fibrosis Quantification and Analysis: The LiQA Dataset and Baseline Method [31.756744402295542]
The LiQA dataset is curated to benchmark algorithms for Liver Segmentation (LiSeg) and Liver Fibrosis Staging (LiFS) under complex real-world conditions. We describe the challenge's top-performing methodology, which integrates a semi-supervised learning framework with external data for robust segmentation.
arXiv Detail & Related papers (2025-12-08T15:44:24Z) - Enhanced SegNet with Integrated Grad-CAM for Interpretable Retinal Layer Segmentation in OCT Images [0.0]
This study proposes an improved SegNet-based deep learning framework for automated and interpretable retinal layer segmentation. Architectural innovations, including modified pooling strategies, enhance feature extraction from noisy OCT images. Grad-CAM visualizations highlighted anatomically relevant regions, aligning segmentation with clinical biomarkers.
arXiv Detail & Related papers (2025-09-09T14:31:51Z) - A Novel Attention-Augmented Wavelet YOLO System for Real-time Brain Vessel Segmentation on Transcranial Color-coded Doppler [49.03919553747297]
We propose an AI-powered, real-time CoW auto-segmentation system capable of efficiently capturing cerebral arteries. No prior studies have explored AI-driven cerebrovascular segmentation using Transcranial Color-coded Doppler (TCCD). The proposed AAW-YOLO demonstrated strong performance in segmenting both ipsilateral and contralateral CoW vessels.
arXiv Detail & Related papers (2025-08-19T14:41:22Z) - HANS-Net: Hyperbolic Convolution and Adaptive Temporal Attention for Accurate and Generalizable Liver and Tumor Segmentation in CT Imaging [1.3149714289117207]
Accurate liver and tumor segmentation on abdominal CT images is critical for reliable diagnosis and treatment planning. We introduce the Hyperbolic-convolutions Adaptive-temporal-attention with Neural-representation and Synaptic-plasticity Network (HANS-Net). HANS-Net combines hyperbolic convolutions for hierarchical geometric representation, a wavelet-inspired decomposition module for multi-scale texture learning, and an implicit neural representation branch.
arXiv Detail & Related papers (2025-07-15T13:56:37Z) - EAGLE: An Efficient Global Attention Lesion Segmentation Model for Hepatic Echinococcosis [31.698319244945793]
We propose a U-shaped network composed of a Progressive Visual State Space (PVSS) encoder and a Hybrid Visual State Space (HVSS) decoder. The network achieves state-of-the-art performance with a Dice Similarity Coefficient (DSC) of 89.76%, surpassing MSVM-UNet by 1.61%.
arXiv Detail & Related papers (2025-06-25T11:42:05Z) - TUMLS: Trustful Fully Unsupervised Multi-Level Segmentation for Whole Slide Images of Histology [41.94295877935867]
We present a trustful fully unsupervised multi-level segmentation methodology (TUMLS) for whole slide images (WSIs). TUMLS adopts an autoencoder (AE) as a feature extractor to identify the different tissue types within low-resolution training data. This solution integrates seamlessly into clinicians' workflows, transforming the examination of a whole WSI into a review of concise, interpretable cross-level insights.
arXiv Detail & Related papers (2025-04-17T07:48:05Z) - Advancing Chronic Tuberculosis Diagnostics Using Vision-Language Models: A Multi modal Framework for Precision Analysis [0.0]
This study proposes a Vision-Language Model (VLM) to enhance automated chronic tuberculosis (TB) screening. By integrating chest X-ray images with clinical data, the model addresses the challenges of manual interpretation. The model demonstrated high precision (94 percent) and recall (94 percent) for detecting key chronic TB pathologies.
arXiv Detail & Related papers (2025-03-17T13:49:29Z) - A Cascaded Dilated Convolution Approach for Mpox Lesion Classification [0.0]
Mpox virus presents significant diagnostic challenges due to its visual similarity to other skin lesion diseases. Deep learning-based approaches for skin lesion classification offer a promising alternative. This study introduces the Cascaded Atrous Group Attention framework to address these challenges.
arXiv Detail & Related papers (2024-12-13T12:47:30Z) - CoTCoNet: An Optimized Coupled Transformer-Convolutional Network with an Adaptive Graph Reconstruction for Leukemia Detection [0.3573481101204926]
We propose an optimized Coupled Transformer Convolutional Network (CoTCoNet) framework for the classification of leukemia.
Our framework captures comprehensive global features and scalable spatial patterns, enabling the identification of complex and large-scale hematological features.
It achieves remarkable accuracy and F1-Score rates of 0.9894 and 0.9893, respectively.
arXiv Detail & Related papers (2024-10-11T13:31:28Z) - Hybrid Deep Learning-Based for Enhanced Occlusion Segmentation in PICU Patient Monitoring [0.0]
We propose a hybrid approach to segment common occlusions encountered in remote monitoring applications within PICUs.
Our approach centers on creating a deep-learning pipeline for limited training data scenarios.
The proposed framework yields an overall classification performance with 92.5% accuracy, 93.8% recall, 90.3% precision, and 92.0% F1-score.
arXiv Detail & Related papers (2024-07-18T09:37:55Z) - Uncertainty-guided annotation enhances segmentation with the human-in-the-loop [5.669636524329784]
Uncertainty-Guided Annotation (UGA) introduces a human-in-the-loop approach, enabling AI to convey its uncertainties to clinicians.
UGA eases this interaction by quantifying uncertainty at the pixel level, thereby revealing the model's limitations.
To foster broader application and community contribution, we have made our code accessible.
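Pixel-level uncertainty of the kind UGA surfaces to clinicians is often quantified as the binary predictive entropy of a model's per-pixel foreground probabilities. The sketch below illustrates that general idea only; the `pixel_entropy` name is ours, and the UGA paper's actual estimator may differ.

```python
import numpy as np

def pixel_entropy(prob: np.ndarray, eps: float = 1e-12) -> np.ndarray:
    """Binary entropy (in bits) per pixel from foreground probabilities in [0, 1].

    Entropy peaks at 1 bit where prob == 0.5 (maximally uncertain) and
    approaches 0 where the model is confident either way.
    """
    p = np.clip(prob, eps, 1.0 - eps)  # avoid log(0)
    return -(p * np.log2(p) + (1.0 - p) * np.log2(1.0 - p))
```

A clinician-facing heatmap of this quantity highlights exactly the boundary pixels where the segmenter is least sure, which is where annotation effort is best spent.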
arXiv Detail & Related papers (2024-02-16T16:41:15Z) - Breast Ultrasound Tumor Classification Using a Hybrid Multitask CNN-Transformer Network [63.845552349914186]
Capturing global contextual information plays a critical role in breast ultrasound (BUS) image classification.
Vision Transformers have an improved capability of capturing global contextual information but may distort the local image patterns due to the tokenization operations.
In this study, we proposed a hybrid multitask deep neural network called Hybrid-MT-ESTAN, designed to perform BUS tumor classification and segmentation.
arXiv Detail & Related papers (2023-08-04T01:19:32Z) - Reliable Joint Segmentation of Retinal Edema Lesions in OCT Images [55.83984261827332]
In this paper, we propose a novel reliable multi-scale wavelet-enhanced transformer network.
We develop a novel segmentation backbone that integrates a wavelet-enhanced feature extractor network and a multi-scale transformer module.
Our proposed method achieves better segmentation accuracy with a high degree of reliability as compared to other state-of-the-art segmentation approaches.
arXiv Detail & Related papers (2022-12-01T07:32:56Z) - Boundary Guided Semantic Learning for Real-time COVID-19 Lung Infection Segmentation System [69.40329819373954]
The coronavirus disease 2019 (COVID-19) continues to have a negative impact on healthcare systems around the world.
At the current stage, automatically segmenting the lung infection area from CT images is essential for the diagnosis and treatment of COVID-19.
We propose a boundary guided semantic learning network (BSNet) in this paper.
arXiv Detail & Related papers (2022-09-07T05:01:38Z) - Fuzzy Attention Neural Network to Tackle Discontinuity in Airway Segmentation [67.19443246236048]
Airway segmentation is crucial for the examination, diagnosis, and prognosis of lung diseases.
Some small-sized airway branches (e.g., bronchus and terminal bronchioles) significantly aggravate the difficulty of automatic segmentation.
This paper presents an efficient method for airway segmentation, comprising a novel fuzzy attention neural network and a comprehensive loss function.
arXiv Detail & Related papers (2022-09-05T16:38:13Z) - An Uncertainty-Driven GCN Refinement Strategy for Organ Segmentation [53.425900196763756]
We propose a segmentation refinement method based on uncertainty analysis and graph convolutional networks.
We employ the uncertainty levels of the convolutional network in a particular input volume to formulate a semi-supervised graph learning problem.
We show that our method outperforms the state-of-the-art CRF refinement method by improving the dice score by 1% for the pancreas and 2% for spleen.
arXiv Detail & Related papers (2020-12-06T18:55:07Z) - Inf-Net: Automatic COVID-19 Lung Infection Segmentation from CT Images [152.34988415258988]
Automated detection of lung infections from computed tomography (CT) images offers a great potential to augment the traditional healthcare strategy for tackling COVID-19.
However, segmenting infected regions from CT slices faces several challenges, including high variation in infection characteristics and low intensity contrast between infections and normal tissues.
To address these challenges, a novel COVID-19 Deep Lung Infection Network (Inf-Net) is proposed to automatically identify infected regions from chest CT slices.
arXiv Detail & Related papers (2020-04-22T07:30:56Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences.