FUTransUNet-GradCAM: A Hybrid Transformer-U-Net with Self-Attention and Explainable Visualizations for Foot Ulcer Segmentation
- URL: http://arxiv.org/abs/2508.03758v1
- Date: Mon, 04 Aug 2025 11:05:14 GMT
- Title: FUTransUNet-GradCAM: A Hybrid Transformer-U-Net with Self-Attention and Explainable Visualizations for Foot Ulcer Segmentation
- Authors: Akwasi Asare, Mary Sagoe, Justice Williams Asare
- Abstract summary: Automated segmentation of diabetic foot ulcers (DFUs) plays a critical role in clinical diagnosis, therapeutic planning, and longitudinal wound monitoring. Traditional convolutional neural networks (CNNs) provide strong localization capabilities but struggle to model long-range spatial dependencies. We propose FUTransUNet, a hybrid architecture that integrates the global attention mechanism of Vision Transformers (ViTs) into the U-Net framework.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Automated segmentation of diabetic foot ulcers (DFUs) plays a critical role in clinical diagnosis, therapeutic planning, and longitudinal wound monitoring. However, this task remains challenging due to the heterogeneous appearance, irregular morphology, and complex backgrounds associated with ulcer regions in clinical photographs. Traditional convolutional neural networks (CNNs), such as U-Net, provide strong localization capabilities but struggle to model long-range spatial dependencies due to their inherently limited receptive fields. To address this, we propose FUTransUNet, a hybrid architecture that integrates the global attention mechanism of Vision Transformers (ViTs) into the U-Net framework. This combination allows the model to extract global contextual features while maintaining fine-grained spatial resolution through skip connections and an effective decoding pathway. We trained and validated FUTransUNet on the public Foot Ulcer Segmentation Challenge (FUSeg) dataset. FUTransUNet achieved a training Dice Coefficient of 0.8679, an IoU of 0.7672, and a training loss of 0.0053. On the validation set, the model achieved a Dice Coefficient of 0.8751, an IoU of 0.7780, and a validation loss of 0.009045. To ensure clinical transparency, we employed Grad-CAM visualizations, which highlighted model focus areas during prediction. These quantitative outcomes demonstrate that our hybrid approach successfully integrates global and local feature extraction paradigms, thereby offering a robust, accurate, explainable, interpretable, and clinically translatable solution for automated foot ulcer analysis. The approach offers a reliable, high-fidelity solution for DFU segmentation, with implications for improving real-world wound assessment and patient care.
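To make the hybrid design concrete, the sketch below shows a minimal TransUNet-style network in PyTorch: a small CNN encoder produces multi-scale feature maps, a Transformer encoder applies global self-attention to the coarsest map, and a U-Net-style decoder recovers spatial detail through skip connections, with a soft-Dice helper matching the reported metric. This is an illustrative assumption only; the class name, channel widths, depth, and input size are placeholders and do not reproduce the authors' FUTransUNet implementation or their FUSeg training setup.

```python
# Minimal TransUNet-style sketch (illustrative assumption, not the authors' code):
# CNN encoder -> Transformer bottleneck (global self-attention) -> U-Net decoder with skips.
import torch
import torch.nn as nn


def conv_block(in_ch, out_ch):
    # Two 3x3 conv + BatchNorm + ReLU, as in a standard U-Net stage.
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, 3, padding=1), nn.BatchNorm2d(out_ch), nn.ReLU(inplace=True),
        nn.Conv2d(out_ch, out_ch, 3, padding=1), nn.BatchNorm2d(out_ch), nn.ReLU(inplace=True),
    )


class HybridTransUNet(nn.Module):
    def __init__(self, in_ch=3, num_classes=1, base=32, depth=4, heads=4):
        super().__init__()
        self.enc1 = conv_block(in_ch, base)
        self.enc2 = conv_block(base, base * 2)
        self.enc3 = conv_block(base * 2, base * 4)
        self.pool = nn.MaxPool2d(2)
        dim = base * 4
        # Global self-attention over the coarsest feature map (one token per location).
        layer = nn.TransformerEncoderLayer(d_model=dim, nhead=heads,
                                           dim_feedforward=dim * 2, batch_first=True)
        self.transformer = nn.TransformerEncoder(layer, num_layers=depth)
        # Decoder: upsample and fuse with encoder skips to restore fine detail.
        self.up2 = nn.ConvTranspose2d(dim, base * 2, 2, stride=2)
        self.dec2 = conv_block(base * 4, base * 2)
        self.up1 = nn.ConvTranspose2d(base * 2, base, 2, stride=2)
        self.dec1 = conv_block(base * 2, base)
        self.head = nn.Conv2d(base, num_classes, 1)

    def forward(self, x):
        s1 = self.enc1(x)                      # (B, base,  H,   W)
        s2 = self.enc2(self.pool(s1))          # (B, 2base, H/2, W/2)
        s3 = self.enc3(self.pool(s2))          # (B, 4base, H/4, W/4)
        b, c, h, w = s3.shape
        tokens = self.transformer(s3.flatten(2).transpose(1, 2))  # (B, h*w, C)
        z = tokens.transpose(1, 2).reshape(b, c, h, w)
        d2 = self.dec2(torch.cat([self.up2(z), s2], dim=1))
        d1 = self.dec1(torch.cat([self.up1(d2), s1], dim=1))
        return self.head(d1)                   # per-pixel ulcer logits


def dice_coefficient(logits, target, eps=1e-6):
    # Soft Dice between predicted probabilities and a binary ground-truth mask.
    prob = torch.sigmoid(logits)
    inter = (prob * target).sum()
    return (2 * inter + eps) / (prob.sum() + target.sum() + eps)


if __name__ == "__main__":
    model = HybridTransUNet()
    x = torch.randn(1, 3, 128, 128)            # placeholder size, not the FUSeg resolution
    mask = (torch.rand(1, 1, 128, 128) > 0.5).float()
    out = model(x)
    print(out.shape, dice_coefficient(out, mask).item())
```

A full implementation along the lines of TransUNet would typically use a pretrained ViT backbone rather than the small Transformer shown here, and would attach Grad-CAM hooks to a decoder layer to produce the explainability maps described in the abstract.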
Related papers
- HANS-Net: Hyperbolic Convolution and Adaptive Temporal Attention for Accurate and Generalizable Liver and Tumor Segmentation in CT Imaging [1.3149714289117207]
Accurate liver and tumor segmentation on abdominal CT images is critical for reliable diagnosis and treatment planning. We introduce the Hyperbolic-convolutions Adaptive-temporal-attention with Neural-representation and Synaptic-plasticity Network (HANS-Net). HANS-Net combines hyperbolic convolutions for hierarchical geometric representation, a wavelet-inspired decomposition module for multi-scale texture learning, and an implicit neural representation branch.
arXiv Detail & Related papers (2025-07-15T13:56:37Z)
- Advancing Chronic Tuberculosis Diagnostics Using Vision-Language Models: A Multi modal Framework for Precision Analysis [0.0]
This study proposes a Vision-Language Model (VLM) to enhance automated chronic tuberculosis (TB) screening. By integrating chest X-ray images with clinical data, the model addresses the challenges of manual interpretation. The model demonstrated high precision (94 percent) and recall (94 percent) for detecting key chronic TB pathologies.
arXiv Detail & Related papers (2025-03-17T13:49:29Z)
- A Cascaded Dilated Convolution Approach for Mpox Lesion Classification [0.0]
Mpox virus presents significant diagnostic challenges due to its visual similarity to other skin lesion diseases. Deep learning-based approaches for skin lesion classification offer a promising alternative. This study introduces the Cascaded Atrous Group Attention framework to address these challenges.
arXiv Detail & Related papers (2024-12-13T12:47:30Z)
- CoTCoNet: An Optimized Coupled Transformer-Convolutional Network with an Adaptive Graph Reconstruction for Leukemia Detection [0.3573481101204926]
We propose an optimized Coupled Transformer Convolutional Network (CoTCoNet) framework for the classification of leukemia.
Our framework captures comprehensive global features and scalable spatial patterns, enabling the identification of complex and large-scale hematological features.
It achieves remarkable accuracy and F1-Score rates of 0.9894 and 0.9893, respectively.
arXiv Detail & Related papers (2024-10-11T13:31:28Z)
- Hybrid Deep Learning-Based for Enhanced Occlusion Segmentation in PICU Patient Monitoring [0.0]
We propose a hybrid approach to segment common occlusions encountered in remote monitoring applications within PICUs.
Our approach centers on creating a deep-learning pipeline for limited training data scenarios.
The proposed framework yields an overall classification performance with 92.5% accuracy, 93.8% recall, 90.3% precision, and 92.0% F1-score.
arXiv Detail & Related papers (2024-07-18T09:37:55Z)
- Uncertainty-guided annotation enhances segmentation with the human-in-the-loop [5.669636524329784]
Uncertainty-Guided Annotation (UGA) introduces a human-in-the-loop approach, enabling AI to convey its uncertainties to clinicians.
UGA eases this interaction by quantifying uncertainty at the pixel level, thereby revealing the model's limitations.
To foster broader application and community contribution, we have made our code accessible.
arXiv Detail & Related papers (2024-02-16T16:41:15Z)
- Breast Ultrasound Tumor Classification Using a Hybrid Multitask CNN-Transformer Network [63.845552349914186]
Capturing global contextual information plays a critical role in breast ultrasound (BUS) image classification.
Vision Transformers have an improved capability of capturing global contextual information but may distort the local image patterns due to the tokenization operations.
In this study, we proposed a hybrid multitask deep neural network called Hybrid-MT-ESTAN, designed to perform BUS tumor classification and segmentation.
arXiv Detail & Related papers (2023-08-04T01:19:32Z)
- Reliable Joint Segmentation of Retinal Edema Lesions in OCT Images [55.83984261827332]
In this paper, we propose a novel reliable multi-scale wavelet-enhanced transformer network.
We develop a novel segmentation backbone that integrates a wavelet-enhanced feature extractor network and a multi-scale transformer module.
Our proposed method achieves better segmentation accuracy with a high degree of reliability as compared to other state-of-the-art segmentation approaches.
arXiv Detail & Related papers (2022-12-01T07:32:56Z)
- Boundary Guided Semantic Learning for Real-time COVID-19 Lung Infection Segmentation System [69.40329819373954]
The coronavirus disease 2019 (COVID-19) continues to have a negative impact on healthcare systems around the world.
At the current stage, automatically segmenting the lung infection area from CT images is essential for the diagnosis and treatment of COVID-19.
We propose a boundary guided semantic learning network (BSNet) in this paper.
arXiv Detail & Related papers (2022-09-07T05:01:38Z)
- Fuzzy Attention Neural Network to Tackle Discontinuity in Airway Segmentation [67.19443246236048]
Airway segmentation is crucial for the examination, diagnosis, and prognosis of lung diseases.
Some small-sized airway branches (e.g., bronchus and terminal bronchioles) significantly aggravate the difficulty of automatic segmentation.
This paper presents an efficient method for airway segmentation, comprising a novel fuzzy attention neural network and a comprehensive loss function.
arXiv Detail & Related papers (2022-09-05T16:38:13Z)
- An Uncertainty-Driven GCN Refinement Strategy for Organ Segmentation [53.425900196763756]
We propose a segmentation refinement method based on uncertainty analysis and graph convolutional networks.
We employ the uncertainty levels of the convolutional network in a particular input volume to formulate a semi-supervised graph learning problem.
We show that our method outperforms the state-of-the-art CRF refinement method by improving the dice score by 1% for the pancreas and 2% for spleen.
arXiv Detail & Related papers (2020-12-06T18:55:07Z)
- Inf-Net: Automatic COVID-19 Lung Infection Segmentation from CT Images [152.34988415258988]
Automated detection of lung infections from computed tomography (CT) images offers a great potential to augment the traditional healthcare strategy for tackling COVID-19.
However, segmenting infected regions from CT slices faces several challenges, including high variation in infection characteristics and low intensity contrast between infections and normal tissues.
To address these challenges, a novel COVID-19 Deep Lung Infection Network (Inf-Net) is proposed to automatically identify infected regions from chest CT slices.
arXiv Detail & Related papers (2020-04-22T07:30:56Z)