Related papers: ConMatFormer: A Multi-attention and Transformer Integrated ConvNext based Deep Learning Model for Enhanced Diabetic Foot Ulcer Classification

ConMatFormer: A Multi-attention and Transformer Integrated ConvNext based Deep Learning Model for Enhanced Diabetic Foot Ulcer Classification

URL: http://arxiv.org/abs/2510.22743v1
Date: Sun, 26 Oct 2025 16:34:43 GMT
Title: ConMatFormer: A Multi-attention and Transformer Integrated ConvNext based Deep Learning Model for Enhanced Diabetic Foot Ulcer Classification
Authors: Raihan Ahamed Rifat, Fuyad Hasan Bhoyan, Md Humaion Kabir Mehedi, Md Kaviul Hossain, Md. Jakir Hossen, M. F. Mridha,
Abstract summary: We propose ConMatFormer, a new hybrid deep learning architecture that combines ConvNeXt blocks, multiple attention mechanisms and transformer modules.<n>Tests showed that ConMatFormer outperformed state-of-the-art (SOTA) convolutional neural network (CNN) and Vision Transformer (ViT) models in terms of accuracy, reliability, and flexibility.<n>Our findings set a new benchmark for DFU classification and provide a hybrid attention transformer framework for medical image analysis.
Score: 0.21990652930491858
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Diabetic foot ulcer (DFU) detection is a clinically significant yet challenging task due to the scarcity and variability of publicly available datasets. To solve these problems, we propose ConMatFormer, a new hybrid deep learning architecture that combines ConvNeXt blocks, multiple attention mechanisms convolutional block attention module (CBAM) and dual attention network (DANet), and transformer modules in a way that works together. This design facilitates the extraction of better local features and understanding of the global context, which allows us to model small skin patterns across different types of DFU very accurately. To address the class imbalance, we used data augmentation methods. A ConvNeXt block was used to obtain detailed local features in the initial stages. Subsequently, we compiled the model by adding a transformer module to enhance long-range dependency. This enabled us to pinpoint the DFU classes that were underrepresented or constituted minorities. Tests on the DS1 (DFUC2021) and DS2 (diabetic foot ulcer (DFU)) datasets showed that ConMatFormer outperformed state-of-the-art (SOTA) convolutional neural network (CNN) and Vision Transformer (ViT) models in terms of accuracy, reliability, and flexibility. The proposed method achieved an accuracy of 0.8961 and a precision of 0.9160 in a single experiment, which is a significant improvement over the current standards for classifying DFUs. In addition, by 4-fold cross-validation, the proposed model achieved an accuracy of 0.9755 with a standard deviation of only 0.0031. We further applied explainable artificial intelligence (XAI) methods, such as Grad-CAM, Grad-CAM++, and LIME, to consistently monitor the transparency and trustworthiness of the decision-making process.. Our findings set a new benchmark for DFU classification and provide a hybrid attention transformer framework for medical image analysis.

Related papers

AVP-Fusion: Adaptive Multi-Modal Fusion and Contrastive Learning for Two-Stage Antiviral Peptide Identification [2.242125769416219]
AVP-Fusion is a novel two-stage deep learning framework integrating adaptive feature fusion and contrastive learning.<n>It achieves an accuracy of 0.9531 and an MCC of 0.9064, significantly outperforming state-of-the-art methods.<n>In the second stage, leveraging transfer learning, the model enables precise subclass prediction for six viral families and eight specific viruses.
arXiv Detail & Related papers (2025-12-25T07:29:39Z)
Detection of retinal diseases using an accelerated reused convolutional network [3.5808917363708743]
We create a new general model that incorporates our novel convolutional layer named ArConv layers.<n>The model has suitable complexity for use in mobile phones and can perform the task of diagnosing the presence of disease with high accuracy.
arXiv Detail & Related papers (2025-10-05T14:44:09Z)
MitoDetect++: A Domain-Robust Pipeline for Mitosis Detection and Atypical Subtyping [6.267650544824679]
MitoDetect++ is a unified deep learning pipeline designed for the MIDOG 2025 challenge.<n>For detection, we employ a U-Net-based encoder-decoder architecture with EfficientNetV2-L as the backbone.<n>For classification, we leverage the Virchow2 vision transformer, fine-tuned efficiently using Low-Rank Adaptation (LoRA) to minimize resource consumption.
arXiv Detail & Related papers (2025-08-28T18:19:51Z)
Model Inversion Attacks Through Target-Specific Conditional Diffusion Models [54.69008212790426]
Model inversion attacks (MIAs) aim to reconstruct private images from a target classifier's training set, thereby raising privacy concerns in AI applications. Previous GAN-based MIAs tend to suffer from inferior generative fidelity due to GAN's inherent flaws and biased optimization within latent space. We propose Diffusion-based Model Inversion (Diff-MI) attacks to alleviate these issues.
arXiv Detail & Related papers (2024-07-16T06:38:49Z)
Multi-scale Quaternion CNN and BiGRU with Cross Self-attention Feature Fusion for Fault Diagnosis of Bearing [5.3598912592106345]
Deep learning has led to significant advances in bearing fault diagnosis (FD) We propose a novel FD model by integrating multiscale quaternion convolutional neural network (MQCNN), bidirectional gated recurrent unit (BiG), and cross self-attention feature fusion (CSAFF)
arXiv Detail & Related papers (2024-05-25T07:55:02Z)
Affine-Consistent Transformer for Multi-Class Cell Nuclei Detection [76.11864242047074]
We propose a novel Affine-Consistent Transformer (AC-Former), which directly yields a sequence of nucleus positions. We introduce an Adaptive Affine Transformer (AAT) module, which can automatically learn the key spatial transformations to warp original images for local network training. Experimental results demonstrate that the proposed method significantly outperforms existing state-of-the-art algorithms on various benchmarks.
arXiv Detail & Related papers (2023-10-22T02:27:02Z)
The effect of data augmentation and 3D-CNN depth on Alzheimer's Disease detection [51.697248252191265]
This work summarizes and strictly observes best practices regarding data handling, experimental design, and model evaluation. We focus on Alzheimer's Disease (AD) detection, which serves as a paradigmatic example of challenging problem in healthcare. Within this framework, we train predictive 15 models, considering three different data augmentation strategies and five distinct 3D CNN architectures.
arXiv Detail & Related papers (2023-09-13T10:40:41Z)
Enhancing Multiple Reliability Measures via Nuisance-extended Information Bottleneck [77.37409441129995]
In practical scenarios where training data is limited, many predictive signals in the data can be rather from some biases in data acquisition. We consider an adversarial threat model under a mutual information constraint to cover a wider class of perturbations in training. We propose an autoencoder-based training to implement the objective, as well as practical encoder designs to facilitate the proposed hybrid discriminative-generative training.
arXiv Detail & Related papers (2023-03-24T16:03:21Z)
Many-to-One Distribution Learning and K-Nearest Neighbor Smoothing for Thoracic Disease Identification [83.6017225363714]
deep learning has become the most powerful computer-aided diagnosis technology for improving disease identification performance. For chest X-ray imaging, annotating large-scale data requires professional domain knowledge and is time-consuming. In this paper, we propose many-to-one distribution learning (MODL) and K-nearest neighbor smoothing (KNNS) methods to improve a single model's disease identification performance.
arXiv Detail & Related papers (2021-02-26T02:29:30Z)
Learning Interpretable Microscopic Features of Tumor by Multi-task Adversarial CNNs To Improve Generalization [1.7371375427784381]
Existing CNN models act as black boxes, not ensuring to the physicians that important diagnostic features are used by the model. Here we show that our architecture, by learning end-to-end an uncertainty-based weighting combination of multi-task and adversarial losses, is encouraged to focus on pathology features. Our results on breast lymph node tissue show significantly improved generalization in the detection of tumorous tissue, with best average AUC 0.89 (0.01) against the baseline AUC 0.86 (0.005)
arXiv Detail & Related papers (2020-08-04T12:10:35Z)
A Systematic Approach to Featurization for Cancer Drug Sensitivity Predictions with Deep Learning [49.86828302591469]
We train >35,000 neural network models, sweeping over common featurization techniques. We found the RNA-seq to be highly redundant and informative even with subsets larger than 128 features.
arXiv Detail & Related papers (2020-04-30T20:42:17Z)
Rectified Meta-Learning from Noisy Labels for Robust Image-based Plant Disease Diagnosis [64.82680813427054]
Plant diseases serve as one of main threats to food security and crop production. One popular approach is to transform this problem as a leaf image classification task, which can be addressed by the powerful convolutional neural networks (CNNs) We propose a novel framework that incorporates rectified meta-learning module into common CNN paradigm to train a noise-robust deep network without using extra supervision information.
arXiv Detail & Related papers (2020-03-17T09:51:30Z)

This list is automatically generated from the titles and abstracts of the papers in this site.

This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.