Related papers: A Cascaded Dilated Convolution Approach for Mpox Lesion Classification

URL: http://arxiv.org/abs/2412.10106v4
Date: Tue, 14 Jan 2025 03:43:02 GMT
Title: A Cascaded Dilated Convolution Approach for Mpox Lesion Classification
Authors: Ayush Deshmukh
Abstract summary: Mpox virus presents significant diagnostic challenges due to its visual similarity to other skin lesion diseases. Deep learning-based approaches for skin lesion classification offer a promising alternative. This study introduces the Cascaded Atrous Group Attention framework to address these challenges.
License: http://creativecommons.org/licenses/by-sa/4.0/
Abstract: The global outbreak of the Mpox virus, classified as a Public Health Emergency of International Concern (PHEIC) by the World Health Organization, presents significant diagnostic challenges due to its visual similarity to other skin lesion diseases. Traditional diagnostic methods for Mpox, which rely on clinical symptoms and laboratory tests, are slow and labor intensive. Deep learning-based approaches for skin lesion classification offer a promising alternative. However, developing a model that balances efficiency with accuracy is crucial to ensure reliable and timely diagnosis without compromising performance. This study introduces the Cascaded Atrous Group Attention (CAGA) framework to address these challenges, combining the Cascaded Atrous Attention module and the Cascaded Group Attention mechanism. The Cascaded Atrous Attention module utilizes dilated convolutions and cascades the outputs to enhance multi-scale representation. This is integrated into the Cascaded Group Attention mechanism, which reduces redundancy in Multi-Head Self-Attention. By integrating the Cascaded Atrous Group Attention module with EfficientViT-L1 as the backbone architecture, this approach achieves state-of-the-art performance, reaching an accuracy of 98% on the Mpox Close Skin Image (MCSI) dataset while reducing model parameters by 37.5% compared to the original EfficientViT-L1. The model's robustness is demonstrated through extensive validation on two additional benchmark datasets, where it consistently outperforms existing approaches.
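
The abstract's description of dilated convolutions whose outputs are cascaded can be illustrated with a minimal sketch. This is not the paper's Cascaded Atrous Attention module: the 1D setting, the kernel, the dilation rates (1, 2, 4), and the function names `dilated_conv1d`, `cascaded_dilated`, and `receptive_field` are all assumptions chosen for illustration. It only shows how feeding each dilated branch into the next widens the receptive field while every branch keeps the same small kernel.

```python
def dilated_conv1d(signal, kernel, dilation):
    """Zero-padded, same-length 1D dilated convolution (cross-correlation form)."""
    center = len(kernel) // 2
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for j, w in enumerate(kernel):
            # Taps are spaced `dilation` samples apart around position i.
            idx = i + (j - center) * dilation
            if 0 <= idx < len(signal):
                acc += w * signal[idx]
        out.append(acc)
    return out


def cascaded_dilated(signal, kernel, dilations=(1, 2, 4)):
    """Hypothetical cascade: each branch consumes the previous branch's output.

    Returns the list of per-branch outputs, one per dilation rate, so that
    features at several scales are available together.
    """
    outputs = []
    x = signal
    for d in dilations:
        x = dilated_conv1d(x, kernel, d)
        outputs.append(x)
    return outputs


def receptive_field(kernel_size, dilations):
    """Receptive field of stacked stride-1 dilated convolutions."""
    rf = 1
    for d in dilations:
        rf += (kernel_size - 1) * d
    return rf


def support_width(values):
    """Width of the span covered by nonzero entries."""
    nonzero = [i for i, v in enumerate(values) if v != 0]
    return nonzero[-1] - nonzero[0] + 1 if nonzero else 0


# A unit impulse makes the growth of the receptive field visible.
impulse = [0.0] * 15
impulse[7] = 1.0
branches = cascaded_dilated(impulse, [1.0, 1.0, 1.0], dilations=(1, 2, 4))
widths = [support_width(b) for b in branches]
```

With a 3-tap kernel, the three cascaded branches respond over progressively wider spans of the input, which is the multi-scale effect the abstract refers to; an actual implementation would operate on 2D feature maps inside an attention block.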

Related papers

CSASN: A Multitask Attention-Based Framework for Heterogeneous Thyroid Carcinoma Classification in Ultrasound Images [4.577163442985675]
Heterogeneous morphological features and data imbalance pose significant challenges in rare thyroid carcinoma classification using ultrasound imaging. We propose a novel multitask learning framework, Channel-Spatial Attention Synergy Network (CSASN), which integrates a dual-branch feature extractor.
arXiv Detail & Related papers (2025-05-04T18:23:03Z)
Efficient Epistemic Uncertainty Estimation in Cerebrovascular Segmentation [1.3980986259786223]
We introduce an efficient ensemble model combining the advantages of Bayesian Approximation and Deep Ensembles. Areas of high model uncertainty and erroneous predictions are aligned, which demonstrates the effectiveness and reliability of the approach.
arXiv Detail & Related papers (2025-03-28T09:39:37Z)
MExD: An Expert-Infused Diffusion Model for Whole-Slide Image Classification [46.89908887119571]
Whole Slide Image (WSI) classification poses unique challenges due to the vast image size and numerous non-informative regions. We propose MExD, an Expert-Infused Diffusion Model that combines the strengths of a Mixture-of-Experts (MoE) mechanism with a diffusion model for enhanced classification.
arXiv Detail & Related papers (2025-03-16T08:04:17Z)
RURANET++: An Unsupervised Learning Method for Diabetic Macular Edema Based on SCSE Attention Mechanisms and Dynamic Multi-Projection Head Clustering [13.423253964156117]
RURANET++ is an unsupervised learning-based automated diagnostic system for Diabetic Macular Edema (DME). During feature processing, a pre-trained GoogLeNet model extracts deep features from retinal images, followed by PCA-based dimensionality reduction to 50 dimensions for computational efficiency. Experimental results demonstrate superior performance across multiple metrics, achieving maximum accuracy (0.8411), precision (0.8593), recall (0.8411), and F1-score, with exceptional clustering quality.
arXiv Detail & Related papers (2025-02-27T16:06:57Z)
ActiveSSF: An Active-Learning-Guided Self-Supervised Framework for Long-Tailed Megakaryocyte Classification [3.6535793744942318]
We propose the ActiveSSF framework, which integrates active learning with self-supervised pretraining. Experimental results on clinical megakaryocyte datasets demonstrate that ActiveSSF achieves state-of-the-art performance. To foster further research, the code and datasets will be publicly released in the future.
arXiv Detail & Related papers (2025-02-12T08:24:36Z)
Efficient and Comprehensive Feature Extraction in Large Vision-Language Model for Clinical Pathology Analysis [34.199766079609795]
Pathological diagnosis is vital for determining disease characteristics, guiding treatment, and assessing prognosis. Traditional pure vision models face challenges of redundant feature extraction. Existing large vision-language models (LVLMs) are limited by input resolution constraints, hindering their efficiency and accuracy. We propose two innovative strategies: the mixed task-guided feature enhancement, and the prompt-guided detail feature completion.
arXiv Detail & Related papers (2024-12-12T18:07:23Z)
Multimodal Outer Arithmetic Block Dual Fusion of Whole Slide Images and Omics Data for Precision Oncology [6.418265127069878]
We propose the use of omic embeddings during early and late fusion to capture complementary information from local (patch-level) to global (slide-level) interactions. This dual fusion strategy enhances interpretability and classification performance, highlighting its potential for clinical diagnostics.
arXiv Detail & Related papers (2024-11-26T13:25:53Z)
EfficientNet with Hybrid Attention Mechanisms for Enhanced Breast Histopathology Classification: A Comprehensive Approach [0.0]
This paper introduces a novel approach integrating Hybrid EfficientNet models with advanced attention mechanisms to enhance feature extraction and focus on critical image regions. We evaluate the performance of our models across multiple magnification scales using publicly available histopathology datasets. The results are validated using metrics such as accuracy, F1-score, precision, and recall, demonstrating the clinical potential of our model in improving diagnostic accuracy.
arXiv Detail & Related papers (2024-10-29T17:56:05Z)
Memory-efficient High-resolution OCT Volume Synthesis with Cascaded Amortized Latent Diffusion Models [48.87160158792048]
We introduce a cascaded amortized latent diffusion model (CA-LDM) that can synthesize high-resolution OCT volumes in a memory-efficient way. Experiments on a public high-resolution OCT dataset show that our synthetic data have realistic high-resolution and global features, surpassing the capabilities of existing methods.
arXiv Detail & Related papers (2024-05-26T10:58:22Z)
Super-resolution of biomedical volumes with 2D supervision [84.5255884646906]
Masked slice diffusion for super-resolution (SliceR) exploits the inherent equivalence in the data-generating distribution across all spatial dimensions of biological specimens. We focus on the application of SliceR to stimulated Raman histology (SRH), characterized by its rapid acquisition of high-resolution 2D images but slow and costly optical z-sectioning.
arXiv Detail & Related papers (2024-04-15T02:41:55Z)
Monkeypox disease recognition model based on improved SE-InceptionV3 [0.0]
This study introduces an improved SE-InceptionV3 model, embedding the SENet module and incorporating L2 regularization into the InceptionV3 framework to enhance monkeypox disease detection. Our model demonstrates a noteworthy accuracy of 96.71% on the test set, outperforming conventional methods and deep learning models.
arXiv Detail & Related papers (2024-03-15T08:01:44Z)
Spatial-aware Transformer-GRU Framework for Enhanced Glaucoma Diagnosis from 3D OCT Imaging [1.8416014644193066]
We present a novel deep learning framework that leverages the diagnostic value of 3D Optical Coherence Tomography (OCT) imaging for automated glaucoma detection. We integrate a pre-trained Vision Transformer on retinal data for rich slice-wise feature extraction and a bidirectional Gated Recurrent Unit for capturing inter-slice spatial dependencies. Experimental results on a large dataset demonstrate the superior performance of the proposed method over state-of-the-art ones.
arXiv Detail & Related papers (2024-03-08T22:25:15Z)
OCT-SelfNet: A Self-Supervised Framework with Multi-Modal Datasets for Generalized and Robust Retinal Disease Detection [2.3349787245442966]
Our research contributes a self-supervised robust machine learning framework, OCT-SelfNet, for detecting eye diseases. Our method addresses the issue using a two-phase training approach that combines self-supervised pretraining and supervised fine-tuning. In terms of the AUC-PR metric, our proposed method exceeded 42%, showcasing a substantial increase of at least 10% in performance compared to the baseline.
arXiv Detail & Related papers (2024-01-22T20:17:14Z)
Improving Multiple Sclerosis Lesion Segmentation Across Clinical Sites: A Federated Learning Approach with Noise-Resilient Training [75.40980802817349]
Deep learning models have shown promise for automatically segmenting MS lesions, but the scarcity of accurately annotated data hinders progress in this area. We introduce a Decoupled Hard Label Correction (DHLC) strategy that considers the imbalanced distribution and fuzzy boundaries of MS lesions. We also introduce a Centrally Enhanced Label Correction (CELC) strategy, which leverages the aggregated central model as a correction teacher for all sites.
arXiv Detail & Related papers (2023-08-31T00:36:10Z)
Class Attention to Regions of Lesion for Imbalanced Medical Image Recognition [59.28732531600606]
We propose a framework named Class Attention to REgions of the lesion (CARE) to handle data imbalance issues. The CARE framework needs bounding boxes to represent the lesion regions of rare diseases. Results show that the CARE variants with automated bounding box generation are comparable to the original CARE framework.
arXiv Detail & Related papers (2023-07-19T15:19:02Z)
Automatic diagnosis of knee osteoarthritis severity using Swin transformer [55.01037422579516]
Knee osteoarthritis (KOA) is a widespread condition that can cause chronic pain and stiffness in the knee joint. We propose an automated approach that employs the Swin Transformer to predict the severity of KOA.
arXiv Detail & Related papers (2023-07-10T09:49:30Z)
Brain Imaging-to-Graph Generation using Adversarial Hierarchical Diffusion Models for MCI Causality Analysis [44.45598796591008]
A brain imaging-to-graph generation (BIGG) framework is proposed to map functional magnetic resonance imaging (fMRI) into effective connectivity for mild cognitive impairment analysis. The hierarchical transformers in the generator are designed to estimate the noise at multiple scales. Evaluations on the ADNI dataset demonstrate the feasibility and efficacy of the proposed model.
arXiv Detail & Related papers (2023-05-18T06:54:56Z)
Boundary Guided Semantic Learning for Real-time COVID-19 Lung Infection Segmentation System [69.40329819373954]
The coronavirus disease 2019 (COVID-19) continues to have a negative impact on healthcare systems around the world. At the current stage, automatically segmenting the lung infection area from CT images is essential for the diagnosis and treatment of COVID-19. We propose a boundary guided semantic learning network (BSNet) in this paper.
arXiv Detail & Related papers (2022-09-07T05:01:38Z)
Many-to-One Distribution Learning and K-Nearest Neighbor Smoothing for Thoracic Disease Identification [83.6017225363714]
Deep learning has become the most powerful computer-aided diagnosis technology for improving disease identification performance. For chest X-ray imaging, annotating large-scale data requires professional domain knowledge and is time-consuming. In this paper, we propose many-to-one distribution learning (MODL) and K-nearest neighbor smoothing (KNNS) methods to improve a single model's disease identification performance.
arXiv Detail & Related papers (2021-02-26T02:29:30Z)
Statistical control for spatio-temporal MEG/EEG source imaging with desparsified multi-task Lasso [102.84915019938413]
Non-invasive techniques like magnetoencephalography (MEG) or electroencephalography (EEG) hold promise for brain source imaging. The problem of source localization, or source imaging, however, poses a high-dimensional statistical inference challenge. We propose an ensemble of desparsified multi-task Lasso (ecd-MTLasso) to deal with this problem.
arXiv Detail & Related papers (2020-09-29T21:17:16Z)

This list is automatically generated from the titles and abstracts of the papers in this site.