Enhancing DR Classification with Swin Transformer and Shifted Window Attention
- URL: http://arxiv.org/abs/2504.15317v1
- Date: Sun, 20 Apr 2025 13:23:20 GMT
- Title: Enhancing DR Classification with Swin Transformer and Shifted Window Attention
- Authors: Meher Boulaabi, Takwa Ben Aïcha Gader, Afef Kacem Echi, Zied Bouraoui,
- Abstract summary: Diabetic retinopathy (DR) is a leading cause of blindness worldwide, underscoring the importance of early detection for effective treatment.<n>We propose a robust preprocessing pipeline incorporating image cropping, Contrast-Limited Adaptive Histogram Equalization (CLAHE), and targeted data augmentation to improve model generalization and resilience.<n>We validate our method on the Aptos and IDRiD datasets for multi-class DR classification, achieving accuracy rates of 89.65% and 97.40%, respectively.
- Score: 9.99302279736049
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Diabetic retinopathy (DR) is a leading cause of blindness worldwide, underscoring the importance of early detection for effective treatment. However, automated DR classification remains challenging due to variations in image quality, class imbalance, and pixel-level similarities that hinder model training. To address these issues, we propose a robust preprocessing pipeline incorporating image cropping, Contrast-Limited Adaptive Histogram Equalization (CLAHE), and targeted data augmentation to improve model generalization and resilience. Our approach leverages the Swin Transformer, which utilizes hierarchical token processing and shifted window attention to efficiently capture fine-grained features while maintaining linear computational complexity. We validate our method on the Aptos and IDRiD datasets for multi-class DR classification, achieving accuracy rates of 89.65% and 97.40%, respectively. These results demonstrate the effectiveness of our model, particularly in detecting early-stage DR, highlighting its potential for improving automated retinal screening in clinical settings.
Related papers
- VR-FuseNet: A Fusion of Heterogeneous Fundus Data and Explainable Deep Network for Diabetic Retinopathy Classification [0.0]
This paper presents a comprehensive approach for automated diabetic retinopathy detection by proposing a new hybrid deep learning model called VR-FuseNet.
The proposed VR-FuseNet model combines the strengths of two state-of-the-art convolutional neural networks, VGG19 which captures fine-grained spatial features and ResNet50V2 which is known for its deep hierarchical feature extraction.
The model outperforms individual architectures on all performance metrics demonstrating the effectiveness of hybrid feature extraction in Diabetic Retinopathy classification tasks.
arXiv Detail & Related papers (2025-04-30T09:38:47Z) - Crane: Context-Guided Prompt Learning and Attention Refinement for Zero-Shot Anomaly Detections [50.343419243749054]
Anomaly Detection (AD) involves identifying deviations from normal data distributions.<n>We propose a novel approach that conditions the prompts of the text encoder based on image context extracted from the vision encoder.<n>Our method achieves state-of-the-art performance, improving performance by 2% to 29% across different metrics on 14 datasets.
arXiv Detail & Related papers (2025-04-15T10:42:25Z) - Comprehensive Evaluation of OCT-based Automated Segmentation of Retinal Layer, Fluid and Hyper-Reflective Foci: Impact on Diabetic Retinopathy Severity Assessment [0.0]
Diabetic retinopathy (DR) is a leading cause of vision loss, requiring early and accurate assessment to prevent irreversible damage.<n>This study proposes an active-learning-based deep learning pipeline for automated segmentation of retinal layers.
arXiv Detail & Related papers (2025-03-03T07:23:56Z) - DiffuPT: Class Imbalance Mitigation for Glaucoma Detection via Diffusion Based Generation and Model Pretraining [1.8218878957822688]
glaucoma is a progressive optic neuropathy characterized by structural damage to the optic nerve head and functional changes in the visual field.<n>We use a generative-based framework to enhance glaucoma diagnosis, specifically addressing class imbalance through synthetic data generation.
arXiv Detail & Related papers (2024-12-04T17:39:44Z) - Cross Feature Fusion of Fundus Image and Generated Lesion Map for Referable Diabetic Retinopathy Classification [1.091626241764448]
Diabetic Retinopathy (DR) is a primary cause of blindness, necessitating early detection and diagnosis.
We develop an advanced cross-learning DR classification method leveraging transfer learning and cross-attention mechanisms.
Our experiments, utilizing two public datasets, demonstrate a superior accuracy of 94.6%, surpassing current state-of-the-art methods by 4.4%.
arXiv Detail & Related papers (2024-11-06T02:23:38Z) - Controllable retinal image synthesis using conditional StyleGAN and latent space manipulation for improved diagnosis and grading of diabetic retinopathy [0.0]
This paper proposes a framework for controllably generating high-fidelity and diverse DR fundus images.
We achieve comprehensive control over DR severity and visual features within generated images.
We manipulate the DR images generated conditionally on grades, further enhancing the dataset diversity.
arXiv Detail & Related papers (2024-09-11T17:08:28Z) - Leveraging Semi-Supervised Graph Learning for Enhanced Diabetic
Retinopathy Detection [0.0]
Diabetic Retinopathy (DR) is a significant cause of blindness globally, highlighting the urgent need for early detection and effective treatment.
Recent advancements in Machine Learning (ML) techniques have shown promise in DR detection, but the availability of labeled data often limits their performance.
This research proposes a novel Semi-Supervised Graph Learning SSGL algorithm tailored for DR detection.
arXiv Detail & Related papers (2023-09-02T04:42:08Z) - Automatic diagnosis of knee osteoarthritis severity using Swin
transformer [55.01037422579516]
Knee osteoarthritis (KOA) is a widespread condition that can cause chronic pain and stiffness in the knee joint.
We propose an automated approach that employs the Swin Transformer to predict the severity of KOA.
arXiv Detail & Related papers (2023-07-10T09:49:30Z) - Performance of GAN-based augmentation for deep learning COVID-19 image
classification [57.1795052451257]
The biggest challenge in the application of deep learning to the medical domain is the availability of training data.
Data augmentation is a typical methodology used in machine learning when confronted with a limited data set.
In this work, a StyleGAN2-ADA model of Generative Adversarial Networks is trained on the limited COVID-19 chest X-ray image set.
arXiv Detail & Related papers (2023-04-18T15:39:58Z) - Cross-Site Severity Assessment of COVID-19 from CT Images via Domain
Adaptation [64.59521853145368]
Early and accurate severity assessment of Coronavirus disease 2019 (COVID-19) based on computed tomography (CT) images offers a great help to the estimation of intensive care unit event.
To augment the labeled data and improve the generalization ability of the classification model, it is necessary to aggregate data from multiple sites.
This task faces several challenges including class imbalance between mild and severe infections, domain distribution discrepancy between sites, and presence of heterogeneous features.
arXiv Detail & Related papers (2021-09-08T07:56:51Z) - Retinopathy of Prematurity Stage Diagnosis Using Object Segmentation and
Convolutional Neural Networks [68.96150598294072]
Retinopathy of Prematurity (ROP) is an eye disorder primarily affecting premature infants with lower weights.
It causes proliferation of vessels in the retina and could result in vision loss and, eventually, retinal detachment, leading to blindness.
In recent years, there has been a significant effort to automate the diagnosis using deep learning.
This paper builds upon the success of previous models and develops a novel architecture, which combines object segmentation and convolutional neural networks (CNN)
Our proposed system first trains an object segmentation model to identify the demarcation line at a pixel level and adds the resulting mask as an additional "color" channel in
arXiv Detail & Related papers (2020-04-03T14:07:41Z) - Rectified Meta-Learning from Noisy Labels for Robust Image-based Plant
Disease Diagnosis [64.82680813427054]
Plant diseases serve as one of main threats to food security and crop production.
One popular approach is to transform this problem as a leaf image classification task, which can be addressed by the powerful convolutional neural networks (CNNs)
We propose a novel framework that incorporates rectified meta-learning module into common CNN paradigm to train a noise-robust deep network without using extra supervision information.
arXiv Detail & Related papers (2020-03-17T09:51:30Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.