DRetNet: A Novel Deep Learning Framework for Diabetic Retinopathy Diagnosis
- URL: http://arxiv.org/abs/2509.01072v1
- Date: Mon, 01 Sep 2025 02:27:16 GMT
- Title: DRetNet: A Novel Deep Learning Framework for Diabetic Retinopathy Diagnosis
- Authors: Idowu Paul Okuwobi, Jingyuan Liu, Jifeng Wan, Jiaojiao Jiang,
- Abstract summary: Current DR detection systems struggle with poor-quality images, lack interpretability, and insufficient integration of domain-specific knowledge.<n>We introduce a novel framework that integrates three innovative contributions.<n>The framework achieves an accuracy of 92.7%, a precision of 92.5%, a recall of 92.6%, an F1-score of 92.5%, an AUC of 97.8%, a mAP of 0.96, and an MCC of 0.85.
- Score: 8.234135343778993
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Diabetic retinopathy (DR) is a leading cause of blindness worldwide, necessitating early detection to prevent vision loss. Current automated DR detection systems often struggle with poor-quality images, lack interpretability, and insufficient integration of domain-specific knowledge. To address these challenges, we introduce a novel framework that integrates three innovative contributions: (1) Adaptive Retinal Image Enhancement Using Physics-Informed Neural Networks (PINNs): this technique dynamically enhances retinal images by incorporating physical constraints, improving the visibility of critical features such as microaneurysms, hemorrhages, and exudates; (2) Hybrid Feature Fusion Network (HFFN): by combining deep learning embeddings with handcrafted features, HFFN leverages both learned representations and domain-specific knowledge to enhance generalization and accuracy; (3) Multi-Stage Classifier with Uncertainty Quantification: this method breaks down the classification process into logical stages, providing interpretable predictions and confidence scores, thereby improving clinical trust. The proposed framework achieves an accuracy of 92.7%, a precision of 92.5%, a recall of 92.6%, an F1-score of 92.5%, an AUC of 97.8%, a mAP of 0.96, and an MCC of 0.85. Ophthalmologists rated the framework's predictions as highly clinically relevant (4.8/5), highlighting its alignment with real-world diagnostic needs. Qualitative analyses, including Grad-CAM visualizations and uncertainty heatmaps, further enhance the interpretability and trustworthiness of the system. The framework demonstrates robust performance across diverse conditions, including low-quality images, noisy data, and unseen datasets. These features make the proposed framework a promising tool for clinical adoption, enabling more accurate and reliable DR detection in resource-limited settings.
Related papers
- Beyond CLIP: Knowledge-Enhanced Multimodal Transformers for Cross-Modal Alignment in Diabetic Retinopathy Diagnosis [7.945705180020063]
We propose a knowledge-enhanced joint embedding framework that integrates retinal fundus images, clinical text, and structured patient data.<n>Our framework achieves near-perfect text-to-image retrieval performance with Recall@1 of 99.94% compared to fine-tuned CLIP's 1.29%.
arXiv Detail & Related papers (2025-12-22T18:41:45Z) - NEURO-GUARD: Neuro-Symbolic Generalization and Unbiased Adaptive Routing for Diagnostics -- Explainable Medical AI [0.6345042809319409]
We present NEURO-GUARD, a knowledge-guided vision framework that integrates Vision Transformers (ViTs) with language-driven reasoning to improve performance.<n> NEURO-GUARD employs a retrieval-augmented generation (RAG) mechanism for self-verification, in which a large language model (LLM) iteratively generates, evaluates, and refines feature-extraction code for medical images.<n>Experiments on diabetic retinopathy classification across four benchmark datasets demonstrate that NEURO-GUARD improves accuracy by 6.2% over a ViT-only baseline and achieves a 5% gain in domain generalization.
arXiv Detail & Related papers (2025-12-20T02:32:15Z) - A Semantically Enhanced Generative Foundation Model Improves Pathological Image Synthesis [82.01597026329158]
We introduce a Correlation-Regulated Alignment Framework for Tissue Synthesis (CRAFTS) for pathology-specific text-to-image synthesis.<n>CRAFTS incorporates a novel alignment mechanism that suppresses semantic drift to ensure biological accuracy.<n>This model generates diverse pathological images spanning 30 cancer types, with quality rigorously validated by objective metrics and pathologist evaluations.
arXiv Detail & Related papers (2025-12-15T10:22:43Z) - An Explainable Hybrid AI Framework for Enhanced Tuberculosis and Symptom Detection [55.35661671061754]
Tuberculosis remains a critical global health issue, particularly in resource-limited and remote areas.<n>We propose a framework which enhances disease and symptom detection on chest X-rays by integrating two supervised heads and a self-supervised head.<n>Our model achieves an accuracy of 98.85% for distinguishing between COVID-19, tuberculosis, and normal cases, and a macro-F1 score of 90.09% for multilabel symptom detection.
arXiv Detail & Related papers (2025-10-21T17:18:55Z) - Ocular-Induced Abnormal Head Posture: Diagnosis and Missing Data Imputation [1.7061463565692456]
Ocular-induced abnormal head posture (AHP) is a compensatory mechanism that arises from ocular misalignment.<n>This study addresses both challenges through two complementary deep learning frameworks.<n>AHP-CADNet is a multi-level attention fusion framework for automated diagnosis.<n> curriculum learning-based imputation framework is designed to mitigate missing data.
arXiv Detail & Related papers (2025-10-07T07:51:59Z) - A Fully Automatic Framework for Intracranial Pressure Grading: Integrating Keyframe Identification, ONSD Measurement and Clinical Data [3.6652537579778106]
Intracranial pressure (ICP) elevation poses severe threats to cerebral function, thus necessitating monitoring for timely intervention.<n>We introduce a fully automatic two-stage framework for ICP grading, integrating ONSD measurement and clinical data.<n>Our method achieves a validation accuracy of $0.845 pm 0.071$ and an independent test accuracy of 0.786, significantly outperforming conventional threshold-based method.
arXiv Detail & Related papers (2025-09-11T11:37:48Z) - A Disease-Centric Vision-Language Foundation Model for Precision Oncology in Kidney Cancer [54.58205672910646]
RenalCLIP is a visual-language foundation model for characterization, diagnosis and prognosis of renal mass.<n>It achieved better performance and superior generalizability across 10 core tasks spanning the full clinical workflow of kidney cancer.
arXiv Detail & Related papers (2025-08-22T17:48:19Z) - A Novel Attention-Augmented Wavelet YOLO System for Real-time Brain Vessel Segmentation on Transcranial Color-coded Doppler [49.03919553747297]
We propose an AI-powered, real-time CoW auto-segmentation system capable of efficiently capturing cerebral arteries.<n>No prior studies have explored AI-driven cerebrovascular segmentation using Transcranial Color-coded Doppler (TCCD)<n>The proposed AAW-YOLO demonstrated strong performance in segmenting both ipsilateral and contralateral CoW vessels.
arXiv Detail & Related papers (2025-08-19T14:41:22Z) - Design and Validation of a Responsible Artificial Intelligence-based System for the Referral of Diabetic Retinopathy Patients [65.57160385098935]
Early detection of Diabetic Retinopathy can reduce the risk of vision loss by up to 95%.<n>We developed RAIS-DR, a Responsible AI System for DR screening that incorporates ethical principles across the AI lifecycle.<n>We evaluated RAIS-DR against the FDA-approved EyeArt system on a local dataset of 1,046 patients, unseen by both systems.
arXiv Detail & Related papers (2025-08-17T21:54:11Z) - Addressing High Class Imbalance in Multi-Class Diabetic Retinopathy Severity Grading with Augmentation and Transfer Learning [1.5939351525664014]
This paper presents a robust deep learning framework for both binary and five-class Diabetic retinopathy (DR) classification.<n>For binary classification, our proposed model achieves a state-of-the-art accuracy of 98.9%, with a precision of 98.6%, recall of 99.3%, F1-score of 98.9%, and an AUC of 99.4%.<n>In the more challenging five-class severity classification task, our model obtains a competitive accuracy of 84.6% and an AUC of 94.1%, outperforming several existing approaches.
arXiv Detail & Related papers (2025-07-23T01:52:27Z) - Lightweight Relational Embedding in Task-Interpolated Few-Shot Networks for Enhanced Gastrointestinal Disease Classification [0.0]
Colon cancer detection is crucial for increasing patient survival rates.<n> colonoscopy is dependent on obtaining adequate and high-quality endoscopic images.<n>Few-Shot Learning architecture enables our model to rapidly adapt to unseen fine-grained endoscopic image patterns.<n>Our model demonstrated superior performance, achieving an accuracy of 90.1%, precision of 0.845, recall of 0.942, and an F1 score of 0.891.
arXiv Detail & Related papers (2025-05-30T16:54:51Z) - Metrics that matter: Evaluating image quality metrics for medical image generation [48.85783422900129]
This study comprehensively assesses commonly used no-reference image quality metrics using brain MRI data.<n>We evaluate metric sensitivity to a range of challenges, including noise, distribution shifts, and, critically, morphological alterations designed to mimic clinically relevant inaccuracies.
arXiv Detail & Related papers (2025-05-12T01:57:25Z) - Enhancing Diabetic Retinopathy Diagnosis: A Lightweight CNN Architecture for Efficient Exudate Detection in Retinal Fundus Images [0.0]
This paper introduces a novel, lightweight convolutional neural network architecture tailored for automated exudate detection.
We have incorporated domain-specific data augmentations to enhance the model's generalizability.
Our model achieves an impressive F1 score of 90%, demonstrating its efficacy in the early detection of diabetic retinopathy through fundus imaging.
arXiv Detail & Related papers (2024-08-13T10:13:33Z) - Spatial-aware Transformer-GRU Framework for Enhanced Glaucoma Diagnosis
from 3D OCT Imaging [1.8416014644193066]
We present a novel deep learning framework that leverages the diagnostic value of 3D Optical Coherence Tomography ( OCT) imaging for automated glaucoma detection.
We integrate a pre-trained Vision Transformer on retinal data for rich slice-wise feature extraction and a bidirectional Gated Recurrent Unit for capturing inter-slice spatial dependencies.
Experimental results on a large dataset demonstrate the superior performance of the proposed method over state-of-the-art ones.
arXiv Detail & Related papers (2024-03-08T22:25:15Z) - Confidence-Driven Deep Learning Framework for Early Detection of Knee Osteoarthritis [8.193689534916988]
Knee Osteoarthritis (KOA) is a prevalent musculoskeletal disorder that severely impacts mobility and quality of life.<n>We propose a confidence-driven deep learning framework for early KOA detection, focusing on distinguishing KL-0 and KL-2 stages.<n> Experimental results demonstrate that the proposed framework achieves competitive accuracy, sensitivity, and specificity, comparable to those of expert radiologists.
arXiv Detail & Related papers (2023-03-23T11:57:50Z) - Towards Reliable Medical Image Segmentation by utilizing Evidential Calibrated Uncertainty [52.03490691733464]
We introduce DEviS, an easily implementable foundational model that seamlessly integrates into various medical image segmentation networks.
By leveraging subjective logic theory, we explicitly model probability and uncertainty for the problem of medical image segmentation.
DeviS incorporates an uncertainty-aware filtering module, which utilizes the metric of uncertainty-calibrated error to filter reliable data.
arXiv Detail & Related papers (2023-01-01T05:02:46Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.