Retinal Fundus Multi-Disease Image Classification using Hybrid CNN-Transformer-Ensemble Architectures
- URL: http://arxiv.org/abs/2503.21465v1
- Date: Thu, 27 Mar 2025 12:55:07 GMT
- Title: Retinal Fundus Multi-Disease Image Classification using Hybrid CNN-Transformer-Ensemble Architectures
- Authors: Deependra Singh, Saksham Agarwal, Subhankar Mishra,
- Abstract summary: Our research is motivated by the urgent global issue of a large population affected by retinal diseases.<n>Our primary objective is to develop a comprehensive diagnostic system capable of accurately predicting retinal diseases.
- Score: 0.3277163122167434
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Our research is motivated by the urgent global issue of a large population affected by retinal diseases, which are evenly distributed but underserved by specialized medical expertise, particularly in non-urban areas. Our primary objective is to bridge this healthcare gap by developing a comprehensive diagnostic system capable of accurately predicting retinal diseases solely from fundus images. However, we faced significant challenges due to limited, diverse datasets and imbalanced class distributions. To overcome these issues, we have devised innovative strategies. Our research introduces novel approaches, utilizing hybrid models combining deeper Convolutional Neural Networks (CNNs), Transformer encoders, and ensemble architectures sequentially and in parallel to classify retinal fundus images into 20 disease labels. Our overarching goal is to assess these advanced models' potential in practical applications, with a strong focus on enhancing retinal disease diagnosis accuracy across a broader spectrum of conditions. Importantly, our efforts have surpassed baseline model results, with the C-Tran ensemble model emerging as the leader, achieving a remarkable model score of 0.9166, surpassing the baseline score of 0.9. Additionally, experiments with the IEViT model showcased equally promising outcomes with improved computational efficiency. We've also demonstrated the effectiveness of dynamic patch extraction and the integration of domain knowledge in computer vision tasks. In summary, our research strives to contribute significantly to retinal disease diagnosis, addressing the critical need for accessible healthcare solutions in underserved regions while aiming for comprehensive and accurate disease prediction.
Related papers
- FundusGAN: A Hierarchical Feature-Aware Generative Framework for High-Fidelity Fundus Image Generation [35.46876389599076]
FundusGAN is a novel hierarchical feature-aware generative framework specifically designed for high-fidelity fundus image synthesis.<n>We show that FundusGAN consistently outperforms state-of-the-art methods across multiple metrics.
arXiv Detail & Related papers (2025-03-22T18:08:07Z) - Continually Evolved Multimodal Foundation Models for Cancer Prognosis [50.43145292874533]
Cancer prognosis is a critical task that involves predicting patient outcomes and survival rates.<n>Previous studies have integrated diverse data modalities, such as clinical notes, medical images, and genomic data, leveraging their complementary information.<n>Existing approaches face two major limitations. First, they struggle to incorporate newly arrived data with varying distributions into training, such as patient records from different hospitals.<n>Second, most multimodal integration methods rely on simplistic concatenation or task-specific pipelines, which fail to capture the complex interdependencies across modalities.
arXiv Detail & Related papers (2025-01-30T06:49:57Z) - Hybrid Interpretable Deep Learning Framework for Skin Cancer Diagnosis: Integrating Radial Basis Function Networks with Explainable AI [1.1049608786515839]
Skin cancer is one of the most prevalent and potentially life-threatening diseases worldwide.
We propose a novel hybrid deep learning framework that integrates convolutional neural networks (CNNs) with Radial Basis Function (RBF) Networks to achieve high classification accuracy and enhanced interpretability.
arXiv Detail & Related papers (2025-01-24T19:19:02Z) - EyeDiff: text-to-image diffusion model improves rare eye disease diagnosis [7.884451100342276]
EyeDiff is a text-to-image model designed to generate multimodal ophthalmic images from natural language prompts.
EyeDiff is trained on eight large-scale datasets and is adapted to ten multi-country external datasets.
arXiv Detail & Related papers (2024-11-15T07:30:53Z) - Enhance Eye Disease Detection using Learnable Probabilistic Discrete Latents in Machine Learning Architectures [1.6000489723889526]
Ocular diseases, including diabetic retinopathy and glaucoma, present a significant public health challenge.<n>Deep learning models have emerged as powerful tools for analysing medical images, such as retina imaging.<n>Challenges persist in model relibability and uncertainty estimation, which are critical for clinical decision-making.
arXiv Detail & Related papers (2024-01-21T04:14:54Z) - Polar-Net: A Clinical-Friendly Model for Alzheimer's Disease Detection
in OCTA Images [53.235117594102675]
Optical Coherence Tomography Angiography is a promising tool for detecting Alzheimer's disease (AD) by imaging the retinal microvasculature.
We propose a novel deep-learning framework called Polar-Net to provide interpretable results and leverage clinical prior knowledge.
We show that Polar-Net outperforms existing state-of-the-art methods and provides more valuable pathological evidence for the association between retinal vascular changes and AD.
arXiv Detail & Related papers (2023-11-10T11:49:49Z) - LVM-Med: Learning Large-Scale Self-Supervised Vision Models for Medical
Imaging via Second-order Graph Matching [59.01894976615714]
We introduce LVM-Med, the first family of deep networks trained on large-scale medical datasets.
We have collected approximately 1.3 million medical images from 55 publicly available datasets.
LVM-Med empirically outperforms a number of state-of-the-art supervised, self-supervised, and foundation models.
arXiv Detail & Related papers (2023-06-20T22:21:34Z) - DRAC: Diabetic Retinopathy Analysis Challenge with Ultra-Wide Optical
Coherence Tomography Angiography Images [51.27125547308154]
We organized a challenge named "DRAC - Diabetic Retinopathy Analysis Challenge" in conjunction with the 25th International Conference on Medical Image Computing and Computer Assisted Intervention (MICCAI 2022)
The challenge consists of three tasks: segmentation of DR lesions, image quality assessment and DR grading.
This paper presents a summary and analysis of the top-performing solutions and results for each task of the challenge.
arXiv Detail & Related papers (2023-04-05T12:04:55Z) - Many-to-One Distribution Learning and K-Nearest Neighbor Smoothing for
Thoracic Disease Identification [83.6017225363714]
deep learning has become the most powerful computer-aided diagnosis technology for improving disease identification performance.
For chest X-ray imaging, annotating large-scale data requires professional domain knowledge and is time-consuming.
In this paper, we propose many-to-one distribution learning (MODL) and K-nearest neighbor smoothing (KNNS) methods to improve a single model's disease identification performance.
arXiv Detail & Related papers (2021-02-26T02:29:30Z) - Predicting Clinical Diagnosis from Patients Electronic Health Records
Using BERT-based Neural Networks [62.9447303059342]
We show the importance of this problem in medical community.
We present a modification of Bidirectional Representations from Transformers (BERT) model for classification sequence.
We use a large-scale Russian EHR dataset consisting of about 4 million unique patient visits.
arXiv Detail & Related papers (2020-07-15T09:22:55Z) - Improving Robustness using Joint Attention Network For Detecting Retinal
Degeneration From Optical Coherence Tomography Images [0.0]
We propose the use of disease-specific feature representation as a novel architecture comprised of two joint networks.
Our experimental results on publicly available datasets show the proposed joint-network significantly improves the accuracy and robustness of state-of-the-art retinal disease classification networks on unseen datasets.
arXiv Detail & Related papers (2020-05-16T20:32:49Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.