A Hybrid Framework Bridging CNN and ViT based on Theory of Evidence for Diabetic Retinopathy Grading
- URL: http://arxiv.org/abs/2510.26315v1
- Date: Thu, 30 Oct 2025 10:08:06 GMT
- Title: A Hybrid Framework Bridging CNN and ViT based on Theory of Evidence for Diabetic Retinopathy Grading
- Authors: Junlai Qiu, Yunzhu Chen, Hao Zheng, Yawen Huang, Yuexiang Li,
- Abstract summary: Diabetic retinopathy (DR) is a leading cause of vision loss among middle-aged and elderly people.<n>A variety of automated DR diagnosis systems have been recently established based on convolutional neural network (CNN) or vision Transformer (ViT)
- Score: 31.032002367721947
- License: http://creativecommons.org/publicdomain/zero/1.0/
- Abstract: Diabetic retinopathy (DR) is a leading cause of vision loss among middle-aged and elderly people, which significantly impacts their daily lives and mental health. To improve the efficiency of clinical screening and enable the early detection of DR, a variety of automated DR diagnosis systems have been recently established based on convolutional neural network (CNN) or vision Transformer (ViT). However, due to the own shortages of CNN / ViT, the performance of existing methods using single-type backbone has reached a bottleneck. One potential way for the further improvements is integrating different kinds of backbones, which can fully leverage the respective strengths of them (\emph{i.e.,} the local feature extraction capability of CNN and the global feature capturing ability of ViT). To this end, we propose a novel paradigm to effectively fuse the features extracted by different backbones based on the theory of evidence. Specifically, the proposed evidential fusion paradigm transforms the features from different backbones into supporting evidences via a set of deep evidential networks. With the supporting evidences, the aggregated opinion can be accordingly formed, which can be used to adaptively tune the fusion pattern between different backbones and accordingly boost the performance of our hybrid model. We evaluated our method on two publicly available DR grading datasets. The experimental results demonstrate that our hybrid model not only improves the accuracy of DR grading, compared to the state-of-the-art frameworks, but also provides the excellent interpretability for feature fusion and decision-making.
Related papers
- MedAD-R1: Eliciting Consistent Reasoning in Interpretible Medical Anomaly Detection via Consistency-Reinforced Policy Optimization [46.65200216642429]
We introduce MedAD-38K, the first large-scale, multi-modal, and multi-center benchmark for MedAD featuring diagnostic Chain-of-Thought (CoT) annotations alongside structured Visual Question-Answering (VQA) pairs.<n>Our proposed model, MedAD-R1, achieves state-of-the-art (SOTA) performance on the MedAD-38K benchmark, outperforming strong baselines by more than 10%.
arXiv Detail & Related papers (2026-02-01T07:56:10Z) - A Semantically Enhanced Generative Foundation Model Improves Pathological Image Synthesis [82.01597026329158]
We introduce a Correlation-Regulated Alignment Framework for Tissue Synthesis (CRAFTS) for pathology-specific text-to-image synthesis.<n>CRAFTS incorporates a novel alignment mechanism that suppresses semantic drift to ensure biological accuracy.<n>This model generates diverse pathological images spanning 30 cancer types, with quality rigorously validated by objective metrics and pathologist evaluations.
arXiv Detail & Related papers (2025-12-15T10:22:43Z) - AdvBlur: Adversarial Blur for Robust Diabetic Retinopathy Classification and Cross-Domain Generalization [0.0]
Diabetic retinopathy (DR) is a leading cause of vision loss worldwide.<n>Deep learning (DL) models have been developed to predict DR from fundus images.<n>Many face challenges in maintaining robustness due to distributional variations caused by differences in acquisition devices, demographic disparities, and imaging conditions.<n>This paper proposes a novel DR classification approach, a method called AdvBlur.
arXiv Detail & Related papers (2025-10-28T02:10:54Z) - Design and Validation of a Responsible Artificial Intelligence-based System for the Referral of Diabetic Retinopathy Patients [65.57160385098935]
Early detection of Diabetic Retinopathy can reduce the risk of vision loss by up to 95%.<n>We developed RAIS-DR, a Responsible AI System for DR screening that incorporates ethical principles across the AI lifecycle.<n>We evaluated RAIS-DR against the FDA-approved EyeArt system on a local dataset of 1,046 patients, unseen by both systems.
arXiv Detail & Related papers (2025-08-17T21:54:11Z) - VR-FuseNet: A Fusion of Heterogeneous Fundus Data and Explainable Deep Network for Diabetic Retinopathy Classification [0.0]
This paper presents a comprehensive approach for automated diabetic retinopathy detection by proposing a new hybrid deep learning model called VR-FuseNet.<n>The proposed VR-FuseNet model combines the strengths of two state-of-the-art convolutional neural networks, VGG19 which captures fine-grained spatial features and ResNet50V2 which is known for its deep hierarchical feature extraction.<n>The model outperforms individual architectures on all performance metrics demonstrating the effectiveness of hybrid feature extraction in Diabetic Retinopathy classification tasks.
arXiv Detail & Related papers (2025-04-30T09:38:47Z) - A Hybrid Fully Convolutional CNN-Transformer Model for Inherently Interpretable Disease Detection from Retinal Fundus Images [5.079347310071244]
We introduce an interpretable-by-design hybrid fully convolutional CNN-Transformer architecture for retinal disease detection.<n>Our model achieves state-of-the-art predictive performance compared to black-box and interpretable models.
arXiv Detail & Related papers (2025-04-11T12:15:22Z) - DGSSA: Domain generalization with structural and stylistic augmentation for retinal vessel segmentation [17.396365010722423]
Retinal vascular morphology is crucial for diagnosing diseases such as diabetes, glaucoma, and hypertension.<n>Traditional segmentation methods assume that training and testing data share similar distributions, which can lead to poor performance on unseen domains.<n>This paper presents a novel approach, DGSSA, for retinal vessel image segmentation that enhances model generalization by combining structural and style augmentation strategies.
arXiv Detail & Related papers (2025-01-07T01:47:57Z) - Generalizing to Unseen Domains in Diabetic Retinopathy Classification [8.59772105902647]
We study the problem of generalizing a model to unseen distributions or domains in diabetic retinopathy classification.
We propose a simple and effective domain generalization (DG) approach that achieves self-distillation in vision transformers.
We report the performance of several state-of-the-art DG methods on open-source DR classification datasets.
arXiv Detail & Related papers (2023-10-26T09:11:55Z) - Brain Imaging-to-Graph Generation using Adversarial Hierarchical Diffusion Models for MCI Causality Analysis [44.45598796591008]
Brain imaging-to-graph generation (BIGG) framework is proposed to map functional magnetic resonance imaging (fMRI) into effective connectivity for mild cognitive impairment analysis.
The hierarchical transformers in the generator are designed to estimate the noise at multiple scales.
Evaluations of the ADNI dataset demonstrate the feasibility and efficacy of the proposed model.
arXiv Detail & Related papers (2023-05-18T06:54:56Z) - Unsupervised ensemble-based phenotyping helps enhance the
discoverability of genes related to heart morphology [57.25098075813054]
We propose a new framework for gene discovery entitled Un Phenotype Ensembles.
It builds a redundant yet highly expressive representation by pooling a set of phenotypes learned in an unsupervised manner.
These phenotypes are then analyzed via (GWAS), retaining only highly confident and stable associations.
arXiv Detail & Related papers (2023-01-07T18:36:44Z) - Differentiable Agent-based Epidemiology [71.81552021144589]
We introduce GradABM: a scalable, differentiable design for agent-based modeling that is amenable to gradient-based learning with automatic differentiation.
GradABM can quickly simulate million-size populations in few seconds on commodity hardware, integrate with deep neural networks and ingest heterogeneous data sources.
arXiv Detail & Related papers (2022-07-20T07:32:02Z) - A Robust Backpropagation-Free Framework for Images [47.97322346441165]
We present an error kernel driven activation alignment algorithm for image data.
EKDAA accomplishes through the introduction of locally derived error transmission kernels and error maps.
Results are presented for an EKDAA trained CNN that employs a non-differentiable activation function.
arXiv Detail & Related papers (2022-06-03T21:14:10Z) - Retinopathy of Prematurity Stage Diagnosis Using Object Segmentation and
Convolutional Neural Networks [68.96150598294072]
Retinopathy of Prematurity (ROP) is an eye disorder primarily affecting premature infants with lower weights.
It causes proliferation of vessels in the retina and could result in vision loss and, eventually, retinal detachment, leading to blindness.
In recent years, there has been a significant effort to automate the diagnosis using deep learning.
This paper builds upon the success of previous models and develops a novel architecture, which combines object segmentation and convolutional neural networks (CNN)
Our proposed system first trains an object segmentation model to identify the demarcation line at a pixel level and adds the resulting mask as an additional "color" channel in
arXiv Detail & Related papers (2020-04-03T14:07:41Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.