Cross-Fundus Transformer for Multi-modal Diabetic Retinopathy Grading with Cataract
- URL: http://arxiv.org/abs/2411.00726v1
- Date: Fri, 01 Nov 2024 16:38:49 GMT
- Title: Cross-Fundus Transformer for Multi-modal Diabetic Retinopathy Grading with Cataract
- Authors: Fan Xiao, Junlin Hou, Ruiwei Zhao, Rui Feng, Haidong Zou, Lina Lu, Yi Xu, Juzhao Zhang,
- Abstract summary: Diabetic retinopathy (DR) is a leading cause of blindness worldwide and a common complication of diabetes.
This study explores a novel multi-modal deep learning framework to fuse the information from color fundus photography (CFP) and infrared fundus photography (IFP) towards more accurate DR grading.
- Score: 17.77175890577782
- License:
- Abstract: Diabetic retinopathy (DR) is a leading cause of blindness worldwide and a common complication of diabetes. As two different imaging tools for DR grading, color fundus photography (CFP) and infrared fundus photography (IFP) are highly-correlated and complementary in clinical applications. To the best of our knowledge, this is the first study that explores a novel multi-modal deep learning framework to fuse the information from CFP and IFP towards more accurate DR grading. Specifically, we construct a dual-stream architecture Cross-Fundus Transformer (CFT) to fuse the ViT-based features of two fundus image modalities. In particular, a meticulously engineered Cross-Fundus Attention (CFA) module is introduced to capture the correspondence between CFP and IFP images. Moreover, we adopt both the single-modality and multi-modality supervisions to maximize the overall performance for DR grading. Extensive experiments on a clinical dataset consisting of 1,713 pairs of multi-modal fundus images demonstrate the superiority of our proposed method. Our code will be released for public access.
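The abstract names three ingredients: two token streams (CFP and IFP), an attention module that relates them, and supervision on both the single-modality and the fused predictions. The NumPy sketch below illustrates that general pattern only. It uses generic scaled dot-product cross-attention, not the paper's actual CFA module, whose internals the abstract does not describe. All sizes, weights, the label, the shared projection matrices, and the unweighted sum of losses are assumptions made for illustration.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(query_tokens, context_tokens, w_q, w_k, w_v):
    """Tokens of one modality attend to tokens of the other modality."""
    q = query_tokens @ w_q                    # (n_query, dim)
    k = context_tokens @ w_k                  # (n_context, dim)
    v = context_tokens @ w_v                  # (n_context, dim)
    scores = q @ k.T / np.sqrt(q.shape[-1])   # scaled dot-product similarity
    weights = softmax(scores, axis=-1)        # each row sums to 1
    return weights @ v, weights

def cross_entropy(logits, label):
    # Negative log-likelihood of the true class under softmax(logits).
    shifted = logits - logits.max()
    log_probs = shifted - np.log(np.exp(shifted).sum())
    return -log_probs[label]

rng = np.random.default_rng(0)
n_tokens, dim, n_grades = 5, 8, 5             # hypothetical sizes

# Stand-ins for ViT token features of one CFP/IFP image pair.
cfp_tokens = rng.normal(size=(n_tokens, dim))
ifp_tokens = rng.normal(size=(n_tokens, dim))

# Projection weights, shared across both directions for brevity (an assumption).
w_q, w_k, w_v = (rng.normal(size=(dim, dim)) * 0.1 for _ in range(3))

# Each stream queries the other one.
cfp_enriched, attn_cfp = cross_attention(cfp_tokens, ifp_tokens, w_q, w_k, w_v)
ifp_enriched, attn_ifp = cross_attention(ifp_tokens, cfp_tokens, w_q, w_k, w_v)

# Pool tokens per stream, then concatenate for the multi-modal branch.
cfp_feat = cfp_enriched.mean(axis=0)
ifp_feat = ifp_enriched.mean(axis=0)
fused_feat = np.concatenate([cfp_feat, ifp_feat])

# Hypothetical linear classification heads.
head_cfp = rng.normal(size=(dim, n_grades)) * 0.1
head_ifp = rng.normal(size=(dim, n_grades)) * 0.1
head_fused = rng.normal(size=(2 * dim, n_grades)) * 0.1

label = 2                                     # hypothetical DR grade
loss_cfp = cross_entropy(cfp_feat @ head_cfp, label)
loss_ifp = cross_entropy(ifp_feat @ head_ifp, label)
loss_fused = cross_entropy(fused_feat @ head_fused, label)

# Single-modality plus multi-modality supervision, summed without weighting.
total_loss = loss_cfp + loss_ifp + loss_fused
```

In a trainable version the projections and heads would be learned parameters, and the relative weighting of the three loss terms would be a design choice that this sketch does not address.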
Related papers
- MultiEYE: Dataset and Benchmark for OCT-Enhanced Retinal Disease Recognition from Fundus Images [4.885485496458059]
We present the first large multi-modal multi-class dataset for eye disease diagnosis, MultiEYE.
We propose an OCT-assisted Conceptual Distillation Approach (OCT-CoDA) to extract disease-related knowledge from OCT images.
Our proposed OCT-CoDA demonstrates remarkable results and interpretability, showing great potential for clinical application.
arXiv Detail & Related papers (2024-12-12T16:08:43Z)
- SDR-Former: A Siamese Dual-Resolution Transformer for Liver Lesion Classification Using 3D Multi-Phase Imaging [59.78761085714715]
This study proposes a novel Siamese Dual-Resolution Transformer (SDR-Former) framework for liver lesion classification.
The proposed framework has been validated through comprehensive experiments on two clinical datasets.
To support the scientific community, we are releasing our extensive multi-phase MR dataset for liver lesion analysis to the public.
arXiv Detail & Related papers (2024-02-27T06:32:56Z)
- Improved Automatic Diabetic Retinopathy Severity Classification Using Deep Multimodal Fusion of UWF-CFP and OCTA Images [1.6449510885987357]
Diabetic Retinopathy (DR), a prevalent and severe complication of diabetes, affects millions of individuals globally.
Recent advancements in imaging technologies provide opportunities for the early detection of DR but also pose significant challenges.
This study introduces a novel multimodal approach that leverages these imaging modalities to notably enhance DR classification.
arXiv Detail & Related papers (2023-10-03T09:35:38Z)
- M$^{2}$SNet: Multi-scale in Multi-scale Subtraction Network for Medical Image Segmentation [73.10707675345253]
We propose a general multi-scale in multi-scale subtraction network (M$^{2}$SNet) to handle diverse segmentation tasks on medical images.
Our method performs favorably against most state-of-the-art methods under different evaluation metrics on eleven datasets of four different medical image segmentation tasks.
arXiv Detail & Related papers (2023-03-20T06:26:49Z)
- MedSegDiff-V2: Diffusion based Medical Image Segmentation with Transformer [53.575573940055335]
We propose a novel Transformer-based Diffusion framework, called MedSegDiff-V2.
We verify its effectiveness on 20 medical image segmentation tasks with different image modalities.
arXiv Detail & Related papers (2023-01-19T03:42:36Z)
- Cross-Field Transformer for Diabetic Retinopathy Grading on Two-field Fundus Images [9.211425049275798]
We first construct a new benchmark dataset (DRTiD) for DR grading, consisting of 3,100 two-field fundus images.
Then, we propose a novel DR grading approach, namely Cross-Field Transformer (CrossFiT), to capture the correspondence between two fields.
arXiv Detail & Related papers (2022-11-26T12:39:57Z)
- Affinity Feature Strengthening for Accurate, Complete and Robust Vessel Segmentation [48.638327652506284]
Vessel segmentation is crucial in many medical image applications, such as detecting coronary stenoses, retinal vessel diseases and brain aneurysms.
We present a novel approach, the affinity feature strengthening network (AFN), which jointly models geometry and refines pixel-wise segmentation features using a contrast-insensitive, multiscale affinity approach.
arXiv Detail & Related papers (2022-11-12T05:39:17Z)
- Deep-OCTA: Ensemble Deep Learning Approaches for Diabetic Retinopathy Analysis on OCTA Images [10.16138081361263]
We present novel and practical deep-learning solutions based on ultra-wide OCTA for the Diabetic Retinopathy Analysis Challenge (DRAC).
In the segmentation of DR lesions task, we utilize UNet and UNet++ to segment three lesions with strong data augmentation and model ensemble.
In the image quality assessment task, we create an ensemble of InceptionV3, SE-ResNeXt, and Vision Transformer models.
arXiv Detail & Related papers (2022-10-02T13:23:56Z)
- Model-Guided Multi-Contrast Deep Unfolding Network for MRI Super-resolution Reconstruction [68.80715727288514]
We show how to unfold an iterative MGDUN algorithm into a novel model-guided deep unfolding network by taking the MRI observation matrix.
In this paper, we propose a novel Model-Guided interpretable Deep Unfolding Network (MGDUN) for medical image SR reconstruction.
arXiv Detail & Related papers (2022-09-15T03:58:30Z)
- Multi-Modal Multi-Instance Learning for Retinal Disease Recognition [10.294738095942812]
We aim to build a deep neural network that recognizes multiple vision-threatening diseases for the given case.
As both data acquisition and manual labeling are extremely expensive in the medical domain, the network has to be relatively lightweight.
arXiv Detail & Related papers (2021-09-25T08:16:47Z)
- A Benchmark for Studying Diabetic Retinopathy: Segmentation, Grading, and Transferability [76.64661091980531]
People with diabetes are at risk of developing diabetic retinopathy (DR).
Computer-aided DR diagnosis is a promising tool for early detection of DR and severity grading.
This dataset has 1,842 images with pixel-level DR-related lesion annotations, and 1,000 images with image-level labels graded by six board-certified ophthalmologists.
arXiv Detail & Related papers (2020-08-22T07:48:04Z)
This list is automatically generated from the titles and abstracts of the papers in this site.