ELF: An End-to-end Local and Global Multimodal Fusion Framework for Glaucoma Grading
- URL: http://arxiv.org/abs/2311.08032v1
- Date: Tue, 14 Nov 2023 09:51:00 GMT
- Title: ELF: An End-to-end Local and Global Multimodal Fusion Framework for Glaucoma Grading
- Authors: Wenyun Li and Chi-Man Pun
- Abstract summary: We propose an end-to-end local and global multi-modal fusion framework for glaucoma grading, named ELF.
ELF can fully utilize the complementary information between fundus and OCT.
Extensive experiments on the multi-modal GAMMA glaucoma grading dataset demonstrate the effectiveness of ELF.
- Score: 43.12236694270165
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Glaucoma is a chronic neurodegenerative condition that can lead to blindness. Early detection and treatment are essential to keep the disease from worsening. Both 2D fundus images and optical coherence tomography (OCT) help ophthalmologists diagnose glaucoma. Many methods build on fundus images or 3D OCT volumes alone; however, mining multi-modal information from both fundus images and OCT data is less studied. In this work, we propose an end-to-end local and global multi-modal fusion framework for glaucoma grading, named ELF for short. ELF can fully utilize the complementary information between fundus and OCT. In addition, unlike previous methods that simply concatenate multi-modal features, which fails to explore the mutual information between different modalities, ELF takes advantage of both local-wise and global-wise mutual information. Extensive experiments on the multi-modal GAMMA glaucoma grading dataset demonstrate the effectiveness of ELF compared with other state-of-the-art methods.
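The paper's code is not reproduced on this page. As a rough illustration of the idea the abstract describes, the sketch below fuses pre-extracted fundus and OCT features at a local (patch-wise) level via cross-attention and at a global (image-wise) level via pooled descriptors. All names, dimensions, and design choices here are hypothetical assumptions, not the authors' implementation:

```python
# Hypothetical sketch only, not the authors' code. Assumes two backbones have
# already produced fundus features (B, C, H, W) and OCT features (B, C, D, H', W')
# projected to a shared channel dimension C.
import torch
import torch.nn as nn

class LocalGlobalFusion(nn.Module):
    def __init__(self, dim: int = 256, num_heads: int = 4, num_classes: int = 3):
        super().__init__()
        # Local-wise fusion: cross-attention lets each fundus location attend
        # to OCT locations, modeling patch-level interactions.
        self.cross_attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        # Global-wise fusion: combine pooled image-level descriptors of both modalities.
        self.global_mlp = nn.Sequential(nn.Linear(2 * dim, dim), nn.ReLU())
        self.classifier = nn.Linear(2 * dim, num_classes)

    def forward(self, fundus: torch.Tensor, oct_vol: torch.Tensor) -> torch.Tensor:
        f_tokens = fundus.flatten(2).transpose(1, 2)   # (B, H*W, C)
        o_tokens = oct_vol.flatten(2).transpose(1, 2)  # (B, D*H'*W', C)
        local, _ = self.cross_attn(f_tokens, o_tokens, o_tokens)
        local = local.mean(dim=1)                      # (B, C) local fusion summary
        glob = self.global_mlp(
            torch.cat([f_tokens.mean(1), o_tokens.mean(1)], dim=-1))  # (B, C)
        return self.classifier(torch.cat([local, glob], dim=-1))

# Toy usage: batch of 2, 256-dim features, 16x16 fundus grid, 4x8x8 OCT grid.
model = LocalGlobalFusion()
logits = model(torch.randn(2, 256, 16, 16), torch.randn(2, 256, 4, 8, 8))
print(logits.shape)  # torch.Size([2, 3]) -- three glaucoma grades
```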
Related papers
- MMed-RAG: Versatile Multimodal RAG System for Medical Vision Language Models [49.765466293296186]
Recent progress in Medical Large Vision-Language Models (Med-LVLMs) has opened up new possibilities for interactive diagnostic tools.
Med-LVLMs often suffer from factual hallucination, which can lead to incorrect diagnoses.
We propose a versatile multimodal RAG system, MMed-RAG, designed to enhance the factuality of Med-LVLMs.
arXiv Detail & Related papers (2024-10-16T23:03:27Z)
- ETSCL: An Evidence Theory-Based Supervised Contrastive Learning Framework for Multi-modal Glaucoma Grading [7.188153974946432]
Glaucoma is one of the leading causes of vision impairment.
It remains challenging to extract reliable features due to the high similarity of medical images and the unbalanced multi-modal data distribution.
We propose a novel framework, namely ETSCL, which consists of a contrastive feature extraction stage and a decision-level fusion stage.
arXiv Detail & Related papers (2024-07-19T11:57:56Z)
- Fundus-Enhanced Disease-Aware Distillation Model for Retinal Disease Classification from OCT Images [6.72159216082989]
We propose a fundus-enhanced disease-aware distillation model for retinal disease classification from OCT images.
Our framework enhances the OCT model during training by utilizing unpaired fundus images.
Our proposed approach outperforms single-modal, multi-modal, and state-of-the-art distillation methods for retinal disease classification. (A generic distillation-loss sketch appears after this list.)
arXiv Detail & Related papers (2023-08-01T05:13:02Z)
- Reliable Multimodality Eye Disease Screening via Mixture of Student's t Distributions [49.4545260500952]
We introduce a novel multimodality evidential fusion pipeline for eye disease screening, EyeMoSt.
Our model estimates local uncertainty for each single modality and global uncertainty for the fused modality to produce reliable classification results.
Our experimental findings on both public and in-house datasets show that our model is more reliable than current methods. (A generic uncertainty-weighted fusion sketch appears after this list.)
arXiv Detail & Related papers (2023-03-17T06:18:16Z)
- Multimodal Information Fusion for Glaucoma and DR Classification [1.5616442980374279]
Multimodal information is frequently available in medical tasks. By combining information from multiple sources, clinicians are able to make more accurate judgments.
Our paper investigates three multimodal information fusion strategies based on deep learning to solve retinal analysis tasks. (A sketch of the classic fusion points appears after this list.)
arXiv Detail & Related papers (2022-09-02T12:19:03Z)
- GAMMA Challenge: Glaucoma grAding from Multi-Modality imAges [48.98620387924817]
We set up the Glaucoma grAding from Multi-Modality imAges (GAMMA) Challenge to encourage the development of fundus & OCT-based glaucoma grading.
The primary task of the challenge is to grade glaucoma from both the 2D fundus images and 3D OCT scanning volumes.
We have publicly released a glaucoma annotated dataset with both 2D fundus color photography and 3D OCT volumes, which is the first multi-modality dataset for glaucoma grading.
arXiv Detail & Related papers (2022-02-14T06:54:15Z)
- COROLLA: An Efficient Multi-Modality Fusion Framework with Supervised Contrastive Learning for Glaucoma Grading [1.2250035750661867]
We propose an efficient multi-modality supervised contrastive learning framework, named COROLLA, for glaucoma grading.
We employ supervised contrastive learning to increase our models' discriminative power with better convergence.
On the GAMMA dataset, our COROLLA framework achieves superior glaucoma grading performance compared to state-of-the-art methods. (A minimal supervised contrastive loss sketch appears after this list.)
arXiv Detail & Related papers (2022-01-11T06:00:51Z)
- Assessing glaucoma in retinal fundus photographs using Deep Feature Consistent Variational Autoencoders [63.391402501241195]
Glaucoma is challenging to detect since it remains asymptomatic until symptoms are severe.
Early identification of glaucoma is generally made based on functional, structural, and clinical assessments.
Deep learning methods have partially solved this dilemma by bypassing the marker identification stage and analyzing high-level information directly to classify the data.
arXiv Detail & Related papers (2021-10-04T16:06:49Z)
- Modeling and Enhancing Low-quality Retinal Fundus Images [167.02325845822276]
Low-quality fundus images increase uncertainty in clinical observation and lead to the risk of misdiagnosis.
We propose a clinically oriented fundus enhancement network (cofe-Net) to suppress global degradation factors.
Experiments on both synthetic and real images demonstrate that our algorithm effectively corrects low-quality fundus images without losing retinal details.
arXiv Detail & Related papers (2020-05-12T08:01:16Z)
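For the fundus-enhanced distillation entry above: the paper's exact objective is not given in the summary, but a generic knowledge-distillation loss in which a fundus teacher guides an OCT student (the standard Hinton et al., 2015 formulation, used here only as an assumed illustration) looks like this:

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits: torch.Tensor,
                      teacher_logits: torch.Tensor,
                      labels: torch.Tensor,
                      T: float = 2.0, alpha: float = 0.5) -> torch.Tensor:
    # Cross-entropy to the ground-truth labels plus KL divergence to the
    # teacher's temperature-softened predictions. Illustrative only; the
    # paper's actual losses may differ.
    ce = F.cross_entropy(student_logits, labels)
    kd = F.kl_div(F.log_softmax(student_logits / T, dim=-1),
                  F.softmax(teacher_logits / T, dim=-1),
                  reduction='batchmean') * (T * T)
    return alpha * ce + (1.0 - alpha) * kd

# Toy usage: OCT student and fundus teacher each output 3-class logits.
student = torch.randn(4, 3, requires_grad=True)
teacher = torch.randn(4, 3)
labels = torch.randint(0, 3, (4,))
print(distillation_loss(student, teacher, labels))
```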
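For the EyeMoSt entry above: its mixture of Student's t distributions is not reproduced here, but the underlying intuition, down-weighting the less certain modality when fusing, can be illustrated with a simple hypothetical inverse-variance weighting:

```python
import torch

def uncertainty_weighted_fusion(probs_f: torch.Tensor, var_f: torch.Tensor,
                                probs_o: torch.Tensor, var_o: torch.Tensor) -> torch.Tensor:
    # Hypothetical illustration, not EyeMoSt itself: each modality's class
    # probabilities are weighted by the inverse of its predicted uncertainty.
    w_f = 1.0 / (var_f + 1e-6)
    w_o = 1.0 / (var_o + 1e-6)
    return (w_f * probs_f + w_o * probs_o) / (w_f + w_o)

# Toy usage: fundus is confident (low variance), OCT is not, so the fused
# prediction stays close to the fundus prediction.
p_f = torch.tensor([[0.8, 0.1, 0.1]])
p_o = torch.tensor([[0.3, 0.4, 0.3]])
print(uncertainty_weighted_fusion(p_f, torch.tensor([[0.1]]),
                                  p_o, torch.tensor([[1.0]])))
```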
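For the multimodal information fusion entry above: the summary does not name the three strategies it compares, so the sketch below instead shows the three classic points at which fusion is commonly applied (input level, feature level, decision level), with illustrative names only:

```python
import torch
import torch.nn as nn

def input_level_fusion(fundus: torch.Tensor, oct_proj: torch.Tensor) -> torch.Tensor:
    # Early fusion: channel-wise concatenation of spatially aligned inputs,
    # fed to a single downstream network.
    return torch.cat([fundus, oct_proj], dim=1)

class FeatureLevelFusion(nn.Module):
    # Intermediate fusion: concatenate per-modality embeddings, then classify.
    def __init__(self, dim: int = 128, num_classes: int = 3):
        super().__init__()
        self.fundus_net = nn.Sequential(nn.Flatten(), nn.LazyLinear(dim), nn.ReLU())
        self.oct_net = nn.Sequential(nn.Flatten(), nn.LazyLinear(dim), nn.ReLU())
        self.head = nn.Linear(2 * dim, num_classes)

    def forward(self, fundus: torch.Tensor, oct_vol: torch.Tensor) -> torch.Tensor:
        f = self.fundus_net(fundus)
        o = self.oct_net(oct_vol)
        return self.head(torch.cat([f, o], dim=-1))

def decision_level_fusion(logits_f: torch.Tensor, logits_o: torch.Tensor) -> torch.Tensor:
    # Late fusion: average the per-modality class probabilities.
    return (logits_f.softmax(-1) + logits_o.softmax(-1)) / 2
```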
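For the COROLLA and ETSCL entries above: both rely on supervised contrastive learning. A minimal sketch of the standard supervised contrastive loss (Khosla et al., 2020), applied here to hypothetical grade-labeled embeddings with one view per sample, might look like:

```python
import torch
import torch.nn.functional as F

def supervised_contrastive_loss(features: torch.Tensor,
                                labels: torch.Tensor,
                                temperature: float = 0.07) -> torch.Tensor:
    # Pull same-grade embeddings together, push different grades apart.
    z = F.normalize(features, dim=1)
    sim = z @ z.T / temperature                       # (N, N) scaled cosine sims
    n = z.size(0)
    self_mask = torch.eye(n, dtype=torch.bool, device=z.device)
    sim = sim.masked_fill(self_mask, float('-inf'))   # exclude self-pairs
    pos_mask = (labels.unsqueeze(0) == labels.unsqueeze(1)) & ~self_mask
    log_prob = sim - torch.logsumexp(sim, dim=1, keepdim=True)
    log_prob = log_prob.masked_fill(~pos_mask, 0.0)   # keep only positive pairs
    pos_counts = pos_mask.sum(1)
    valid = pos_counts > 0                            # anchors with >= 1 positive
    return (-log_prob.sum(1)[valid] / pos_counts[valid]).mean()

# Toy usage: 8 multimodal embeddings, 3 glaucoma grades.
feats = torch.randn(8, 128)
grades = torch.randint(0, 3, (8,))
print(supervised_contrastive_loss(feats, grades))
```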
This list is automatically generated from the titles and abstracts of the papers on this site.