EyeDiff: text-to-image diffusion model improves rare eye disease diagnosis
- URL: http://arxiv.org/abs/2411.10004v1
- Date: Fri, 15 Nov 2024 07:30:53 GMT
- Title: EyeDiff: text-to-image diffusion model improves rare eye disease diagnosis
- Authors: Ruoyu Chen, Weiyi Zhang, Bowen Liu, Xiaolan Chen, Pusheng Xu, Shunming Liu, Mingguang He, Danli Shi
- Abstract summary: EyeDiff is a text-to-image model designed to generate multimodal ophthalmic images from natural language prompts.
EyeDiff is trained on eight large-scale datasets and is adapted to ten multi-country external datasets.
- Score: 7.884451100342276
- License:
- Abstract: The rising prevalence of vision-threatening retinal diseases poses a significant burden on global healthcare systems. Deep learning (DL) offers a promising solution for automatic disease screening but demands substantial data. Collecting and labeling large volumes of ophthalmic images across various modalities faces several real-world challenges, especially for rare diseases. Here, we introduce EyeDiff, a text-to-image model designed to generate multimodal ophthalmic images from natural language prompts, and evaluate its applicability in diagnosing common and rare diseases. EyeDiff is trained on eight large-scale datasets using the advanced latent diffusion model, covering 14 ophthalmic image modalities and over 80 ocular diseases, and is adapted to ten multi-country external datasets. The generated images accurately capture essential lesional characteristics, achieving high alignment with text prompts as evaluated by objective metrics and human experts. Furthermore, integrating generated images significantly enhances the accuracy of detecting minority classes and rare eye diseases, surpassing traditional oversampling methods in addressing data imbalance. EyeDiff effectively tackles the data imbalance and insufficiency typically encountered in rare diseases and addresses the challenges of collecting large-scale annotated images, offering a transformative solution to enhance the development of expert-level disease diagnosis models in the ophthalmic field.
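The abstract contrasts "traditional oversampling" with adding generated images to the minority classes. The following is a minimal sketch of that contrast, not the authors' pipeline: the abstract does not say which oversampling method was used, so plain random oversampling is assumed here, and the function names, the `generated` mapping, and the string stand-ins for images are all hypothetical.

```python
import random
from collections import Counter


def random_oversample(samples, labels, seed=0):
    """Duplicate minority-class samples at random until every class
    has as many samples as the largest class (assumed baseline)."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_samples, out_labels = list(samples), list(labels)
    for cls, n in counts.items():
        pool = [s for s, y in zip(samples, labels) if y == cls]
        extra = [rng.choice(pool) for _ in range(target - n)]
        out_samples.extend(extra)
        out_labels.extend([cls] * len(extra))
    return out_samples, out_labels


def augment_with_generated(samples, labels, generated, seed=0):
    """Top up each minority class with distinct synthetic samples
    instead of duplicates.

    `generated` is a hypothetical mapping from class label to a list of
    synthetic samples, e.g. images produced by a text-to-image model.
    A class is only topped up as far as its synthetic pool allows.
    """
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_samples, out_labels = list(samples), list(labels)
    for cls, n in counts.items():
        pool = generated.get(cls, [])
        k = min(target - n, len(pool))
        out_samples.extend(rng.sample(pool, k))
        out_labels.extend([cls] * k)
    return out_samples, out_labels


# Toy data: four "common" samples and a single "rare" one.
samples = ["a1", "a2", "a3", "a4", "b1"]
labels = ["common"] * 4 + ["rare"]

xs, ys = random_oversample(samples, labels)
xs2, ys2 = augment_with_generated(
    samples, labels, {"rare": ["g1", "g2", "g3", "g4", "g5"]}
)
```

Both calls balance the two classes, but random oversampling does so by repeating the single rare sample, while the second variant adds distinct synthetic samples, which is the property the abstract credits for the improvement on rare diseases.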
Related papers
- EyeCLIP: A visual-language foundation model for multi-modal ophthalmic image analysis [20.318178211934985]
We propose EyeCLIP, a visual-language foundation model developed using over 2.77 million ophthalmology images with partial text data.
EyeCLIP can be transferred to a wide range of downstream tasks involving ocular and systemic diseases.
arXiv Detail & Related papers (2024-09-10T17:00:19Z)
- A Disease-Specific Foundation Model Using Over 100K Fundus Images: Release and Validation for Abnormality and Multi-Disease Classification on Downstream Tasks [0.0]
We developed a Fundus-Specific Pretrained Model (Image+Fundus), a supervised artificial intelligence model trained to detect abnormalities in fundus images.
A total of 57,803 images were used to develop this pretrained model, which achieved superior performance across various downstream tasks.
arXiv Detail & Related papers (2024-08-16T15:03:06Z)
- Potential of Multimodal Large Language Models for Data Mining of Medical Images and Free-text Reports [51.45762396192655]
Multimodal large language models (MLLMs) have recently transformed many domains, significantly affecting the medical field. Notably, Gemini-Vision-series (Gemini) and GPT-4-series (GPT-4) models have epitomized a paradigm shift in Artificial General Intelligence for computer vision.
This study evaluated the performance of Gemini, GPT-4, and four popular large models across 14 medical imaging datasets.
arXiv Detail & Related papers (2024-07-08T09:08:42Z)
- EyeFound: A Multimodal Generalist Foundation Model for Ophthalmic Imaging [13.88319807760491]
We present EyeFound, a multimodal foundation model for ophthalmic images.
It learns generalizable representations from unlabeled multimodal retinal images.
It is trained on 2.78 million images from 227 hospitals across 11 ophthalmic modalities.
arXiv Detail & Related papers (2024-05-18T17:03:39Z)
- Diagnosis of Multiple Fundus Disorders Amidst a Scarcity of Medical Experts Via Self-supervised Machine Learning [13.174267261284733]
Fundus diseases are major causes of visual impairment and blindness worldwide.
We propose a general self-supervised machine learning framework that can handle diverse fundus diseases from unlabeled fundus images.
arXiv Detail & Related papers (2024-04-20T14:15:25Z)
- Reliable Multimodality Eye Disease Screening via Mixture of Student's t Distributions [49.4545260500952]
We introduce a novel multimodality evidential fusion pipeline for eye disease screening, EyeMoSt.
Our model estimates both local uncertainty for unimodality and global uncertainty for the fusion modality to produce reliable classification results.
Our experimental findings on both public and in-house datasets show that our model is more reliable than current methods.
arXiv Detail & Related papers (2023-03-17T06:18:16Z)
- Generative Residual Attention Network for Disease Detection [51.60842580044539]
We present a novel approach for disease generation in X-rays using conditional generative adversarial learning.
We generate a corresponding radiology image in a target domain while preserving the identity of the patient.
We then use the generated X-ray image in the target domain to augment our training to improve the detection performance.
arXiv Detail & Related papers (2021-10-25T14:15:57Z)
- Assessing glaucoma in retinal fundus photographs using Deep Feature Consistent Variational Autoencoders [63.391402501241195]
Glaucoma is challenging to detect since it remains asymptomatic until the symptoms are severe.
Early identification of glaucoma is generally made based on functional, structural, and clinical assessments.
Deep learning methods have partially solved this dilemma by bypassing the marker identification stage and analyzing high-level information directly to classify the data.
arXiv Detail & Related papers (2021-10-04T16:06:49Z)
- An Interpretable Multiple-Instance Approach for the Detection of referable Diabetic Retinopathy from Fundus Images [72.94446225783697]
We propose a machine learning system for the detection of referable Diabetic Retinopathy in fundus images.
By extracting local information from image patches and combining it efficiently through an attention mechanism, our system is able to achieve high classification accuracy.
We evaluate our approach on publicly available retinal image datasets, in which it exhibits near state-of-the-art performance.
arXiv Detail & Related papers (2021-03-02T13:14:15Z)
- Many-to-One Distribution Learning and K-Nearest Neighbor Smoothing for Thoracic Disease Identification [83.6017225363714]
Deep learning has become the most powerful computer-aided diagnosis technology for improving disease identification performance.
For chest X-ray imaging, annotating large-scale data requires professional domain knowledge and is time-consuming.
In this paper, we propose many-to-one distribution learning (MODL) and K-nearest neighbor smoothing (KNNS) methods to improve a single model's disease identification performance.
arXiv Detail & Related papers (2021-02-26T02:29:30Z)
- Improving Robustness using Joint Attention Network For Detecting Retinal Degeneration From Optical Coherence Tomography Images [0.0]
We propose the use of disease-specific feature representation as a novel architecture comprised of two joint networks.
Our experimental results on publicly available datasets show the proposed joint-network significantly improves the accuracy and robustness of state-of-the-art retinal disease classification networks on unseen datasets.
arXiv Detail & Related papers (2020-05-16T20:32:49Z)
This list is automatically generated from the titles and abstracts of the papers on this site.