Evaluating Fundus-Specific Foundation Models for Diabetic Macular Edema Detection
- URL: http://arxiv.org/abs/2510.07277v1
- Date: Wed, 08 Oct 2025 17:41:02 GMT
- Title: Evaluating Fundus-Specific Foundation Models for Diabetic Macular Edema Detection
- Authors: Franco Javier Arellano, José Ignacio Orlando
- Abstract summary: Diabetic Macular Edema (DME) is a leading cause of vision loss among patients with Diabetic Retinopathy (DR). Deep learning has shown promising results for automatically detecting this condition from fundus images. It is unclear if Foundation Models (FM) can cope with DME detection in particular.
- Score: 0.19514194744184568
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Diabetic Macular Edema (DME) is a leading cause of vision loss among patients with Diabetic Retinopathy (DR). While deep learning has shown promising results for automatically detecting this condition from fundus images, its application remains challenging due to the limited availability of annotated data. Foundation Models (FM) have emerged as an alternative solution. However, it is unclear whether they can cope with DME detection in particular. In this paper, we systematically compare different FM and standard transfer learning approaches for this task. Specifically, we compare the two most popular FM for retinal images--RETFound and FLAIR--and an EfficientNet-B0 backbone, across different training regimes and evaluation settings in IDRiD, MESSIDOR-2 and OCT-and-Eye-Fundus-Images (OEFI). Results show that despite their scale, FM do not consistently outperform fine-tuned CNNs in this task. In particular, an EfficientNet-B0 ranked first or second in terms of area under the ROC and precision/recall curves in most evaluation settings, with RETFound only showing promising results in OEFI. FLAIR, on the other hand, demonstrated competitive zero-shot performance, achieving notable AUC-PR scores when prompted appropriately. These findings reveal that FM might not be a good tool for fine-grained ophthalmic tasks such as DME detection even after fine-tuning, suggesting that lightweight CNNs remain strong baselines in data-scarce environments.
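The abstract ranks models by area under the ROC and precision/recall curves. As a minimal sketch of what those two metrics measure, the toy functions below compute AUC-ROC as the fraction of (positive, negative) pairs the classifier ranks correctly, and AUC-PR via the standard average-precision estimator. The labels and scores are hypothetical placeholders, not data from the paper.

```python
def auc_roc(labels, scores):
    """AUC-ROC: fraction of positive/negative pairs ranked correctly."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    # Ties between a positive and a negative score count half.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def auc_pr(labels, scores):
    """AUC-PR via average precision: mean precision at each true-positive rank."""
    ranked = sorted(zip(scores, labels), reverse=True)
    tp, precisions = 0, []
    for k, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            tp += 1
            precisions.append(tp / k)
    return sum(precisions) / tp

# Hypothetical DME labels (1 = DME present) and model probabilities.
y_true  = [0, 0, 1, 1, 0, 1]
y_score = [0.1, 0.4, 0.35, 0.8, 0.2, 0.7]
print(f"{auc_roc(y_true, y_score):.3f} {auc_pr(y_true, y_score):.3f}")  # prints "0.889 0.917"
```

AUC-PR is often preferred for imbalanced screening tasks like DME detection because, unlike AUC-ROC, it is insensitive to the (large) pool of true negatives.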
Related papers
- Are Vision Foundation Models Foundational for Electron Microscopy Image Segmentation? [0.6302369456012739]
We study the problem of segmenting mitochondria in electron microscopy (EM) images using two popular public datasets (Lucchi++ and OpenCLIP). We observe that training on a single EM dataset yields good segmentation performance (quantified as foreground Intersection-over-Union). Training on multiple EM datasets leads to severe performance degradation for all models considered.
arXiv Detail & Related papers (2026-02-09T10:55:18Z) - FusionFM: Fusing Eye-specific Foundational Models for Optimized Ophthalmic Diagnosis [36.79693801937608]
Foundation models (FMs) have shown great promise in medical image analysis by improving generalization across diverse downstream tasks. To our knowledge, this is the first study to systematically evaluate both single and fused ophthalmic FMs. We benchmarked four state-of-the-art FMs using standardized datasets from multiple countries and evaluated their performance using AUC and F1 metrics.
arXiv Detail & Related papers (2025-08-15T01:17:52Z) - Benchmarking Foundation Models and Parameter-Efficient Fine-Tuning for Prognosis Prediction in Medical Imaging [40.35825564674249]
This study introduces the first structured benchmark to assess the robustness and efficiency of transfer learning strategies for Foundation Models. Four publicly available COVID-19 chest X-ray datasets were used, covering mortality, severity, and admission. CNNs pretrained on ImageNet and FMs pretrained on general or biomedical datasets were adapted using full fine-tuning, linear probing, and parameter-efficient methods.
arXiv Detail & Related papers (2025-06-23T09:16:04Z) - Is an Ultra Large Natural Image-Based Foundation Model Superior to a Retina-Specific Model for Detecting Ocular and Systemic Diseases? [19.8132297355024]
RETFound and DINOv2 models were evaluated for ocular disease detection and systemic disease prediction tasks. RETFound achieved superior performance over all DINOv2 models in predicting heart failure, infarction, and ischaemic stroke.
arXiv Detail & Related papers (2025-02-10T09:31:39Z) - Deep Learning Ensemble for Predicting Diabetic Macular Edema Onset Using Ultra-Wide Field Color Fundus Image [2.9945018168793025]
Diabetic macular edema (DME) is a severe complication of diabetes. We propose an ensemble method to predict ci-DME onset within a year.
arXiv Detail & Related papers (2024-10-09T02:16:29Z) - Improving Diffusion Models for ECG Imputation with an Augmented Template Prior [43.6099225257178]
Noisy and poor-quality recordings are a major issue for signals collected using mobile health systems.
Recent studies have explored the imputation of missing values in ECG with probabilistic time-series models.
We present a template-guided denoising diffusion probabilistic model (DDPM), PulseDiff, which is conditioned on an informative prior for a range of health conditions.
arXiv Detail & Related papers (2023-10-24T11:34:15Z) - Data-Efficient Vision Transformers for Multi-Label Disease Classification on Chest Radiographs [55.78588835407174]
Vision Transformers (ViTs) have not been applied to this task despite their high classification performance on generic images.
ViTs do not rely on convolutions but on patch-based self-attention and in contrast to CNNs, no prior knowledge of local connectivity is present.
Our results show that while the performance between ViTs and CNNs is on par with a small benefit for ViTs, DeiTs outperform the former if a reasonably large data set is available for training.
arXiv Detail & Related papers (2022-08-17T09:07:45Z) - A Deep Learning-Based Unified Framework for Red Lesions Detection on Retinal Fundus Images [0.5018156030818883]
Red lesions, namely microaneurysms (MAs) and hemorrhages (HMs), are early signs of diabetic retinopathy (DR). Most existing methods detect either only MAs or only HMs because of the differences in their texture, size, and morphology. We propose a two-stream red lesion detection system dealing simultaneously with small and large red lesions.
arXiv Detail & Related papers (2021-09-10T00:12:13Z) - Many-to-One Distribution Learning and K-Nearest Neighbor Smoothing for Thoracic Disease Identification [83.6017225363714]
Deep learning has become the most powerful computer-aided diagnosis technology for improving disease identification performance.
For chest X-ray imaging, annotating large-scale data requires professional domain knowledge and is time-consuming.
In this paper, we propose many-to-one distribution learning (MODL) and K-nearest neighbor smoothing (KNNS) methods to improve a single model's disease identification performance.
arXiv Detail & Related papers (2021-02-26T02:29:30Z) - Federated Deep AUC Maximization for Heterogeneous Data with a Constant Communication Complexity [77.78624443410216]
We propose improved FDAM algorithms for detecting heterogeneous chest data.
A key result of this paper is that the communication complexity of the proposed algorithm is independent of the number of machines and of the target accuracy level.
Experiments have demonstrated the effectiveness of our FDAM algorithm on benchmark datasets and on medical chest Xray images from different organizations.
arXiv Detail & Related papers (2021-02-09T04:05:19Z) - Robust Deep AUC Maximization: A New Surrogate Loss and Empirical Studies on Medical Image Classification [63.44396343014749]
We propose a new margin-based surrogate loss function for the AUC score. It is more robust than the commonly used square loss while enjoying the same advantage in terms of large-scale optimization.
To the best of our knowledge, this is the first work that makes DAM succeed on large-scale medical image datasets.
arXiv Detail & Related papers (2020-12-06T03:41:51Z)
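The robustness claim in the last entry can be illustrated with a toy pairwise comparison of the two surrogates. The sketch below uses a generic textbook pairwise formulation, not necessarily the exact min-max objective proposed in the paper: the square loss keeps penalizing positive/negative pairs that are already separated beyond the margin, while the margin-based (squared-hinge) surrogate assigns them zero loss.

```python
def pairwise_auc_losses(pos_scores, neg_scores, margin=1.0):
    """Mean square loss and margin-based squared-hinge loss over all
    (positive, negative) score pairs. Illustrative only."""
    sq, hinge_sq = 0.0, 0.0
    for p in pos_scores:
        for n in neg_scores:
            diff = p - n
            # Square loss penalizes any deviation from the margin,
            # even when the pair is already well separated (diff > margin).
            sq += (margin - diff) ** 2
            # Margin-based loss is zero once the pair is separated by the margin.
            hinge_sq += max(0.0, margin - diff) ** 2
    npairs = len(pos_scores) * len(neg_scores)
    return sq / npairs, hinge_sq / npairs

# A well-separated pair (diff = 3 > margin = 1) still incurs square loss,
# but contributes nothing to the margin-based surrogate:
print(pairwise_auc_losses([3.0], [0.0]))  # prints "(4.0, 0.0)"
```

This insensitivity to easy, already-separated pairs is what makes the margin-based surrogate more robust to outliers and noisy labels in large-scale medical image classification.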
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.