Related papers: Disentanglement and Assessment of Shortcuts in Ophthalmological Retinal Imaging Exams

URL: http://arxiv.org/abs/2507.09640v1
Date: Sun, 13 Jul 2025 14:11:41 GMT
Title: Disentanglement and Assessment of Shortcuts in Ophthalmological Retinal Imaging Exams
Authors: Leonor Fernandes, Tiago Gonçalves, João Matos, Luis Filipe Nakayama, Jaime S. Cardoso
Abstract summary: Diabetic retinopathy (DR) is a leading cause of vision loss in working-age adults. While screening reduces the risk of blindness, traditional imaging is often costly and inaccessible. This work evaluates the fairness and performance of image-trained models in DR prediction.
Score: 2.691914632256091
License: http://creativecommons.org/licenses/by-nc-nd/4.0/
Abstract: Diabetic retinopathy (DR) is a leading cause of vision loss in working-age adults. While screening reduces the risk of blindness, traditional imaging is often costly and inaccessible. Artificial intelligence (AI) algorithms present a scalable diagnostic solution, but concerns regarding fairness and generalization persist. This work evaluates the fairness and performance of image-trained models in DR prediction, as well as the impact of disentanglement as a bias mitigation technique, using the diverse mBRSET fundus dataset. Three models, ConvNeXt V2, DINOv2, and Swin V2, were trained on macula images to predict DR and sensitive attributes (SAs) (e.g., age and gender/sex). Fairness was assessed between subgroups of SAs, and disentanglement was applied to reduce bias. All models achieved high DR prediction performance in diagnosing (up to 94% AUROC) and could reasonably predict age and gender/sex (91% and 77% AUROC, respectively). Fairness assessment suggests disparities, such as a 10% AUROC gap between age groups in DINOv2. Disentangling SAs from DR prediction had varying results, depending on the model selected. Disentanglement improved DINOv2 performance (2% AUROC gain), but led to performance drops in ConvNeXt V2 and Swin V2 (7% and 3%, respectively). These findings highlight the complexity of disentangling fine-grained features in fundus imaging and emphasize the importance of fairness in medical imaging AI to ensure equitable and reliable healthcare solutions.
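
The abstract reports fairness as an AUROC gap between subgroups of a sensitive attribute but does not say how the gap is computed. The following is a minimal sketch of one plausible reading, assuming the gap is the difference between the best and worst per-subgroup AUROC. The function names (`auroc`, `subgroup_auroc_gap`) and the toy data are hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def auroc(labels, scores):
    """AUROC as the probability that a random positive outscores a random
    negative, counting ties as one half (pairwise Mann-Whitney form)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def subgroup_auroc_gap(labels, scores, groups):
    """Return per-subgroup AUROC and the max-min gap across subgroups.

    Assumption: 'gap' means best subgroup AUROC minus worst subgroup AUROC.
    """
    by_group = defaultdict(lambda: ([], []))
    for y, s, g in zip(labels, scores, groups):
        by_group[g][0].append(y)
        by_group[g][1].append(s)
    per_group = {g: auroc(ys, ss) for g, (ys, ss) in by_group.items()}
    gap = max(per_group.values()) - min(per_group.values())
    return per_group, gap


# Toy data (made up for illustration): binary DR labels, model scores,
# and a hypothetical age subgroup for each image.
toy_labels = [1, 0, 1, 0, 1, 0, 1, 0]
toy_scores = [0.9, 0.1, 0.8, 0.2, 0.6, 0.7, 0.9, 0.3]
toy_groups = ["younger"] * 4 + ["older"] * 4

per_group, gap = subgroup_auroc_gap(toy_labels, toy_scores, toy_groups)
print(per_group, gap)
```

The pairwise form is quadratic in the number of samples, which is fine for a sketch; a library routine such as scikit-learn's `roc_auc_score` would be the usual choice on a real dataset.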

Related papers

Patch-Based and Non-Patch-Based inputs Comparison into Deep Neural Models: Application for the Segmentation of Retinal Diseases on Optical Coherence Tomography Volumes [0.3749861135832073]
Nearly 170 million people worldwide have been diagnosed with AMD, a figure expected to rise to 288 million by 2040. Deep learning networks have shown promising results in both image and pixel-level 2D scan classification. For SRF fluid segmentation, the highest DSC score for a patch-based model was 0.88, compared with 0.71 for the same model with non-patch-based inputs.
arXiv Detail & Related papers (2025-01-22T10:22:08Z)
Evaluating General Purpose Vision Foundation Models for Medical Image Analysis: An Experimental Study of DINOv2 on Radiology Benchmarks [5.8941124219471055]
DINOv2 is an open-source foundation model pre-trained with self-supervised learning on 142 million curated natural images. This study comprehensively evaluates the performance of DINOv2 for radiology.
arXiv Detail & Related papers (2023-12-04T21:47:10Z)
DRAC: Diabetic Retinopathy Analysis Challenge with Ultra-Wide Optical Coherence Tomography Angiography Images [51.27125547308154]
We organized a challenge named "DRAC - Diabetic Retinopathy Analysis Challenge" in conjunction with the 25th International Conference on Medical Image Computing and Computer Assisted Intervention (MICCAI 2022). The challenge consists of three tasks: segmentation of DR lesions, image quality assessment and DR grading. This paper presents a summary and analysis of the top-performing solutions and results for each task of the challenge.
arXiv Detail & Related papers (2023-04-05T12:04:55Z)
GraVIS: Grouping Augmented Views from Independent Sources for Dermatology Analysis [52.04899592688968]
We propose GraVIS, which is specifically optimized for learning self-supervised features from dermatology images. GraVIS significantly outperforms its transfer learning and self-supervised learning counterparts in both lesion segmentation and disease classification tasks.
arXiv Detail & Related papers (2023-01-11T11:38:37Z)
An Ensemble Method to Automatically Grade Diabetic Retinopathy with Optical Coherence Tomography Angiography Images [4.640835690336653]
We propose an ensemble method to automatically grade diabetic retinopathy (DR) images from the Diabetic Retinopathy Analysis Challenge (DRAC) 2022. First, we adopt state-of-the-art classification networks and train them to grade UW-OCTA images with different splits of the available dataset. Ultimately, we obtain 25 models, of which the top 16 are selected and ensembled to generate the final predictions.
arXiv Detail & Related papers (2022-12-12T22:06:47Z)
Automated SSIM Regression for Detection and Quantification of Motion Artefacts in Brain MR Images [54.739076152240024]
Motion artefacts in magnetic resonance brain images are a crucial issue. The assessment of MR image quality is fundamental before proceeding with the clinical diagnosis. An automated image quality assessment based on the structural similarity index (SSIM) regression has been proposed here.
arXiv Detail & Related papers (2022-06-14T10:16:54Z)
Visual Acuity Prediction on Real-Life Patient Data Using a Machine Learning Based Multistage System [0.40151799356083057]
The prediction of the visual acuity (VA) and the earliest possible detection of deterioration under real-life conditions is challenging due to heterogeneous and incomplete data. We present a workflow for the development of a research-compatible data corpus fusing different IT systems of the department of ophthalmology of a German maximum care hospital. We achieve a final prediction accuracy of 69 % in macro average F1-score, while being in the same range as the ophthalmologists with 57.8 and 50 ± 10.7 % F1-score.
arXiv Detail & Related papers (2022-04-25T21:20:27Z)
Performance or Trust? Why Not Both. Deep AUC Maximization with Self-Supervised Learning for COVID-19 Chest X-ray Classifications [72.52228843498193]
In training deep learning models, a compromise often must be made between performance and trust. In this work, we integrate a new surrogate loss with self-supervised learning for computer-aided screening of COVID-19 patients.
arXiv Detail & Related papers (2021-12-14T21:16:52Z)
On the Robustness of Pretraining and Self-Supervision for a Deep Learning-based Analysis of Diabetic Retinopathy [70.71457102672545]
We compare the impact of different training procedures for diabetic retinopathy grading. We investigate different aspects such as quantitative performance, statistics of the learned feature representations, interpretability and robustness to image distortions. Our results indicate that models from ImageNet pretraining report a significant increase in performance, generalization and robustness to image distortions.
arXiv Detail & Related papers (2021-06-25T08:32:45Z)
An Interpretable Multiple-Instance Approach for the Detection of referable Diabetic Retinopathy from Fundus Images [72.94446225783697]
We propose a machine learning system for the detection of referable Diabetic Retinopathy in fundus images. By extracting local information from image patches and combining it efficiently through an attention mechanism, our system is able to achieve high classification accuracy. We evaluate our approach on publicly available retinal image datasets, in which it exhibits near state-of-the-art performance.
arXiv Detail & Related papers (2021-03-02T13:14:15Z)
Predictive Analysis of Diabetic Retinopathy with Transfer Learning [0.0]
This paper studies the performance of CNN architectures for Diabetic Retinopathy Classification with the help of Transfer Learning. The results indicate that Transfer Learning with ImageNet weights using VGG 16 model demonstrates the best classification performance with the best Accuracy of 95%.
arXiv Detail & Related papers (2020-11-08T18:54:57Z)
A Benchmark for Studying Diabetic Retinopathy: Segmentation, Grading, and Transferability [76.64661091980531]
People with diabetes are at risk of developing diabetic retinopathy (DR). Computer-aided DR diagnosis is a promising tool for early detection of DR and severity grading. This dataset has 1,842 images with pixel-level DR-related lesion annotations, and 1,000 images with image-level labels graded by six board-certified ophthalmologists.
arXiv Detail & Related papers (2020-08-22T07:48:04Z)
