Related papers: Exploring the Transfer Learning Capabilities of CLIP in Domain Generalization for Diabetic Retinopathy

URL: http://arxiv.org/abs/2308.14212v1
Date: Sun, 27 Aug 2023 22:02:41 GMT
Title: Exploring the Transfer Learning Capabilities of CLIP in Domain Generalization for Diabetic Retinopathy
Authors: Sanoojan Baliah, Fadillah A. Maani, Santosh Sanjeev and Muhammad Haris Khan
Abstract summary: Cross-domain generalization is a challenging problem in the medical domain. Recent studies have shown the effectiveness of using CLIP to handle the DG problem in natural images. In this study, we investigate CLIP's transfer learning capabilities and its potential for cross-domain generalization in diabetic retinopathy (DR) classification.
Score: 7.649900082537232
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Diabetic Retinopathy (DR), a leading cause of vision impairment, requires early detection and treatment. Developing robust AI models for DR classification holds substantial potential, but a key challenge is ensuring their generalization in unfamiliar domains with varying data distributions. To address this, our paper investigates cross-domain generalization, also known as domain generalization (DG), within the context of DR classification. DG, a challenging problem in the medical domain, is complicated by the difficulty of gathering labeled data across different domains, such as patient demographics and disease stages. Some recent studies have shown the effectiveness of using CLIP to handle the DG problem in natural images. In this study, we investigate CLIP's transfer learning capabilities and its potential for cross-domain generalization in diabetic retinopathy (DR) classification. We carry out comprehensive experiments to assess the efficacy and potential of CLIP in addressing DG for DR classification. Further, we introduce a multi-modal fine-tuning strategy named Context Optimization with Learnable Visual Tokens (CoOpLVT), which enhances context optimization by conditioning on visual features. Our findings demonstrate that the proposed method increases the F1-score by 1.8% over the baseline, thus underlining its promise for effective DG in DR classification. Our code is publicly available at https://github.com/Sanoojan/CLIP-DRDG.
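The abstract describes CoOpLVT only at a high level: context optimization that is conditioned on visual features. The following is a minimal NumPy sketch of that general idea, not the authors' implementation (their code is at the repository linked above). It assumes that image features are projected into per-image tokens and added to shared learnable context tokens, and it replaces the frozen CLIP text encoder with a mean-pooling stand-in. The names `visual_conditioned_logits`, `toy_text_encoder`, and `W_proj`, as well as all shapes and the temperature value, are hypothetical.

```python
import numpy as np


def l2_normalize(x, axis=-1, eps=1e-8):
    """Scale vectors to (approximately) unit length along `axis`."""
    return x / (np.linalg.norm(x, axis=axis, keepdims=True) + eps)


def toy_text_encoder(tokens):
    """Stand-in for a frozen text encoder (assumption): mean-pool the tokens."""
    return tokens.mean(axis=-2)


def visual_conditioned_logits(image_feats, class_embs, ctx, W_proj, temperature=0.07):
    """Score each image against each class prompt.

    image_feats: (B, D) image embeddings
    class_embs:  (C, D) class-name token embeddings
    ctx:         (M, D) shared learnable context tokens (CoOp-style)
    W_proj:      (D, M*D) hypothetical projection from image feature to M tokens
    returns:     (B, C) similarity logits
    """
    B, D = image_feats.shape
    M = ctx.shape[0]
    C = class_embs.shape[0]

    # Project each image feature into M visual tokens and shift the shared context.
    visual_tokens = (image_feats @ W_proj).reshape(B, M, D)
    cond_ctx = ctx[None] + visual_tokens  # (B, M, D)

    # Build one prompt per (image, class): [context tokens..., class token].
    prompts = np.concatenate(
        [
            np.broadcast_to(cond_ctx[:, None], (B, C, M, D)),
            np.broadcast_to(class_embs[None, :, None], (B, C, 1, D)),
        ],
        axis=2,
    )  # (B, C, M + 1, D)

    text_feats = l2_normalize(toy_text_encoder(prompts))  # (B, C, D)
    img = l2_normalize(image_feats)  # (B, D)

    # Cosine similarity between each image and its own set of class prompts.
    return np.einsum("bd,bcd->bc", img, text_feats) / temperature


# Illustrative run with random values; 5 classes is an arbitrary choice here.
rng = np.random.default_rng(0)
B, C, M, D = 4, 5, 3, 16
image_feats = rng.normal(size=(B, D))
class_embs = rng.normal(size=(C, D))
ctx = rng.normal(size=(M, D)) * 0.02
W_proj = rng.normal(size=(D, M * D)) * 0.02

logits = visual_conditioned_logits(image_feats, class_embs, ctx, W_proj)
shifted = logits - logits.max(axis=1, keepdims=True)
probs = np.exp(shifted) / np.exp(shifted).sum(axis=1, keepdims=True)
print(logits.shape)
```

In a real setup, `ctx` and `W_proj` would be the trainable parameters, while the image and text encoders would be pretrained networks. Whether the visual conditioning is additive, as assumed here, is not stated in the abstract.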

Related papers

Low-Rank Adaptive Structural Priors for Generalizable Diabetic Retinopathy Grading [3.4531529749205347]
We introduce Low-rank Adaptive Structural Priors (LoASP), a plug-and-play framework designed for seamless integration with existing deep learning models. LoASP improves generalization by learning adaptive structural representations that are finely tuned to the complexities of diabetic retinopathy diagnosis. Visualizations reveal that the learned structural priors intuitively align with the intricate architecture of the vessels and lesions.
arXiv Detail & Related papers (2025-04-27T21:40:02Z)
Generative Classifier for Domain Generalization [84.92088101715116]
Domain generalization aims to improve the generalizability of computer vision models toward distribution shifts. We propose Generative-driven Domain Generalization (GCDG). GCDG consists of three key modules: Heterogeneity Learning (HLC), Spurious Correlation (SCB), and Diverse Component Balancing (DCB).
arXiv Detail & Related papers (2025-04-03T04:38:33Z)
Divergent Domains, Convergent Grading: Enhancing Generalization in Diabetic Retinopathy Grading [8.59772105902647]
Diabetic Retinopathy (DR) constitutes 5% of global blindness cases. We introduce a novel deep learning method for achieving domain generalization (DG) in DR grading. Our method demonstrates significant improvements over the strong Empirical Risk Minimization baseline.
arXiv Detail & Related papers (2024-11-04T21:09:24Z)
Disentangling Masked Autoencoders for Unsupervised Domain Generalization [57.56744870106124]
Unsupervised domain generalization is fast gaining attention but is still far from well-studied. Disentangled Masked Autoencoder (DisMAE) aims to discover the disentangled representations that faithfully reveal intrinsic features. DisMAE co-trains the asymmetric dual-branch architecture with semantic and lightweight variation encoders.
arXiv Detail & Related papers (2024-07-10T11:11:36Z)
Generalizing to Unseen Domains in Diabetic Retinopathy Classification [8.59772105902647]
We study the problem of generalizing a model to unseen distributions or domains in diabetic retinopathy classification. We propose a simple and effective domain generalization (DG) approach that achieves self-distillation in vision transformers. We report the performance of several state-of-the-art DG methods on open-source DR classification datasets.
arXiv Detail & Related papers (2023-10-26T09:11:55Z)
Generalizing Across Domains in Diabetic Retinopathy via Variational Autoencoders [0.0]
Domain generalization for Diabetic Retinopathy classification allows a model to adeptly classify retinal images. In this study, we explore the inherent capacity of variational autoencoders to disentangle the latent space of fundus images.
arXiv Detail & Related papers (2023-09-20T13:29:22Z)
DGM-DR: Domain Generalization with Mutual Information Regularized Diabetic Retinopathy Classification [40.35834579068518]
Domain shift between training and testing data presents a significant challenge for training general deep learning models. We introduce a DG method that re-establishes the model objective function as a pretrained model to the medical imaging field. Our proposed method consistently outperforms the previous state-of-the-art by a margin of 5.25% in average accuracy and a lower standard deviation.
arXiv Detail & Related papers (2023-09-18T11:17:13Z)
Towards Generalizable Diabetic Retinopathy Grading in Unseen Domains [6.147573427718534]
We propose a novel unified framework named Generalizable Diabetic Retinopathy Grading Network (GDRNet). GDRNet consists of three vital components: fundus visual-artifact augmentation (FundusAug), dynamic hybrid-supervised loss (DahLoss), and domain-class-aware re-balancing (DCR).
arXiv Detail & Related papers (2023-07-10T07:24:44Z)
Cross-Site Severity Assessment of COVID-19 from CT Images via Domain Adaptation [64.59521853145368]
Early and accurate severity assessment of Coronavirus disease 2019 (COVID-19) based on computed tomography (CT) images offers a great help to the estimation of intensive care unit event. To augment the labeled data and improve the generalization ability of the classification model, it is necessary to aggregate data from multiple sites. This task faces several challenges including class imbalance between mild and severe infections, domain distribution discrepancy between sites, and presence of heterogeneous features.
arXiv Detail & Related papers (2021-09-08T07:56:51Z)
Self-Supervised Domain Adaptation for Diabetic Retinopathy Grading using Vessel Image Reconstruction [61.58601145792065]
We learn invariant target-domain features by defining a novel self-supervised task based on retinal vessel image reconstructions. It can be shown that our approach outperforms existing domain strategies.
arXiv Detail & Related papers (2021-07-20T09:44:07Z)
Unsupervised Domain Adaptation for Dysarthric Speech Detection via Domain Adversarial Training and Mutual Information Minimization [52.82138296332476]
This paper makes a first attempt to formulate cross-domain Dysarthric speech detection (DSD) as an unsupervised domain adaptation problem. We propose a multi-task learning strategy, including dysarthria presence classification (DPC), domain adversarial training (DAT) and mutual information minimization (MIM). Experiments show that the incorporation of UDA attains absolute increases of 22.2% and 20.0% respectively in utterance-level weighted average recall and speaker-level accuracy.
arXiv Detail & Related papers (2021-06-18T13:34:36Z)
Cross-Modality Brain Tumor Segmentation via Bidirectional Global-to-Local Unsupervised Domain Adaptation [61.01704175938995]
In this paper, we propose a novel Bidirectional Global-to-Local (BiGL) adaptation framework under a UDA scheme. Specifically, a bidirectional image synthesis and segmentation module is proposed to segment the brain tumor. The proposed method outperforms several state-of-the-art unsupervised domain adaptation methods by a large margin.
arXiv Detail & Related papers (2021-05-17T10:11:45Z)
Collaborative Unsupervised Domain Adaptation for Medical Image Diagnosis [102.40869566439514]
We seek to exploit rich labeled data from relevant domains to help the learning in the target task via Unsupervised Domain Adaptation (UDA). Unlike most UDA methods that rely on clean labeled data or assume samples are equally transferable, we innovatively propose a Collaborative Unsupervised Domain Adaptation algorithm. We theoretically analyze the generalization performance of the proposed method, and also empirically evaluate it on both medical and general images.
arXiv Detail & Related papers (2020-07-05T11:49:17Z)

This list is automatically generated from the titles and abstracts of the papers in this site.