CLIP-DR: Textual Knowledge-Guided Diabetic Retinopathy Grading with Ranking-aware Prompting
- URL: http://arxiv.org/abs/2407.04068v1
- Date: Thu, 4 Jul 2024 17:14:18 GMT
- Title: CLIP-DR: Textual Knowledge-Guided Diabetic Retinopathy Grading with Ranking-aware Prompting
- Authors: Qinkai Yu, Jianyang Xie, Anh Nguyen, He Zhao, Jiong Zhang, Huazhu Fu, Yitian Zhao, Yalin Zheng, Yanda Meng
- Abstract summary: Diabetic retinopathy (DR) is a complication of diabetes and usually takes decades to reach sight-threatening levels.
Most current DR grading methods suffer from insufficient robustness to data variability.
We propose a novel DR grading framework CLIP-DR based on three observations.
- Score: 48.47935559597376
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Diabetic retinopathy (DR) is a complication of diabetes and usually takes decades to reach sight-threatening levels. Accurate and robust detection of DR severity is critical for the timely management and treatment of diabetes. However, most current DR grading methods suffer from insufficient robustness to data variability (e.g., colour fundus images), posing a significant difficulty for accurate and robust grading. In this work, we propose a novel DR grading framework CLIP-DR based on three observations: 1) Recent pre-trained visual language models, such as CLIP, showcase a notable capacity for generalisation across various downstream tasks, serving as effective baseline models. 2) The grading of image-text pairs for DR often adheres to a discernible natural sequence, yet most existing DR grading methods have primarily overlooked this aspect. 3) A long-tailed distribution among DR severity levels complicates the grading process. This work proposes a novel ranking-aware prompting strategy to help the CLIP model exploit the ordinal information. Specifically, we sequentially design learnable prompts between neighbouring text-image pairs in two different ranking directions. Additionally, we introduce a Similarity Matrix Smooth module into the structure of CLIP to balance the class distribution. Finally, we perform extensive comparisons with several state-of-the-art methods on the GDRBench benchmark, demonstrating our CLIP-DR's robustness and superior performance. The implementation code is available at https://github.com/Qinkaiyu/CLIP-DR.
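The core idea in the abstract is that DR grades form an ordinal sequence, so the image-to-prompt similarity scores of neighbouring grades should not be treated as independent. The minimal, library-free sketch below illustrates that idea only: cosine similarities between a toy image embedding and one prompt embedding per grade, followed by a simple neighbour-averaging smoother. The `smooth_logits` function, the `alpha` weight, and all embeddings are hypothetical stand-ins for illustration; they are not the authors' ranking-aware prompts or their Similarity Matrix Smooth module.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def smooth_logits(logits, alpha=0.2):
    """Blend each grade's similarity with its ordinal neighbours.

    A toy stand-in for exploiting grade ordering: grade i borrows
    weight alpha from grades i-1 and i+1 (edges reuse themselves).
    """
    n = len(logits)
    out = []
    for i in range(n):
        left = logits[i - 1] if i > 0 else logits[i]
        right = logits[i + 1] if i < n - 1 else logits[i]
        out.append((1 - 2 * alpha) * logits[i] + alpha * (left + right))
    return out

# Toy image embedding and five grade-prompt embeddings (DR grades 0..4).
image = [0.2, 0.9, 0.1]
prompts = [[1, 0, 0], [0.7, 0.6, 0], [0.2, 0.9, 0.1], [0, 0.7, 0.6], [0, 0, 1]]

logits = [cosine(image, p) for p in prompts]
smoothed = smooth_logits(logits)
pred = max(range(len(smoothed)), key=smoothed.__getitem__)  # predicted grade
```

After smoothing, a grade flanked by two similar-looking neighbours keeps a high score, while an isolated spurious peak is pulled down, which is one plausible way ordinal structure can stabilise grading under a long-tailed class distribution.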
Related papers
- PMT: Progressive Mean Teacher via Exploring Temporal Consistency for Semi-Supervised Medical Image Segmentation [51.509573838103854]
We propose a semi-supervised learning framework, termed Progressive Mean Teachers (PMT), for medical image segmentation.
Our PMT generates high-fidelity pseudo labels by learning robust and diverse features in the training process.
Experimental results on two datasets with different modalities, i.e., CT and MRI, demonstrate that our method outperforms the state-of-the-art medical image segmentation approaches.
arXiv Detail & Related papers (2024-09-08T15:02:25Z)
- Generalizing to Unseen Domains in Diabetic Retinopathy Classification [8.59772105902647]
We study the problem of generalizing a model to unseen distributions or domains in diabetic retinopathy classification.
We propose a simple and effective domain generalization (DG) approach that achieves self-distillation in vision transformers.
We report the performance of several state-of-the-art DG methods on open-source DR classification datasets.
arXiv Detail & Related papers (2023-10-26T09:11:55Z)
- RIDE: Self-Supervised Learning of Rotation-Equivariant Keypoint Detection and Invariant Description for Endoscopy [83.4885991036141]
RIDE is a learning-based method for rotation-equivariant detection and invariant description.
It is trained in a self-supervised manner on a large curation of endoscopic images.
It sets a new state-of-the-art performance on matching and relative pose estimation tasks.
arXiv Detail & Related papers (2023-09-18T08:16:30Z)
- How to Train Your DRAGON: Diverse Augmentation Towards Generalizable Dense Retrieval [80.54532535622988]
We show that a generalizable dense retriever can be trained to achieve high accuracy in both supervised and zero-shot retrieval.
DRAGON, our dense retriever trained with diverse augmentation, is the first BERT-base-sized DR to achieve state-of-the-art effectiveness in both supervised and zero-shot evaluations.
arXiv Detail & Related papers (2023-02-15T03:53:26Z)
- Segmentation, Classification, and Quality Assessment of UW-OCTA Images for the Diagnosis of Diabetic Retinopathy [2.435307010444828]
Diabetic Retinopathy (DR) is a severe complication of diabetes that can cause blindness.
In this paper, we present our solutions for the three tasks of the Diabetic Retinopathy Analysis Challenge 2022 (DRAC22).
The obtained results are promising and have allowed us to position ourselves in the TOP 5 of the segmentation task.
arXiv Detail & Related papers (2022-11-21T14:49:18Z)
- Learning Discriminative Representations for Fine-Grained Diabetic Retinopathy Grading [6.129288755571804]
Diabetic retinopathy is one of the leading causes of blindness.
To determine the disease severity levels, ophthalmologists need to focus on the discriminative parts of the fundus images.
arXiv Detail & Related papers (2020-11-04T04:16:55Z)
- Sea-Net: Squeeze-And-Excitation Attention Net For Diabetic Retinopathy Grading [9.181677987146418]
Diabetes is one of the most common diseases in individuals.
Diabetic retinopathy (DR) is a complication of diabetes, which could lead to blindness.
DR grading based on retinal images provides a great diagnostic and prognostic value for treatment planning.
arXiv Detail & Related papers (2020-10-29T03:48:01Z)
- A Benchmark for Studying Diabetic Retinopathy: Segmentation, Grading, and Transferability [76.64661091980531]
People with diabetes are at risk of developing diabetic retinopathy (DR).
Computer-aided DR diagnosis is a promising tool for early detection of DR and severity grading.
This dataset has 1,842 images with pixel-level DR-related lesion annotations, and 1,000 images with image-level labels graded by six board-certified ophthalmologists.
arXiv Detail & Related papers (2020-08-22T07:48:04Z)
- Robust Collaborative Learning of Patch-level and Image-level Annotations for Diabetic Retinopathy Grading from Fundus Image [33.904136933213735]
We present a robust framework, which collaboratively utilizes patch-level and image-level annotations, for DR severity grading.
By an end-to-end optimization, this framework can bi-directionally exchange the fine-grained lesion and image-level grade information.
The proposed framework shows better performance than the recent state-of-the-art algorithms and three clinical ophthalmologists with over nine years of experience.
arXiv Detail & Related papers (2020-08-03T02:17:42Z)
- Deep Q-Network-Driven Catheter Segmentation in 3D US by Hybrid Constrained Semi-Supervised Learning and Dual-UNet [74.22397862400177]
We propose a novel catheter segmentation approach, which requests fewer annotations than the supervised learning method.
Our scheme considers a deep Q learning as the pre-localization step, which avoids voxel-level annotation.
With the detected catheter, patch-based Dual-UNet is applied to segment the catheter in 3D volumetric data.
arXiv Detail & Related papers (2020-06-25T21:10:04Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.