SSiT: Saliency-guided Self-supervised Image Transformer for Diabetic Retinopathy Grading
- URL: http://arxiv.org/abs/2210.10969v5
- Date: Tue, 12 Mar 2024 11:59:39 GMT
- Title: SSiT: Saliency-guided Self-supervised Image Transformer for Diabetic Retinopathy Grading
- Authors: Yijin Huang, Junyan Lyu, Pujin Cheng, Roger Tam, Xiaoying Tang
- Abstract summary: Saliency-guided Self-Supervised image Transformer (SSiT) is proposed for Diabetic Retinopathy grading from fundus images. We introduce saliency maps into SSL to guide self-supervised pre-training with domain-specific prior knowledge.
- Score: 2.0790896742002274
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Self-supervised Learning (SSL) has been widely applied to learn image
representations through exploiting unlabeled images. However, it has not been
fully explored in the medical image analysis field. In this work,
Saliency-guided Self-Supervised image Transformer (SSiT) is proposed for
Diabetic Retinopathy (DR) grading from fundus images. We introduce
saliency maps into SSL, with the goal of guiding self-supervised pre-training
with domain-specific prior knowledge. Specifically, two saliency-guided
learning tasks are employed in SSiT: (1) Saliency-guided contrastive learning
is conducted based on the momentum contrast, wherein fundus images' saliency
maps are utilized to remove trivial patches from the input sequences of the
momentum-updated key encoder. Thus, the key encoder is constrained to provide
target representations focusing on salient regions, guiding the query encoder
to capture salient features. (2) The query encoder is trained to predict the
saliency segmentation, encouraging the preservation of fine-grained information
in the learned representations. To assess our proposed method, four
publicly accessible fundus image datasets are adopted. One dataset is employed
for pre-training, while the other three are used to evaluate the pre-trained
models' performance on downstream DR grading. The proposed SSiT significantly
outperforms other representative state-of-the-art SSL methods on all downstream
datasets and under various evaluation settings. For example, SSiT achieves a
Kappa score of 81.88% on the DDR dataset under fine-tuning evaluation,
outperforming all other ViT-based SSL methods by at least 9.48%.
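The first saliency-guided task can be illustrated with a minimal sketch: rank image patches by their mean saliency and keep only the most salient ones as the input sequence of the momentum-updated key encoder. This is an assumption-based illustration of the idea, not the authors' implementation; the function name, `patch_size`, and `keep_ratio` parameters are hypothetical.

```python
import numpy as np

def saliency_patch_filter(saliency_map, patch_size=16, keep_ratio=0.5):
    """Rank non-overlapping patches by mean saliency; return indices to keep.

    Illustrative sketch of SSiT's saliency-guided contrastive task:
    low-saliency ("trivial") patches are dropped from the key encoder's
    input sequence, so its target representations focus on salient regions.
    """
    H, W = saliency_map.shape
    ph, pw = H // patch_size, W // patch_size
    # Mean saliency per non-overlapping patch, flattened to a sequence.
    patches = saliency_map[:ph * patch_size, :pw * patch_size].reshape(
        ph, patch_size, pw, patch_size)
    patch_saliency = patches.mean(axis=(1, 3)).reshape(-1)
    # Sequence positions of the most salient patches.
    n_keep = max(1, int(round(keep_ratio * patch_saliency.size)))
    keep_idx = np.argsort(patch_saliency)[::-1][:n_keep]
    return np.sort(keep_idx)

# Example: a 64x64 saliency map whose top-left quadrant is salient.
# With 16x16 patches this gives a 4x4 grid (16 patches); keeping 25%
# retains the four patches covering the salient quadrant.
sal = np.zeros((64, 64))
sal[:32, :32] = 1.0
kept = saliency_patch_filter(sal, patch_size=16, keep_ratio=0.25)
```

In a full pipeline, `kept` would index the key encoder's patch-token sequence, while the query encoder still sees all patches of its augmented view.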
Related papers
- Leveraging Self-Supervised Learning for Fetal Cardiac Planes Classification using Ultrasound Scan Videos [4.160910038127896]
Self-supervised learning (SSL) methods are popular since they can address situations with limited annotated data.
We study 7 SSL approaches based on reconstruction, contrastive loss, distillation, and information theory and evaluate them extensively on a large private US dataset.
Our primary observation is that for SSL training, the variance of the dataset matters more than its size, because it allows the model to learn generalisable representations.
arXiv Detail & Related papers (2024-07-31T16:47:21Z)
- Intra-video Positive Pairs in Self-Supervised Learning for Ultrasound [65.23740556896654]
Self-supervised learning (SSL) is one strategy for addressing the paucity of labelled data in medical imaging.
In this study, we investigated the effect of utilizing proximal, distinct images from the same B-mode ultrasound video as pairs for SSL.
Named Intra-Video Positive Pairs (IVPP), the method surpassed previous ultrasound-specific contrastive learning methods' average test accuracy on COVID-19 classification.
arXiv Detail & Related papers (2024-03-12T14:57:57Z)
- Learning Self-Supervised Representations for Label Efficient Cross-Domain Knowledge Transfer on Diabetic Retinopathy Fundus Images [2.796274924103132]
This work presents a novel self-supervised representation learning-based approach for classifying diabetic retinopathy (DR) images in cross-domain settings.
The proposed method achieves state-of-the-art results on binary and multiclassification of DR images, even in cross-domain settings.
arXiv Detail & Related papers (2023-04-20T12:46:34Z)
- Rethinking Self-Supervised Visual Representation Learning in Pre-training for 3D Human Pose and Shape Estimation [57.206129938611454]
Self-supervised representation learning (SSL) methods have outperformed the ImageNet classification pre-training for vision tasks such as object detection.
We empirically study and analyze the effects of SSL and compare it with other pre-training alternatives for 3DHPSE.
Our observations challenge the naive application of the current SSL pre-training to 3DHPSE and relight the value of other data types in the pre-training aspect.
arXiv Detail & Related papers (2023-03-09T16:17:52Z)
- Leveraging the Third Dimension in Contrastive Learning [88.17394309208925]
Self-Supervised Learning (SSL) methods operate on unlabeled data to learn robust representations useful for downstream tasks.
These augmentations ignore the fact that biological vision takes place in an immersive three-dimensional, temporally contiguous environment.
We explore two distinct approaches to incorporating depth signals into the SSL framework.
arXiv Detail & Related papers (2023-01-27T15:45:03Z)
- Data-Limited Tissue Segmentation using Inpainting-Based Self-Supervised Learning [3.7931881761831328]
Self-supervised learning (SSL) methods involving pretext tasks have shown promise in overcoming this requirement by first pretraining models using unlabeled data.
We evaluate the efficacy of two SSL methods (inpainting-based pretext tasks of context prediction and context restoration) for CT and MRI image segmentation in label-limited scenarios.
We demonstrate that optimally trained and easy-to-implement SSL segmentation models can outperform classically supervised methods for MRI and CT tissue segmentation in label-limited scenarios.
arXiv Detail & Related papers (2022-10-14T16:34:05Z)
- PCA: Semi-supervised Segmentation with Patch Confidence Adversarial Training [52.895952593202054]
We propose a new semi-supervised adversarial method called Patch Confidence Adversarial Training (PCA) for medical image segmentation.
PCA learns the pixel structure and context information in each patch to obtain sufficient gradient feedback, which aids the discriminator in converging to an optimal state.
Our method outperforms the state-of-the-art semi-supervised methods, which demonstrates its effectiveness for medical image segmentation.
arXiv Detail & Related papers (2022-07-24T07:45:47Z)
- Semantic-aware Dense Representation Learning for Remote Sensing Image Change Detection [20.761672725633936]
Training a deep learning-based change detection model depends heavily on labeled data.
A recent trend is to use remote sensing (RS) data to obtain in-domain representations via supervised or self-supervised learning (SSL).
We propose dense semantic-aware pre-training for RS image CD via sampling multiple class-balanced points.
arXiv Detail & Related papers (2022-05-27T06:08:33Z)
- PGL: Prior-Guided Local Self-supervised Learning for 3D Medical Image Segmentation [87.50205728818601]
We propose a Prior-Guided Local (PGL) self-supervised model that learns the region-wise local consistency in the latent feature space.
Our PGL model learns the distinctive representations of local regions, and hence is able to retain structural information.
arXiv Detail & Related papers (2020-11-25T11:03:11Z)
- Deep Q-Network-Driven Catheter Segmentation in 3D US by Hybrid Constrained Semi-Supervised Learning and Dual-UNet [74.22397862400177]
We propose a novel catheter segmentation approach that requires fewer annotations than supervised learning methods.
Our scheme considers a deep Q learning as the pre-localization step, which avoids voxel-level annotation.
With the detected catheter, patch-based Dual-UNet is applied to segment the catheter in 3D volumetric data.
arXiv Detail & Related papers (2020-06-25T21:10:04Z)
- Contrastive learning of global and local features for medical image segmentation with limited annotations [10.238403787504756]
A key requirement for the success of supervised deep learning is a large labeled dataset.
We propose strategies for extending the contrastive learning framework for segmentation of medical images in the semi-supervised setting.
In the limited annotation setting, the proposed method yields substantial improvements compared to other self-supervision and semi-supervised learning techniques.
arXiv Detail & Related papers (2020-06-18T13:31:26Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.