Metadata-enhanced contrastive learning from retinal optical coherence tomography images
- URL: http://arxiv.org/abs/2208.02529v3
- Date: Fri, 26 Jul 2024 15:07:08 GMT
- Title: Metadata-enhanced contrastive learning from retinal optical coherence tomography images
- Authors: Robbie Holland, Oliver Leingang, Hrvoje Bogunović, Sophie Riedl, Lars Fritsche, Toby Prevost, Hendrik P. N. Scholl, Ursula Schmidt-Erfurth, Sobha Sivaprasad, Andrew J. Lotery, Daniel Rueckert, Martin J. Menten
- Abstract summary: We extend conventional contrastive frameworks with a novel metadata-enhanced strategy.
Our approach employs widely available patient metadata to approximate the true set of inter-image contrastive relationships.
Our approach outperforms both standard contrastive methods and a retinal image foundation model in five out of six image-level downstream tasks.
- Score: 7.932410831191909
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Deep learning has potential to automate screening, monitoring and grading of disease in medical images. Pretraining with contrastive learning enables models to extract robust and generalisable features from natural image datasets, facilitating label-efficient downstream image analysis. However, the direct application of conventional contrastive methods to medical datasets introduces two domain-specific issues. Firstly, several image transformations which have been shown to be crucial for effective contrastive learning do not translate from the natural image to the medical image domain. Secondly, the assumption made by conventional methods, that any two images are dissimilar, is systematically misleading in medical datasets depicting the same anatomy and disease. This is exacerbated in longitudinal image datasets that repeatedly image the same patient cohort to monitor their disease progression over time. In this paper we tackle these issues by extending conventional contrastive frameworks with a novel metadata-enhanced strategy. Our approach employs widely available patient metadata to approximate the true set of inter-image contrastive relationships. To this end we employ records for patient identity, eye position (i.e. left or right) and time series information. In experiments using two large longitudinal datasets containing 170,427 retinal OCT images of 7,912 patients with age-related macular degeneration (AMD), we evaluate the utility of using metadata to incorporate the temporal dynamics of disease progression into pretraining. Our metadata-enhanced approach outperforms both standard contrastive methods and a retinal image foundation model in five out of six image-level downstream tasks related to AMD. Due to its modularity, our method can be quickly and cost-effectively tested to establish the potential benefits of including available metadata in contrastive pretraining.
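The metadata-driven pair selection described in the abstract lends itself to a brief sketch. The following PyTorch snippet is a minimal illustration, not the authors' released code: it assumes each scan in a batch carries a patient identifier, an eye-laterality label and an acquisition time, and uses them to build a positive-pair mask for an InfoNCE-style loss. The function names, the tensor layout and the one-year time window are hypothetical choices made for the example.

```python
# Minimal sketch (assumed implementation, not the paper's code): metadata-defined
# positives for contrastive pretraining on longitudinal retinal OCT batches.
import torch
import torch.nn.functional as F

def metadata_positive_mask(patient_ids, eyes, times, max_days=365):
    """Mark pairs from the same patient and eye, acquired within `max_days` of
    each other, as positives instead of treating every other image as negative."""
    same_patient = patient_ids.unsqueeze(0) == patient_ids.unsqueeze(1)
    same_eye = eyes.unsqueeze(0) == eyes.unsqueeze(1)
    close_in_time = (times.unsqueeze(0) - times.unsqueeze(1)).abs() <= max_days
    mask = same_patient & same_eye & close_in_time
    mask.fill_diagonal_(False)          # exclude self-pairs
    return mask

def metadata_info_nce(embeddings, mask, temperature=0.1):
    """InfoNCE-style loss in which metadata-defined positives replace the usual
    assumption that only an augmented view of the same image is positive."""
    z = F.normalize(embeddings, dim=1)
    sim = z @ z.T / temperature
    sim.fill_diagonal_(float('-inf'))   # never contrast an image with itself
    log_prob = sim - torch.logsumexp(sim, dim=1, keepdim=True)
    pos_per_row = mask.sum(dim=1).clamp(min=1)
    loss = -(log_prob * mask).sum(dim=1) / pos_per_row
    return loss[mask.any(dim=1)].mean() # rows without any positive are skipped
```

Because the mask depends only on metadata, a component like this could be dropped into an existing contrastive pipeline without changing the encoder or augmentations, which is the kind of modularity the abstract alludes to.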
Related papers
- Autoregressive Sequence Modeling for 3D Medical Image Representation [48.706230961589924]
We introduce a pioneering method for learning 3D medical image representations through an autoregressive sequence pre-training framework.
Our approach sequences various 3D medical images based on spatial, contrast, and semantic correlations, treating them as interconnected visual tokens within a token sequence.
arXiv Detail & Related papers (2024-09-13T10:19:10Z) - Adapting Visual-Language Models for Generalizable Anomaly Detection in Medical Images [68.42215385041114]
This paper introduces a novel lightweight multi-level adaptation and comparison framework to repurpose the CLIP model for medical anomaly detection.
Our approach integrates multiple residual adapters into the pre-trained visual encoder, enabling a stepwise enhancement of visual features across different levels.
Our experiments on medical anomaly detection benchmarks demonstrate that our method significantly surpasses current state-of-the-art models.
arXiv Detail & Related papers (2024-03-19T09:28:19Z) - Adversarial-Robust Transfer Learning for Medical Imaging via Domain Assimilation [17.46080957271494]
The scarcity of publicly available medical images has led contemporary algorithms to depend on pretrained models grounded on a large set of natural images.
A significant domain discrepancy exists between natural and medical images, which causes AI models to exhibit heightened vulnerability to adversarial attacks.
This paper proposes a domain assimilation approach that introduces texture and color adaptation into transfer learning, followed by a texture preservation component to suppress undesired distortion.
arXiv Detail & Related papers (2024-02-25T06:39:15Z) - BiomedJourney: Counterfactual Biomedical Image Generation by Instruction-Learning from Multimodal Patient Journeys [99.7082441544384]
We present BiomedJourney, a novel method for counterfactual biomedical image generation by instruction-learning.
We use GPT-4 to process the corresponding imaging reports and generate a natural language description of disease progression.
The resulting triples are then used to train a latent diffusion model for counterfactual biomedical image generation.
arXiv Detail & Related papers (2023-10-16T18:59:31Z) - LVM-Med: Learning Large-Scale Self-Supervised Vision Models for Medical Imaging via Second-order Graph Matching [59.01894976615714]
We introduce LVM-Med, the first family of deep networks trained on large-scale medical datasets.
We have collected approximately 1.3 million medical images from 55 publicly available datasets.
LVM-Med empirically outperforms a number of state-of-the-art supervised, self-supervised, and foundation models.
arXiv Detail & Related papers (2023-06-20T22:21:34Z) - Learning to Exploit Temporal Structure for Biomedical Vision-Language Processing [53.89917396428747]
Self-supervised learning in vision-language processing exploits semantic alignment between imaging and text modalities.
We explicitly account for prior images and reports when available during both training and fine-tuning.
Our approach, named BioViL-T, uses a CNN-Transformer hybrid multi-image encoder trained jointly with a text model.
arXiv Detail & Related papers (2023-01-11T16:35:33Z) - ContIG: Self-supervised Multimodal Contrastive Learning for Medical Imaging with Genetics [4.907551775445731]
We propose ContIG, a self-supervised method that can learn from large datasets of unlabeled medical images and genetic data.
Our approach aligns images and several genetic modalities in the feature space using a contrastive loss.
We also perform genome-wide association studies on the features learned by our models, uncovering interesting relationships between images and genetic data.
arXiv Detail & Related papers (2021-11-26T11:06:12Z) - Positional Contrastive Learning for Volumetric Medical Image Segmentation [13.086140606803408]
We propose a novel positional contrastive learning framework to generate contrastive data pairs.
The proposed PCL method can substantially improve segmentation performance compared to existing methods in both the semi-supervised and transfer learning settings.
arXiv Detail & Related papers (2021-06-16T22:15:28Z) - MedAug: Contrastive learning leveraging patient metadata improves representations for chest X-ray interpretation [8.403653472706822]
We develop a method to select positive pairs coming from views of possibly different images through the use of patient metadata.
We compare strategies for selecting positive pairs for chest X-ray interpretation including requiring them to be from the same patient, imaging study or laterality.
Our best performing positive pair selection strategy, which involves using images from the same patient from the same study across all lateralities, achieves performance increases of 3.4% and 14.4% in mean AUC.
arXiv Detail & Related papers (2021-02-21T18:39:04Z) - Contrastive Learning of Medical Visual Representations from Paired Images and Text [38.91117443316013]
We propose ConVIRT, an unsupervised strategy to learn medical visual representations by exploiting naturally occurring descriptive paired text.
Our new method of pretraining medical image encoders with the paired text data via a bidirectional contrastive objective between the two modalities is domain-agnostic, and requires no additional expert input.
arXiv Detail & Related papers (2020-10-02T02:10:18Z) - Semi-supervised Medical Image Classification with Relation-driven Self-ensembling Model [71.80319052891817]
We present a relation-driven semi-supervised framework for medical image classification.
It exploits the unlabeled data by encouraging the prediction consistency of a given input under perturbations.
Our method outperforms many state-of-the-art semi-supervised learning methods on both single-label and multi-label image classification scenarios.
arXiv Detail & Related papers (2020-05-15T06:57:54Z)