Self-Supervised Pre-Training with Contrastive and Masked Autoencoder
Methods for Dealing with Small Datasets in Deep Learning for Medical Imaging
- URL: http://arxiv.org/abs/2308.06534v4
- Date: Thu, 2 Nov 2023 09:56:55 GMT
- Authors: Daniel Wolf, Tristan Payer, Catharina Silvia Lisson, Christoph Gerhard
Lisson, Meinrad Beer, Michael Götz, Timo Ropinski
- Abstract summary: Deep learning in medical imaging has the potential to minimize the risk of diagnostic errors, reduce radiologist workload, and accelerate diagnosis.
Training such deep learning models requires large and accurate datasets, with annotations for all training samples.
To address this challenge, deep learning models can be pre-trained on large image datasets without annotations using methods from the field of self-supervised learning.
- Score: 8.34398674359296
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Deep learning in medical imaging has the potential to minimize the risk of
diagnostic errors, reduce radiologist workload, and accelerate diagnosis.
Training such deep learning models requires large and accurate datasets, with
annotations for all training samples. However, in the medical imaging domain,
annotated datasets for specific tasks are often small due to the high
complexity of annotations, limited access, or the rarity of diseases. To
address this challenge, deep learning models can be pre-trained on large image
datasets without annotations using methods from the field of self-supervised
learning. After pre-training, small annotated datasets are sufficient to
fine-tune the models for a specific task. The most popular self-supervised
pre-training approaches in medical imaging are based on contrastive learning.
However, recent studies in natural image processing indicate a strong potential
for masked autoencoder approaches. Our work compares state-of-the-art
contrastive learning methods with the recently introduced masked autoencoder
approach "SparK" for convolutional neural networks (CNNs) on medical images.
To this end, we pre-train on a large unannotated CT image dataset and fine-tune on
several CT classification tasks. Due to the challenge of obtaining sufficient
annotated training data in medical imaging, it is of particular interest to
evaluate how the self-supervised pre-training methods perform when fine-tuning
on small datasets. By experimenting with gradually reducing the training
dataset size for fine-tuning, we find that the reduction has different effects
depending on the type of pre-training chosen. The SparK pre-training method is
more robust to the training dataset size than the contrastive methods. Based on
our results, we propose the SparK pre-training for medical imaging tasks with
only small annotated datasets.
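
The reduced-dataset experiment described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' protocol: the abstract does not say how the smaller fine-tuning sets were drawn, so the function name `nested_subsets`, the example fractions, and the choice of class-stratified, nested sampling are all assumptions made here.

```python
import random
from collections import defaultdict


def nested_subsets(labels, fractions, seed=0):
    """Map each fraction to a sorted list of sample indices.

    Assumption (not from the paper): subsets are stratified by class and
    nested, so every smaller subset is contained in every larger one.
    """
    rng = random.Random(seed)

    # Group sample indices by class label.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    # Shuffle once per class; each subset takes a prefix of this order,
    # which is what makes the subsets nested.
    for indices in by_class.values():
        rng.shuffle(indices)

    subsets = {}
    for frac in fractions:
        chosen = []
        for indices in by_class.values():
            # Keep at least one sample per class.
            keep = max(1, round(frac * len(indices)))
            chosen.extend(indices[:keep])
        subsets[frac] = sorted(chosen)
    return subsets


if __name__ == "__main__":
    # Hypothetical toy labels: 80 samples of class 0, 20 of class 1.
    toy_labels = [0] * 80 + [1] * 20
    for frac, idx in nested_subsets(toy_labels, [1.0, 0.5, 0.25, 0.1]).items():
        print(f"fraction={frac}: {len(idx)} training samples")
```

Each subset would then be used to fine-tune the same pre-trained model, so that differences in downstream performance can be attributed to the training set size rather than to which samples happened to be drawn.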
Related papers
- Disruptive Autoencoders: Leveraging Low-level features for 3D Medical
Image Pre-training [51.16994853817024]
This work focuses on designing an effective pre-training framework for 3D radiology images.
We introduce Disruptive Autoencoders, a pre-training framework that attempts to reconstruct the original image from disruptions created by a combination of local masking and low-level perturbations.
The proposed pre-training framework is tested across multiple downstream tasks and achieves state-of-the-art performance.
arXiv Detail & Related papers (2023-07-31T17:59:42Z)
- LVM-Med: Learning Large-Scale Self-Supervised Vision Models for Medical
Imaging via Second-order Graph Matching [59.01894976615714]
We introduce LVM-Med, the first family of deep networks trained on large-scale medical datasets.
We have collected approximately 1.3 million medical images from 55 publicly available datasets.
LVM-Med empirically outperforms a number of state-of-the-art supervised, self-supervised, and foundation models.
arXiv Detail & Related papers (2023-06-20T22:21:34Z)
- Vision-Language Modelling For Radiological Imaging and Reports In The
Low Data Regime [70.04389979779195]
This paper explores training medical vision-language models (VLMs) where the visual and language inputs are embedded into a common space.
We explore several candidate methods to improve low-data performance, including adapting generic pre-trained models to novel image and text domains.
Using text-to-image retrieval as a benchmark, we evaluate the performance of these methods with variable sized training datasets of paired chest X-rays and radiological reports.
arXiv Detail & Related papers (2023-03-30T18:20:00Z)
- Self-Supervised Pretraining for 2D Medical Image Segmentation [0.0]
Self-supervised learning offers a way to lower the need for manually annotated data by pretraining models for a specific domain on unlabelled data.
We find that self-supervised pretraining on natural images and target-domain-specific images leads to the fastest and most stable downstream convergence.
In low-data scenarios, supervised ImageNet pretraining achieves the best accuracy, requiring fewer than 100 annotated samples to realise close to minimal error.
arXiv Detail & Related papers (2022-09-01T09:25:22Z)
- Compound Figure Separation of Biomedical Images: Mining Large Datasets
for Self-supervised Learning [12.445324044675116]
We introduce a simulation-based training framework that minimizes the need for resource-intensive bounding box annotations.
We also propose a new side loss that is optimized for compound figure separation.
This is the first study that evaluates the efficacy of leveraging self-supervised learning with compound image separation.
arXiv Detail & Related papers (2022-08-30T16:02:34Z)
- RadTex: Learning Efficient Radiograph Representations from Text Reports [7.090896766922791]
We build a data-efficient learning framework that utilizes radiology reports to improve medical image classification performance with limited labeled data.
Our model achieves higher classification performance than ImageNet-supervised pretraining when labeled training data is limited.
arXiv Detail & Related papers (2022-08-05T15:06:26Z)
- Self-Supervised-RCNN for Medical Image Segmentation with Limited Data
Annotation [0.16490701092527607]
We propose an alternative deep learning training strategy based on self-supervised pretraining on unlabeled MRI scans.
Our pretraining approach first randomly applies different distortions to random areas of unlabeled images and then predicts the type of distortions and loss of information.
The effectiveness of the proposed method for segmentation tasks in different pre-training and fine-tuning scenarios is evaluated.
arXiv Detail & Related papers (2022-07-17T13:28:52Z)
- Self-Supervised Learning as a Means To Reduce the Need for Labeled Data
in Medical Image Analysis [64.4093648042484]
We use a dataset of chest X-ray images with bounding box labels for 13 different classes of anomalies.
We show that it is possible to achieve similar performance to a fully supervised model in terms of mean average precision and accuracy with only 60% of the labeled data.
arXiv Detail & Related papers (2022-06-01T09:20:30Z)
- Intelligent Masking: Deep Q-Learning for Context Encoding in Medical
Image Analysis [48.02011627390706]
We develop a novel self-supervised approach that occludes targeted regions to improve the pre-training procedure.
We show that training the agent against the prediction model can significantly improve the semantic features extracted for downstream classification tasks.
arXiv Detail & Related papers (2022-03-25T19:05:06Z)
- About Explicit Variance Minimization: Training Neural Networks for
Medical Imaging With Limited Data Annotations [2.3204178451683264]
Variance Aware Training (VAT) method exploits this property by introducing the variance error into the model loss function.
We validate VAT on three medical imaging datasets from diverse domains and various learning objectives.
arXiv Detail & Related papers (2021-05-28T21:34:04Z)
- Self-Training with Improved Regularization for Sample-Efficient Chest
X-Ray Classification [80.00316465793702]
We present a deep learning framework that enables robust modeling in challenging scenarios.
Our results show that, using 85% less labeled data, we can build predictive models that match the performance of classifiers trained in a large-scale data setting.
arXiv Detail & Related papers (2020-05-03T02:36:00Z)
This list is automatically generated from the titles and abstracts of the papers in this site.