LVM-Med: Learning Large-Scale Self-Supervised Vision Models for Medical
Imaging via Second-order Graph Matching
- URL: http://arxiv.org/abs/2306.11925v3
- Date: Sat, 18 Nov 2023 15:17:08 GMT
- Title: LVM-Med: Learning Large-Scale Self-Supervised Vision Models for Medical
Imaging via Second-order Graph Matching
- Authors: Duy M. H. Nguyen, Hoang Nguyen, Nghiem T. Diep, Tan N. Pham, Tri Cao,
Binh T. Nguyen, Paul Swoboda, Nhat Ho, Shadi Albarqouni, Pengtao Xie, Daniel
Sonntag, Mathias Niepert
- Abstract summary: We introduce LVM-Med, the first family of deep networks trained on large-scale medical datasets.
We have collected approximately 1.3 million medical images from 55 publicly available datasets.
LVM-Med empirically outperforms a number of state-of-the-art supervised, self-supervised, and foundation models.
- Score: 59.01894976615714
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Obtaining large pre-trained models that can be fine-tuned to new tasks with
limited annotated samples has remained an open challenge for medical imaging
data. While pre-trained deep networks on ImageNet and vision-language
foundation models trained on web-scale data are prevailing approaches, their
effectiveness on medical tasks is limited due to the significant domain shift
between natural and medical images. To bridge this gap, we introduce LVM-Med,
the first family of deep networks trained on large-scale medical datasets. We
have collected approximately 1.3 million medical images from 55 publicly
available datasets, covering a large number of organs and modalities such as
CT, MRI, X-ray, and Ultrasound. We benchmark several state-of-the-art
self-supervised algorithms on this dataset and propose a novel self-supervised
contrastive learning algorithm using a graph-matching formulation. The proposed
approach makes three contributions: (i) it integrates prior pair-wise image
similarity metrics based on local and global information; (ii) it captures the
structural constraints of feature embeddings through a loss function
constructed via a combinatorial graph-matching objective; and (iii) it can be
trained efficiently end-to-end using modern gradient-estimation techniques for
black-box solvers. We thoroughly evaluate the proposed LVM-Med on 15 downstream
medical tasks ranging from segmentation and classification to object detection,
and both for the in and out-of-distribution settings. LVM-Med empirically
outperforms a number of state-of-the-art supervised, self-supervised, and
foundation models. For challenging tasks such as Brain Tumor Classification or
Diabetic Retinopathy Grading, LVM-Med improves previous vision-language models
trained on 1 billion masks by 6-7% while using only a ResNet-50.
Related papers
- LoGra-Med: Long Context Multi-Graph Alignment for Medical Vision-Language Model [55.80651780294357]
State-of-the-art medical multi-modal large language models (med-MLLM) leverage instruction-following data in pre-training.
LoGra-Med is a new multi-graph alignment algorithm that enforces triplet correlations across image modalities, conversation-based descriptions, and extended captions.
Our results show LoGra-Med matches LLAVA-Med performance on 600K image-text pairs for Medical VQA and significantly outperforms it when trained on 10% of the data.
arXiv Detail & Related papers (2024-10-03T15:52:03Z) - Self-Supervised Pre-Training with Contrastive and Masked Autoencoder
Methods for Dealing with Small Datasets in Deep Learning for Medical Imaging [8.34398674359296]
Deep learning in medical imaging has the potential to minimize the risk of diagnostic errors, reduce radiologist workload, and accelerate diagnosis.
Training such deep learning models requires large and accurate datasets, with annotations for all training samples.
To address this challenge, deep learning models can be pre-trained on large image datasets without annotations using methods from the field of self-supervised learning.
arXiv Detail & Related papers (2023-08-12T11:31:01Z) - Disruptive Autoencoders: Leveraging Low-level features for 3D Medical
Image Pre-training [51.16994853817024]
This work focuses on designing an effective pre-training framework for 3D radiology images.
We introduce Disruptive Autoencoders, a pre-training framework that attempts to reconstruct the original image from disruptions created by a combination of local masking and low-level perturbations.
The proposed pre-training framework is tested across multiple downstream tasks and achieves state-of-the-art performance.
arXiv Detail & Related papers (2023-07-31T17:59:42Z) - Pick the Best Pre-trained Model: Towards Transferability Estimation for
Medical Image Segmentation [20.03177073703528]
Transfer learning is a critical technique in training deep neural networks for the challenging medical image segmentation task.
We propose a new Transferability Estimation (TE) method for medical image segmentation.
Our method surpasses all current algorithms for transferability estimation in medical image segmentation.
arXiv Detail & Related papers (2023-07-22T01:58:18Z) - Learnable Weight Initialization for Volumetric Medical Image Segmentation [66.3030435676252]
We propose a learnable weight-based hybrid medical image segmentation approach.
Our approach is easy to integrate into any hybrid model and requires no external training data.
Experiments on multi-organ and lung cancer segmentation tasks demonstrate the effectiveness of our approach.
arXiv Detail & Related papers (2023-06-15T17:55:05Z) - Customizing General-Purpose Foundation Models for Medical Report
Generation [64.31265734687182]
The scarcity of labelled medical image-report pairs presents great challenges in the development of deep and large-scale neural networks.
We propose customizing off-the-shelf general-purpose large-scale pre-trained models, i.e., foundation models (FMs) in computer vision and natural language processing.
arXiv Detail & Related papers (2023-06-09T03:02:36Z) - Vision-Language Modelling For Radiological Imaging and Reports In The
Low Data Regime [70.04389979779195]
This paper explores training medical vision-language models (VLMs) where the visual and language inputs are embedded into a common space.
We explore several candidate methods to improve low-data performance, including adapting generic pre-trained models to novel image and text domains.
Using text-to-image retrieval as a benchmark, we evaluate the performance of these methods with variable sized training datasets of paired chest X-rays and radiological reports.
arXiv Detail & Related papers (2023-03-30T18:20:00Z) - Slice-level Detection of Intracranial Hemorrhage on CT Using Deep
Descriptors of Adjacent Slices [0.31317409221921133]
We propose a new strategy to train emphslice-level classifiers on CT scans based on the descriptors of the adjacent slices along the axis.
We obtain a single model in the top 4% best-performing solutions of the RSNA Intracranial Hemorrhage dataset challenge.
The proposed method is general and can be applied to other 3D medical diagnosis tasks such as MRI imaging.
arXiv Detail & Related papers (2022-08-05T23:20:37Z) - About Explicit Variance Minimization: Training Neural Networks for
Medical Imaging With Limited Data Annotations [2.3204178451683264]
Variance Aware Training (VAT) method exploits this property by introducing the variance error into the model loss function.
We validate VAT on three medical imaging datasets from diverse domains and various learning objectives.
arXiv Detail & Related papers (2021-05-28T21:34:04Z) - Interpretable and synergistic deep learning for visual explanation and
statistical estimations of segmentation of disease features from medical
images [0.0]
Deep learning (DL) models for disease classification or segmentation from medical images are increasingly trained using transfer learning (TL) from unrelated natural world images.
We report detailed comparisons, rigorous statistical analysis and comparisons of widely used DL architecture for binary segmentation after TL.
A free GitHub repository of TII and LMI models, code and more than 10,000 medical images and their Grad-CAM output from this study can be used as starting points for advanced computational medicine.
arXiv Detail & Related papers (2020-11-11T14:08:17Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.