Foundational Models in Medical Imaging: A Comprehensive Survey and
Future Vision
- URL: http://arxiv.org/abs/2310.18689v1
- Date: Sat, 28 Oct 2023 12:08:12 GMT
- Authors: Bobby Azad, Reza Azad, Sania Eskandari, Afshin Bozorgpour, Amirhossein
Kazerouni, Islem Rekik, Dorit Merhof
- Abstract summary: Foundation models are large-scale, pre-trained deep-learning models adapted to a wide range of downstream tasks.
These models facilitate contextual reasoning, generalization, and prompt capabilities at test time.
Capitalizing on the advances in computer vision, medical imaging has also marked a growing interest in these models.
- Score: 6.2847894163744105
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Foundation models, large-scale, pre-trained deep-learning models adapted to a
wide range of downstream tasks, have gained significant interest lately in
various deep-learning problems, which are undergoing a paradigm shift with the rise of
these models. Trained on large-scale datasets to bridge the gap between
different modalities, foundation models facilitate contextual reasoning,
generalization, and prompt capabilities at test time. The predictions of these
models can be adjusted for new tasks by augmenting the model input with
task-specific hints called prompts, without requiring extensive labeled data or
retraining. Capitalizing on the advances in computer vision, medical imaging
has also marked a growing interest in these models. To assist researchers in
navigating this direction, this survey intends to provide a comprehensive
overview of foundation models in the domain of medical imaging. Specifically,
we initiate our exploration by providing an exposition of the fundamental
concepts forming the basis of foundation models. Subsequently, we offer a
methodical taxonomy of foundation models within the medical domain, proposing a
classification system primarily structured around training strategies, while
also incorporating additional facets such as application domains, imaging
modalities, specific organs of interest, and the algorithms integral to these
models. Furthermore, we emphasize the practical use cases of selected
approaches and then discuss the opportunities, applications, and future
directions of these large-scale pre-trained models for analyzing medical
images. In the same vein, we address the prevailing challenges and research
pathways associated with foundational models in medical imaging. These
encompass the areas of interpretability, data management, computational
requirements, and the nuanced issue of contextual comprehension.
Related papers
- Towards Scalable Foundation Models for Digital Dermatology [35.62296620281727]
We utilize self-supervised learning (SSL) techniques to pre-train models on a dataset of over 240,000 dermatological images.
Results show that models pre-trained in this work not only outperform general-purpose models but also approach the performance of models 50 times larger on clinically relevant diagnostic tasks.
arXiv Detail & Related papers (2024-11-08T12:19:20Z)
- Medical Vision-Language Pre-Training for Brain Abnormalities [96.1408455065347]
We show how to automatically collect medical image-text aligned data for pretraining from public resources such as PubMed.
In particular, we present a pipeline that streamlines the pre-training process by initially collecting a large brain image-text dataset.
We also investigate the unique challenge of mapping subfigures to subcaptions in the medical domain.
arXiv Detail & Related papers (2024-04-27T05:03:42Z)
- Beyond Pixel-Wise Supervision for Medical Image Segmentation: From Traditional Models to Foundation Models [7.987836953849249]
Existing segmentation algorithms mostly rely on the availability of fully annotated images with pixel-wise annotations for training.
To alleviate this challenge, there has been a growing focus on developing segmentation methods that can train deep models with weak annotations.
The emergence of vision foundation models, notably the Segment Anything Model (SAM), has introduced innovative capabilities for segmentation tasks using weak annotations.
arXiv Detail & Related papers (2024-04-20T02:40:49Z)
- OpenMEDLab: An Open-source Platform for Multi-modality Foundation Models in Medicine [55.29668193415034]
We present OpenMEDLab, an open-source platform for multi-modality foundation models.
It encapsulates solutions of pioneering attempts in prompting and fine-tuning large language and vision models for frontline clinical and bioinformatic applications.
It opens access to a group of pre-trained foundation models for various medical image modalities, clinical text, protein engineering, etc.
arXiv Detail & Related papers (2024-02-28T03:51:02Z)
- Masked Modeling for Self-supervised Representation Learning on Vision and Beyond [69.64364187449773]
Masked modeling has emerged as a distinctive approach that involves predicting parts of the original data that are proportionally masked during training.
We elaborate on the details of techniques within masked modeling, including diverse masking strategies, recovering targets, network architectures, and more.
We conclude by discussing the limitations of current techniques and point out several potential avenues for advancing masked modeling research.
arXiv Detail & Related papers (2023-12-31T12:03:21Z)
- Foundational Models Defining a New Era in Vision: A Survey and Outlook [151.49434496615427]
Vision systems that see and reason about the compositional nature of visual scenes are fundamental to understanding our world.
The models learned to bridge the gap between such modalities coupled with large-scale training data facilitate contextual reasoning, generalization, and prompt capabilities at test time.
The output of such models can be modified through human-provided prompts without retraining, e.g., segmenting a particular object by providing a bounding box, having interactive dialogues by asking questions about an image or video scene or manipulating the robot's behavior through language instructions.
arXiv Detail & Related papers (2023-07-25T17:59:18Z)
- Deep Learning Approaches for Data Augmentation in Medical Imaging: A Review [2.8145809047875066]
We focus on three types of deep generative models for medical image augmentation: variational autoencoders, generative adversarial networks, and diffusion models.
We provide an overview of the current state of the art in each of these models and discuss their potential for use in different downstream tasks in medical imaging, including classification, segmentation, and cross-modal translation.
Our goal is to provide a comprehensive review about the use of deep generative models for medical image augmentation and to highlight the potential of these models for improving the performance of deep learning algorithms in medical image analysis.
arXiv Detail & Related papers (2023-07-24T20:53:59Z)
- Empirical Analysis of a Segmentation Foundation Model in Prostate Imaging [9.99042549094606]
We consider a recently developed foundation model for medical image segmentation, UniverSeg.
We conduct an empirical evaluation study in the context of prostate imaging and compare it against the conventional approach of training a task-specific segmentation model.
arXiv Detail & Related papers (2023-07-06T20:00:52Z)
- On the Challenges and Perspectives of Foundation Models for Medical Image Analysis [17.613533812925635]
Medical foundation models have immense potential in solving a wide range of downstream tasks.
They can help to accelerate the development of accurate and robust models, reduce the large amounts of required labeled data, and preserve the privacy and confidentiality of patient data.
arXiv Detail & Related papers (2023-06-09T06:54:58Z)
- Artificial General Intelligence for Medical Imaging Analysis [92.3940918983821]
Large-scale Artificial General Intelligence (AGI) models have achieved unprecedented success in a variety of general domain tasks.
These models face notable challenges arising from the medical field's inherent complexities and unique characteristics.
This review aims to offer insights into the future implications of AGI in medical imaging, healthcare, and beyond.
arXiv Detail & Related papers (2023-06-08T18:04:13Z)
- Domain Shift in Computer Vision models for MRI data analysis: An Overview [64.69150970967524]
Machine learning and computer vision methods are showing good performance in medical imagery analysis.
Yet only a few applications are now in clinical use.
Poor transferability of the models to data from different sources or acquisition domains is one of the reasons for that.
arXiv Detail & Related papers (2020-10-14T16:34:21Z)
This list is automatically generated from the titles and abstracts of the papers in this site.