Foundational Models in Medical Imaging: A Comprehensive Survey and
Future Vision
- URL: http://arxiv.org/abs/2310.18689v1
- Date: Sat, 28 Oct 2023 12:08:12 GMT
- Title: Foundational Models in Medical Imaging: A Comprehensive Survey and
Future Vision
- Authors: Bobby Azad, Reza Azad, Sania Eskandari, Afshin Bozorgpour, Amirhossein
Kazerouni, Islem Rekik, Dorit Merhof
- Abstract summary: Foundation models are large-scale, pre-trained deep-learning models adapted to a wide range of downstream tasks.
These models facilitate contextual reasoning, generalization, and prompt capabilities at test time.
Capitalizing on the advances in computer vision, medical imaging has also seen a growing interest in these models.
- Score: 6.2847894163744105
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Foundation models, i.e., large-scale, pre-trained deep-learning
models adapted to a wide range of downstream tasks, have gained significant
interest lately, and many deep-learning problems are undergoing a paradigm
shift with the rise of these models. Trained on large-scale datasets to bridge
the gap between different modalities, foundation models facilitate contextual
reasoning, generalization, and prompt capabilities at test time. The
predictions of these models can be adjusted for new tasks by augmenting the
model input with task-specific hints called prompts, without requiring
extensive labeled data or retraining. Capitalizing on the advances in computer
vision, the medical imaging community has also shown a growing interest in
these models. To assist researchers in
navigating this direction, this survey intends to provide a comprehensive
overview of foundation models in the domain of medical imaging. Specifically,
we initiate our exploration by providing an exposition of the fundamental
concepts forming the basis of foundation models. Subsequently, we offer a
methodical taxonomy of foundation models within the medical domain, proposing a
classification system primarily structured around training strategies, while
also incorporating additional facets such as application domains, imaging
modalities, specific organs of interest, and the algorithms integral to these
models. Furthermore, we emphasize the practical use cases of some selected
approaches and then discuss the opportunities, applications, and future
directions of these large-scale pre-trained models for analyzing medical
images. In the same vein, we address the prevailing challenges and research
pathways associated with foundational models in medical imaging. These
encompass the areas of interpretability, data management, computational
requirements, and the nuanced issue of contextual comprehension.
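The prompt-based adaptation described in the abstract, steering a frozen pre-trained model with a test-time hint instead of retraining it, can be illustrated with the Segment Anything Model (SAM), which also appears among the related papers below. The snippet is a minimal sketch only: it assumes the publicly released segment_anything package, a locally downloaded ViT-B checkpoint (the file name is a placeholder), and a dummy array standing in for a medical image slice; it illustrates prompt-driven inference in general, not the survey authors' own method.

```python
# Minimal sketch of prompt-driven inference with a frozen foundation model
# (Segment Anything, ViT-B). Checkpoint path, image, and box coordinates are
# placeholders; no weights are updated at any point.
import numpy as np
from segment_anything import sam_model_registry, SamPredictor

# Load a pre-trained SAM backbone; the model stays frozen at test time.
sam = sam_model_registry["vit_b"](checkpoint="sam_vit_b_01ec64.pth")
predictor = SamPredictor(sam)

# A CT/MRI slice converted to a 3-channel uint8 RGB array of shape (H, W, 3).
image = np.random.randint(0, 255, size=(512, 512, 3), dtype=np.uint8)
predictor.set_image(image)  # computes the image embedding once

# The "prompt": a rough bounding box around the structure of interest,
# given in XYXY pixel coordinates.
box_prompt = np.array([120, 140, 360, 400])

masks, scores, _ = predictor.predict(box=box_prompt, multimask_output=False)
print(masks.shape, scores)  # (1, 512, 512) boolean mask and its confidence
```

Changing only the box prompt re-targets the same frozen weights to a different structure, which is the test-time adaptability without labeled data or retraining that the abstract refers to.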
Related papers
- Computation-Efficient Era: A Comprehensive Survey of State Space Models in Medical Image Analysis [8.115549269867403]
State Space Models (SSMs) have garnered immense interest lately in sequential modeling and visual representation learning.
Capitalizing on the advances in computer vision, Mamba models have heralded a new epoch in medical imaging.
arXiv Detail & Related papers (2024-06-05T16:29:03Z)
- Medical Vision-Language Pre-Training for Brain Abnormalities [96.1408455065347]
We show how to automatically collect medical image-text aligned data for pretraining from public resources such as PubMed.
In particular, we present a pipeline that streamlines the pre-training process by initially collecting a large brain image-text dataset.
We also investigate the unique challenge of mapping subfigures to subcaptions in the medical domain.
arXiv Detail & Related papers (2024-04-27T05:03:42Z)
- Beyond Pixel-Wise Supervision for Medical Image Segmentation: From Traditional Models to Foundation Models [7.987836953849249]
Existing segmentation algorithms mostly rely on the availability of fully annotated images with pixel-wise annotations for training.
To alleviate this challenge, there has been a growing focus on developing segmentation methods that can train deep models with weak annotations.
The emergence of vision foundation models, notably the Segment Anything Model (SAM), has introduced innovative capabilities for segmentation tasks using weak annotations.
arXiv Detail & Related papers (2024-04-20T02:40:49Z)
- OpenMEDLab: An Open-source Platform for Multi-modality Foundation Models in Medicine [55.29668193415034]
We present OpenMEDLab, an open-source platform for multi-modality foundation models.
It encapsulates solutions of pioneering attempts in prompting and fine-tuning large language and vision models for frontline clinical and bioinformatic applications.
It opens access to a group of pre-trained foundation models for various medical image modalities, clinical text, protein engineering, etc.
arXiv Detail & Related papers (2024-02-28T03:51:02Z)
- Masked Modeling for Self-supervised Representation Learning on Vision and Beyond [69.64364187449773]
Masked modeling has emerged as a distinctive approach that involves predicting parts of the original data that are proportionally masked during training.
We elaborate on the details of techniques within masked modeling, including diverse masking strategies, recovering targets, network architectures, and more.
We conclude by discussing the limitations of current techniques and point out several potential avenues for advancing masked modeling research (a minimal sketch of the masking-and-reconstruction objective follows the related-papers list below).
arXiv Detail & Related papers (2023-12-31T12:03:21Z)
- Foundational Models Defining a New Era in Vision: A Survey and Outlook [151.49434496615427]
Vision systems to see and reason about the compositional nature of visual scenes are fundamental to understanding our world.
The models learned to bridge the gap between such modalities coupled with large-scale training data facilitate contextual reasoning, generalization, and prompt capabilities at test time.
The output of such models can be modified through human-provided prompts without retraining, e.g., segmenting a particular object by providing a bounding box, having interactive dialogues by asking questions about an image or video scene or manipulating the robot's behavior through language instructions.
arXiv Detail & Related papers (2023-07-25T17:59:18Z)
- Deep Learning Approaches for Data Augmentation in Medical Imaging: A Review [2.8145809047875066]
We focus on three types of deep generative models for medical image augmentation: variational autoencoders, generative adversarial networks, and diffusion models.
We provide an overview of the current state of the art in each of these models and discuss their potential for use in different downstream tasks in medical imaging, including classification, segmentation, and cross-modal translation.
Our goal is to provide a comprehensive review about the use of deep generative models for medical image augmentation and to highlight the potential of these models for improving the performance of deep learning algorithms in medical image analysis.
arXiv Detail & Related papers (2023-07-24T20:53:59Z)
- Empirical Analysis of a Segmentation Foundation Model in Prostate Imaging [9.99042549094606]
We consider a recently developed foundation model for medical image segmentation, UniverSeg.
We conduct an empirical evaluation study in the context of prostate imaging and compare it against the conventional approach of training a task-specific segmentation model.
arXiv Detail & Related papers (2023-07-06T20:00:52Z)
- On the Challenges and Perspectives of Foundation Models for Medical Image Analysis [17.613533812925635]
Medical foundation models have immense potential in solving a wide range of downstream tasks.
They can help to accelerate the development of accurate and robust models, reduce the amount of required labeled data, and preserve the privacy and confidentiality of patient data.
arXiv Detail & Related papers (2023-06-09T06:54:58Z)
- Artificial General Intelligence for Medical Imaging Analysis [92.3940918983821]
Large-scale Artificial General Intelligence (AGI) models have achieved unprecedented success in a variety of general domain tasks.
These models face notable challenges arising from the medical field's inherent complexities and unique characteristics.
This review aims to offer insights into the future implications of AGI in medical imaging, healthcare, and beyond.
arXiv Detail & Related papers (2023-06-08T18:04:13Z)
- Domain Shift in Computer Vision models for MRI data analysis: An Overview [64.69150970967524]
Machine learning and computer vision methods are showing good performance in medical imagery analysis.
Yet only a few applications are now in clinical use.
Poor transferability of the models to data from different sources or acquisition domains is one of the reasons for that.
arXiv Detail & Related papers (2020-10-14T16:34:21Z)
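The masked-modeling survey listed above describes predicting parts of the input that are proportionally masked during training. The sketch below, referenced from that entry, shows the core objective under illustrative assumptions: PyTorch, a toy two-layer network standing in for a ViT encoder/decoder, a 75% mask ratio, and random tensors in place of real images.

```python
# Minimal sketch of a masked-modeling objective: hide a fixed proportion of
# image patches and train a network to reconstruct the pixels of the hidden
# patches only. The tiny model and 75% ratio are illustrative choices.
import torch
import torch.nn as nn

patch, ratio = 16, 0.75                 # patch size and mask ratio
imgs = torch.randn(8, 3, 224, 224)      # a batch of unlabeled images

# Split images into non-overlapping patches: (B, N, patch*patch*C).
B, C, H, W = imgs.shape
N = (H // patch) * (W // patch)
tokens = imgs.unfold(2, patch, patch).unfold(3, patch, patch)
tokens = tokens.permute(0, 2, 3, 1, 4, 5).reshape(B, N, C * patch * patch)

# Randomly choose which patches are masked (hidden from the model).
mask = torch.rand(B, N) < ratio         # True = masked

# Toy encoder/decoder standing in for a ViT backbone.
dim = C * patch * patch
model = nn.Sequential(nn.Linear(dim, 256), nn.GELU(), nn.Linear(256, dim))

# The model sees zeroed-out masked patches and must predict their pixels.
visible = tokens.masked_fill(mask.unsqueeze(-1), 0.0)
pred = model(visible)

# The reconstruction loss is computed on the masked positions only.
loss = ((pred - tokens) ** 2).mean(dim=-1)[mask].mean()
loss.backward()
print(float(loss))
```

The masking strategies, recovery targets, and architectures surveyed in that paper are variations on this same masking-and-reconstruction recipe.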
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.