Foundation Models in Medical Image Analysis: A Systematic Review and Meta-Analysis
- URL: http://arxiv.org/abs/2510.16973v1
- Date: Sun, 19 Oct 2025 19:19:23 GMT
- Title: Foundation Models in Medical Image Analysis: A Systematic Review and Meta-Analysis
- Authors: Praveenbalaji Rajendran, Mojtaba Safari, Wenfeng He, Mingzhe Hu, Shansong Wang, Jun Zhou, Xiaofeng Yang
- Abstract summary: Foundation models (FMs) have revolutionized medical image analysis, demonstrating strong zero- and few-shot performance across diverse medical imaging tasks. FMs leverage large corpora of labeled and unlabeled multimodal datasets to learn generalized representations. Despite the rapid proliferation of FM research in medical imaging, the field remains fragmented. This review article provides a comprehensive and structured analysis of FMs in medical image analysis.
- Score: 7.905460364844281
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Recent advancements in artificial intelligence (AI), particularly foundation models (FMs), have revolutionized medical image analysis, demonstrating strong zero- and few-shot performance across diverse medical imaging tasks, from segmentation to report generation. Unlike traditional task-specific AI models, FMs leverage large corpora of labeled and unlabeled multimodal datasets to learn generalized representations that can be adapted to various downstream clinical applications with minimal fine-tuning. However, despite the rapid proliferation of FM research in medical imaging, the field remains fragmented, lacking a unified synthesis that systematically maps the evolution of architectures, training paradigms, and clinical applications across modalities. To address this gap, this review article provides a comprehensive and structured analysis of FMs in medical image analysis. We systematically categorize studies into vision-only and vision-language FMs based on their architectural foundations, training strategies, and downstream clinical tasks. Additionally, a quantitative meta-analysis of the studies was conducted to characterize temporal trends in dataset utilization and application domains. We also critically discuss persistent challenges, including domain adaptation, efficient fine-tuning, computational constraints, and interpretability along with emerging solutions such as federated learning, knowledge distillation, and advanced prompting. Finally, we identify key future research directions aimed at enhancing the robustness, explainability, and clinical integration of FMs, thereby accelerating their translation into real-world medical practice.
Related papers
- Adaptation of Foundation Models for Medical Image Analysis: Strategies, Challenges, and Future Directions [4.332241609032423]
Foundation models (FMs) have emerged as a transformative paradigm in medical image analysis. This review presents a comprehensive assessment of strategies for adapting FMs to the specific demands of medical imaging.
arXiv Detail & Related papers (2025-11-03T06:57:42Z)
- MedAlign: A Synergistic Framework of Multimodal Preference Optimization and Federated Meta-Cognitive Reasoning [52.064286116035134]
We develop MedAlign, a framework to ensure visually accurate LVLM responses for Medical Visual Question Answering (Med-VQA). We first propose a multimodal Direct Preference Optimization (mDPO) objective to align preference learning with visual context. We then design a Retrieval-Aware Mixture-of-Experts (RA-MoE) architecture that utilizes image and text similarity to route queries to a specialized and context-augmented LVLM.
arXiv Detail & Related papers (2025-10-24T02:11:05Z)
- Medical Reasoning in the Era of LLMs: A Systematic Review of Enhancement Techniques and Applications [59.721265428780946]
Large Language Models (LLMs) in medicine have enabled impressive capabilities, yet a critical gap remains in their ability to perform systematic, transparent, and verifiable reasoning. This paper provides the first systematic review of this emerging field. We propose a taxonomy of reasoning enhancement techniques, categorized into training-time strategies and test-time mechanisms.
arXiv Detail & Related papers (2025-08-01T14:41:31Z)
- Brain Imaging Foundation Models, Are We There Yet? A Systematic Review of Foundation Models for Brain Imaging and Biomedical Research [6.113042369956893]
Foundation models (FMs) have revolutionized artificial intelligence and shown significant promise in medical imaging. Brain imaging remains underrepresented, despite its critical role in the diagnosis and treatment of neurological diseases. We present the first comprehensive and curated review of FMs for brain imaging.
arXiv Detail & Related papers (2025-06-16T09:46:46Z)
- Vision Foundation Models in Medical Image Analysis: Advances and Challenges [7.224426395050136]
Vision Foundation Models (VFMs) have sparked significant advances in the field of medical image analysis. This paper reviews the state-of-the-art research on the adaptation of VFMs to medical image segmentation. We discuss the latest developments in adapter-based improvements, knowledge distillation techniques, and multi-scale contextual feature modeling.
arXiv Detail & Related papers (2025-02-20T14:13:46Z)
- A Survey on Computational Pathology Foundation Models: Datasets, Adaptation Strategies, and Evaluation Tasks [22.806228975730008]
Computational pathology foundation models (CPathFMs) have emerged as a powerful approach for analyzing histological data. These models have demonstrated promise in automating complex pathology tasks such as segmentation, classification, and biomarker discovery. However, the development of CPathFMs presents significant challenges, such as limited data accessibility, high variability across datasets, and lack of standardized evaluation benchmarks.
arXiv Detail & Related papers (2025-01-27T01:27:59Z)
- Dr-LLaVA: Visual Instruction Tuning with Symbolic Clinical Grounding [53.629132242389716]
Vision-Language Models (VLM) can support clinicians by analyzing medical images and engaging in natural language interactions.
VLMs often exhibit "hallucinogenic" behavior, generating textual outputs not grounded in contextual multimodal information.
We propose a new alignment algorithm that uses symbolic representations of clinical reasoning to ground VLMs in medical knowledge.
arXiv Detail & Related papers (2024-05-29T23:19:28Z)
- fMRI-PTE: A Large-scale fMRI Pretrained Transformer Encoder for Multi-Subject Brain Activity Decoding [54.17776744076334]
We propose fMRI-PTE, an innovative auto-encoder approach for fMRI pre-training.
Our approach involves transforming fMRI signals into unified 2D representations, ensuring consistency in dimensions and preserving brain activity patterns.
Our contributions encompass introducing fMRI-PTE, innovative data transformation, efficient training, a novel learning strategy, and the universal applicability of our approach.
arXiv Detail & Related papers (2023-11-01T07:24:22Z)
- Graph Convolutional Networks for Multi-modality Medical Imaging: Methods, Architectures, and Clinical Applications [13.940158397866625]
The development of graph convolutional networks (GCNs) has spawned a new wave of research in medical imaging analysis, with the overarching goal of improving quantitative disease understanding, monitoring, and diagnosis.
arXiv Detail & Related papers (2022-02-17T22:03:59Z)
- Cross-Modal Information Maximization for Medical Imaging: CMIM [62.28852442561818]
In hospitals, data are siloed to specific information systems that make the same information available under different modalities.
This offers unique opportunities to obtain and use at train-time those multiple views of the same information that might not always be available at test-time.
We propose an innovative framework that makes the most of available data by learning good representations of a multi-modal input that are resilient to modality dropping at test-time.
arXiv Detail & Related papers (2020-10-20T20:05:35Z)
- Domain Shift in Computer Vision models for MRI data analysis: An Overview [64.69150970967524]
Machine learning and computer vision methods are showing good performance in medical imagery analysis.
Yet only a few applications are now in clinical use.
Poor transferability of the models to data from different sources or acquisition domains is one of the reasons for that.
arXiv Detail & Related papers (2020-10-14T16:34:21Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.