Large-scale Multi-sequence Pretraining for Generalizable MRI Analysis in Versatile Clinical Applications
- URL: http://arxiv.org/abs/2508.07165v2
- Date: Mon, 25 Aug 2025 16:25:14 GMT
- Title: Large-scale Multi-sequence Pretraining for Generalizable MRI Analysis in Versatile Clinical Applications
- Authors: Zelin Qiu, Xi Wang, Zhuoyao Xie, Juan Zhou, Yu Wang, Lingjie Yang, Xinrui Jiang, Juyoung Bae, Moo Hyun Son, Qiang Ye, Dexuan Chen, Rui Zhang, Tao Li, Neeraj Ramesh Mahboobani, Varut Vardhanabhuti, Xiaohui Duan, Yinghua Zhao, Hao Chen
- Abstract summary: In this study, we present PRISM, a foundation model PRe-trained with large-scale multI-Sequence MRI. We propose a novel pretraining paradigm that disentangles anatomically invariant features from sequence-specific variations in MRI. PRISM consistently outperformed both non-pretrained models and existing foundation models.
- Score: 15.846703688846086
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Multi-sequence Magnetic Resonance Imaging (MRI) offers remarkable versatility, enabling the distinct visualization of different tissue types. Nevertheless, the inherent heterogeneity among MRI sequences poses significant challenges to the generalization capability of deep learning models. These challenges undermine model performance when faced with varying acquisition parameters, thereby severely restricting their clinical utility. In this study, we present PRISM, a foundation model PRe-trained with large-scale multI-Sequence MRI. We collected a total of 64 datasets from both public and private sources, encompassing a wide range of whole-body anatomical structures, with scans spanning diverse MRI sequences. Among them, 336,476 volumetric MRI scans from 34 datasets (8 public and 26 private) were curated to construct the largest multi-organ multi-sequence MRI pretraining corpus to date. We propose a novel pretraining paradigm that disentangles anatomically invariant features from sequence-specific variations in MRI, while preserving high-level semantic representations. We established a benchmark comprising 44 downstream tasks, including disease diagnosis, image segmentation, registration, progression prediction, and report generation. These tasks were evaluated on 32 public datasets and 5 private cohorts. PRISM consistently outperformed both non-pretrained models and existing foundation models, achieving first-rank results in 39 out of 44 downstream benchmarks with statistically significant improvements. These results underscore its ability to learn robust and generalizable representations across unseen data acquired under diverse MRI protocols. PRISM provides a scalable framework for multi-sequence MRI analysis, thereby enhancing the translational potential of AI in radiology. It delivers consistent performance across diverse imaging protocols, reinforcing its clinical applicability.
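The abstract does not spell out PRISM's training objective, so the following is only a rough NumPy sketch of the general disentanglement idea it describes: a shared anatomy branch whose features are pulled together across two sequences of the same subject, while a sequence-specific branch (omitted here) would be left free to absorb contrast differences. All names, dimensions, and the cosine-alignment loss are illustrative assumptions, not the paper's method.

```python
import numpy as np

rng = np.random.default_rng(0)

def encode(x, w):
    """Toy linear 'encoder' with a tanh nonlinearity (stand-in for a deep net)."""
    return np.tanh(x @ w)

def l2_normalize(z, eps=1e-8):
    return z / (np.linalg.norm(z, axis=-1, keepdims=True) + eps)

def alignment_loss(z_a, z_b):
    """Mean cosine distance between paired feature rows.

    Pulls anatomy features of two sequences of the SAME subject together;
    a sequence-specific branch would be left unconstrained.
    """
    z_a, z_b = l2_normalize(z_a), l2_normalize(z_b)
    return float(np.mean(1.0 - np.sum(z_a * z_b, axis=-1)))

d_in, d_anat = 64, 16
w_anat = rng.normal(scale=0.1, size=(d_in, d_anat))  # shared anatomy branch

subjects = rng.normal(size=(4, d_in))                   # 4 flattened "scans"
t1 = subjects + 0.05 * rng.normal(size=subjects.shape)  # same anatomy, view 1
t2 = subjects + 0.05 * rng.normal(size=subjects.shape)  # same anatomy, view 2

anat_t1 = encode(t1, w_anat)
anat_t2 = encode(t2, w_anat)
loss = alignment_loss(anat_t1, anat_t2)  # near 0: paired anatomy features agree
```

Aligning same-subject features across sequences is what makes the representation invariant to acquisition protocol; mismatched subject pairs would score a much larger cosine distance.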
Related papers
- Enabling Ultra-Fast Cardiovascular Imaging Across Heterogeneous Clinical Environments with a Generalist Foundation Model and Multimodal Database [64.65360708629485]
MMCMR-427K is the largest and most comprehensive multimodal cardiovascular magnetic resonance k-space database. CardioMM is a reconstruction foundation model capable of adapting to heterogeneous fast CMR imaging scenarios. CardioMM unifies semantic contextual understanding with physics-informed data consistency to deliver robust reconstructions.
arXiv Detail & Related papers (2025-12-25T12:47:50Z) - Decipher-MR: A Vision-Language Foundation Model for 3D MRI Representations [12.805804608410739]
We present Decipher-MR, a 3D MRI-specific vision-language foundation model trained on a large-scale dataset.<n>Decipher-MR integrates self-supervised vision learning with report-guided text supervision to build robust, generalizable representations.<n>Our results establish Decipher-MR as a scalable and versatile foundation for MRI-based AI, facilitating efficient development across clinical and research domains.
arXiv Detail & Related papers (2025-09-25T14:43:33Z) - OmniMRI: A Unified Vision-Language Foundation Model for Generalist MRI Interpretation [5.3427577036717]
We introduce OmniMRI, a unified vision-language foundation model designed to generalize across the entire MRI workflow. OmniMRI is trained on a large-scale, heterogeneous corpus curated from 60 public datasets. Results demonstrate OmniMRI's ability to perform diverse tasks within a single architecture.
arXiv Detail & Related papers (2025-08-24T21:11:28Z) - MRI-CORE: A Foundation Model for Magnetic Resonance Imaging [10.722046937558627]
We introduce MRI-CORE, a vision foundation model trained using more than 6 million slices from over 110 thousand MRI volumes across 18 body locations. Our experiments show notable improvements in performance over state-of-the-art methods in 13 data-restricted segmentation tasks, as well as in image classification and zero-shot segmentation. We also present data on which strategies yield the most useful foundation models, and a novel analysis relating similarity between pre-training and downstream task data to transfer learning performance.
arXiv Detail & Related papers (2025-06-13T19:26:56Z) - Towards a general-purpose foundation model for fMRI analysis [58.06455456423138]
We introduce NeuroSTORM, a framework that learns from 4D fMRI volumes and enables efficient knowledge transfer across diverse applications. NeuroSTORM is pre-trained on 28.65 million fMRI frames (>9,000 hours) from over 50,000 subjects across multiple centers and ages 5 to 100. It outperforms existing methods across five tasks: age/gender prediction, phenotype prediction, disease diagnosis, fMRI-to-image retrieval, and task-based fMRI.
arXiv Detail & Related papers (2025-06-11T23:51:01Z) - MRGen: Segmentation Data Engine for Underrepresented MRI Modalities [59.61465292965639]
Training medical image segmentation models for rare yet clinically important imaging modalities is challenging due to the scarcity of annotated data. This paper investigates leveraging generative models to synthesize data for training segmentation models for underrepresented modalities. We present MRGen, a data engine for controllable medical image synthesis conditioned on text prompts and segmentation masks.
arXiv Detail & Related papers (2024-12-04T16:34:22Z) - Towards General Text-guided Image Synthesis for Customized Multimodal Brain MRI Generation [51.28453192441364]
Multimodal brain magnetic resonance (MR) imaging is indispensable in neuroscience and neurology.
Current MR image synthesis approaches are typically trained on independent datasets for specific tasks.
We present TUMSyn, a Text-guided Universal MR image Synthesis model, which can flexibly generate brain MR images.
arXiv Detail & Related papers (2024-09-25T11:14:47Z) - Automated Classification of Body MRI Sequence Type Using Convolutional Neural Networks [7.734037486455235]
We propose an automated method to classify the 3D MRI sequence acquired at the levels of the chest, abdomen, and pelvis.
To the best of our knowledge, we are the first to develop an automated method for the 3D classification of MRI sequences in the chest, abdomen, and pelvis.
arXiv Detail & Related papers (2024-02-12T22:34:57Z) - SegmentAnyBone: A Universal Model that Segments Any Bone at Any Location on MRI [13.912230325828943]
We propose a versatile, publicly available deep-learning model for bone segmentation in MRI across multiple standard MRI locations.
The proposed model can operate in two modes: fully automated segmentation and prompt-based segmentation.
Our contributions include (1) collecting and annotating a new MRI dataset across various MRI protocols, encompassing over 300 annotated volumes and 8485 annotated slices across diverse anatomic regions.
arXiv Detail & Related papers (2024-01-23T18:59:25Z) - Source-Free Collaborative Domain Adaptation via Multi-Perspective Feature Enrichment for Functional MRI Analysis [55.03872260158717]
Resting-state functional MRI (rs-fMRI) is increasingly employed in multi-site research to aid neurological disorder analysis.
Many methods have been proposed to reduce fMRI heterogeneity between source and target domains.
However, acquiring source data is often challenging due to privacy concerns and/or data storage burdens in multi-site studies.
We design a source-free collaborative domain adaptation framework for fMRI analysis, where only a pretrained source model and unlabeled target data are accessible.
arXiv Detail & Related papers (2023-08-24T01:30:18Z) - Synthesis-based Imaging-Differentiation Representation Learning for Multi-Sequence 3D/4D MRI [16.725225424047256]
We propose a sequence-to-sequence generation framework (Seq2Seq) for imaging-differentiation representation learning.
In this study, we not only propose arbitrary 3D/4D sequence generation within one model, generating any specified target sequence, but also rank the importance of each sequence.
We conduct extensive experiments using three datasets including a toy dataset of 20,000 simulated subjects, a brain MRI dataset of 1,251 subjects, and a breast MRI dataset of 2,101 subjects.
arXiv Detail & Related papers (2023-02-01T15:37:35Z) - Integrative Imaging Informatics for Cancer Research: Workflow Automation for Neuro-oncology (I3CR-WANO) [0.12175619840081271]
We propose an artificial intelligence-based solution for the aggregation and processing of multi-sequence neuro-oncology MRI data.
Our end-to-end framework i) classifies MRI sequences using an ensemble classifier, ii) preprocesses the data in a reproducible manner, and iii) delineates tumor tissue subtypes.
It is robust to missing sequences and adopts an expert-in-the-loop approach, where the segmentation results may be manually refined by radiologists.
arXiv Detail & Related papers (2022-10-06T18:23:42Z) - Modality Completion via Gaussian Process Prior Variational Autoencoders for Multi-Modal Glioma Segmentation [75.58395328700821]
We propose a novel model, Multi-modal Gaussian Process Prior Variational Autoencoder (MGP-VAE), to impute one or more missing sub-modalities for a patient scan.
MGP-VAE leverages a Gaussian Process (GP) prior on the Variational Autoencoder (VAE) to exploit correlations across subjects/patients and sub-modalities.
We show the applicability of MGP-VAE to brain tumor segmentation where one, two, or three of the four sub-modalities may be missing.
arXiv Detail & Related papers (2021-07-07T19:06:34Z) - MS-Net: Multi-Site Network for Improving Prostate Segmentation with Heterogeneous MRI Data [75.73881040581767]
We propose a novel multi-site network (MS-Net) for improving prostate segmentation by learning robust representations.
Our MS-Net improves the performance across all datasets consistently, and outperforms state-of-the-art methods for multi-site learning.
arXiv Detail & Related papers (2020-02-09T14:11:50Z)
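Several of the papers above (notably MGP-VAE) impute missing MRI sub-modalities by exploiting correlations between the ones that were acquired. A toy NumPy sketch of that core idea, using a plain conditional-Gaussian imputer rather than the papers' actual models; the sub-modality names, noise level, and dimensions are all illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy stand-in for 4 MRI sub-modalities (think T1, T1ce, T2, FLAIR) measured
# per voxel, generated to be correlated through a shared latent "tissue" value.
n = 2000
latent = rng.normal(size=(n, 1))
X = latent @ np.ones((1, 4)) + 0.3 * rng.normal(size=(n, 4))

mu = X.mean(axis=0)
S = np.cov(X, rowvar=False)

obs, miss = [0, 1, 2], [3]  # pretend the 4th sub-modality is missing

# Conditional-Gaussian imputation: E[x_m | x_o] = mu_m + S_mo S_oo^-1 (x_o - mu_o)
W = S[np.ix_(miss, obs)] @ np.linalg.inv(S[np.ix_(obs, obs)])
X_hat = mu[miss] + (X[:, obs] - mu[obs]) @ W.T

mae_imputed = float(np.mean(np.abs(X_hat[:, 0] - X[:, 3])))
mae_mean_fill = float(np.mean(np.abs(mu[3] - X[:, 3])))  # naive baseline
```

Because the sub-modalities share a latent cause, conditioning on the observed three recovers the missing one far better than filling in its marginal mean, which is the same leverage the VAE-based approaches obtain with learned, nonlinear correlations.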
This list is automatically generated from the titles and abstracts of the papers on this site.