Towards Generalisable Foundation Models for 3D Brain MRI
- URL: http://arxiv.org/abs/2510.23415v1
- Date: Mon, 27 Oct 2025 15:19:46 GMT
- Title: Towards Generalisable Foundation Models for 3D Brain MRI
- Authors: Moona Mazher, Geoff J. M. Parker, Daniel C. Alexander,
- Abstract summary: We introduce BrainFound, a self-supervised foundation model for brain MRI built by extending DINO-v2.<n>BrainFound adapts DINO-v2 to model full 3D brain anatomy by incorporating information from sequential MRI slices.<n>It supports both single- and multimodal inputs, enabling a broad range of downstream tasks, including disease detection and image segmentation.
- Score: 5.527537739064968
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Foundation models in artificial intelligence (AI) are transforming medical imaging by enabling general-purpose feature learning from large-scale, unlabeled datasets. In this work, we introduce BrainFound, a self-supervised foundation model for brain MRI, built by extending DINO-v2, a vision transformer originally designed for 2D natural images. BrainFound adapts DINO-v2 to model full 3D brain anatomy by incorporating volumetric information from sequential MRI slices, moving beyond conventional single-slice paradigms. It supports both single- and multimodal inputs, enabling a broad range of downstream tasks, including disease detection and image segmentation, while generalising across varied imaging protocols and clinical scenarios. We show that BrainFound consistently outperforms existing self-supervised pretraining strategies and supervised baselines, particularly in label-scarce and multi-contrast settings. By integrating information from diverse 3D MRI modalities (e.g., T1, T2, FLAIR), it enhances diagnostic accuracy and reduces dependency on extensive expert annotations. This flexibility makes BrainFound a scalable and practical solution for 3D neuroimaging pipelines, with significant potential for clinical deployment and research innovation.
Related papers
- Multimodal Visual Surrogate Compression for Alzheimer's Disease Classification [69.87877580725768]
Multimodal Visual Surrogate Compression (MVSC) learns to compress and adapt large 3D sMRI volumes into compact 2D features.<n>MVSC has two key components: a Volume Context that captures global cross-slice context under textual guidance, and an Adaptive Slice Fusion module that aggregates slice-level information in a text-enhanced, patch-wise manner.
arXiv Detail & Related papers (2026-01-29T13:05:46Z) - Building a General SimCLR Self-Supervised Foundation Model Across Neurological Diseases to Advance 3D Brain MRI Diagnoses [2.4836875944302634]
We present a general, high-resolution SimCLR-based SSL foundation model for 3D brain structural MRI.<n>Our model still achieves superior performance when fine-tuned using only 20% of labeled training samples for predicting Alzheimer's disease.
arXiv Detail & Related papers (2025-09-12T18:05:08Z) - Does DINOv3 Set a New Medical Vision Standard? [67.33543059306938]
This report investigates whether DINOv3 can serve as a powerful unified encoder for medical vision tasks without domain-specific pre-training.<n>We benchmark DINOv3 across common medical vision tasks, including 2D/3D classification and segmentation.<n>Remarkably, it can even outperform medical-specific foundation models like BiomedCLIP and CT-Net on several tasks.
arXiv Detail & Related papers (2025-09-08T09:28:57Z) - M3Ret: Unleashing Zero-shot Multimodal Medical Image Retrieval via Self-Supervision [24.846428105192405]
We train M3Ret, a unified visual encoder, without any modality-specific customization.<n>It successfully learns transferable representations using both generative (MAE) and contrastive (SimDINO) self-supervised learning (SSL) paradigms.<n>Our approach sets a new state-of-the-art in zero-shot image-to-image retrieval across all individual modalities, surpassing strong baselines such as DINOv3 and the text-supervised BMC-CLIP.
arXiv Detail & Related papers (2025-09-01T10:59:39Z) - BrainSegFounder: Towards 3D Foundation Models for Neuroimage Segmentation [6.5388528484686885]
This study introduces a novel approach towards the creation of medical foundation models.
Our method involves a novel two-stage pretraining approach using vision transformers.
BrainFounder demonstrates a significant performance gain, surpassing the achievements of previous winning solutions.
arXiv Detail & Related papers (2024-06-14T19:49:45Z) - MindFormer: Semantic Alignment of Multi-Subject fMRI for Brain Decoding [50.55024115943266]
We introduce a novel semantic alignment method of multi-subject fMRI signals using so-called MindFormer.
This model is specifically designed to generate fMRI-conditioned feature vectors that can be used for conditioning Stable Diffusion model for fMRI- to-image generation or large language model (LLM) for fMRI-to-text generation.
Our experimental results demonstrate that MindFormer generates semantically consistent images and text across different subjects.
arXiv Detail & Related papers (2024-05-28T00:36:25Z) - Brain3D: Generating 3D Objects from fMRI [78.46936519561298]
We design a novel 3D object representation learning method, Brain3D, that takes as input the fMRI data of a subject.<n>We show that our model captures the distinct functionalities of each region of human vision system.<n>Preliminary evaluations indicate that Brain3D can successfully identify the disordered brain regions in simulated scenarios.
arXiv Detail & Related papers (2024-05-24T06:06:11Z) - Neuro-Vision to Language: Enhancing Brain Recording-based Visual Reconstruction and Language Interaction [8.63068449082585]
Decoding non-invasive brain recordings is pivotal for advancing our understanding of human cognition.
Our framework integrates 3D brain structures with visual semantics using a Vision Transformer 3D.
We have enhanced the fMRI dataset with diverse fMRI-image-related textual data to support multimodal large model development.
arXiv Detail & Related papers (2024-04-30T10:41:23Z) - MindBridge: A Cross-Subject Brain Decoding Framework [60.58552697067837]
Brain decoding aims to reconstruct stimuli from acquired brain signals.
Currently, brain decoding is confined to a per-subject-per-model paradigm.
We present MindBridge, that achieves cross-subject brain decoding by employing only one model.
arXiv Detail & Related papers (2024-04-11T15:46:42Z) - fMRI-PTE: A Large-scale fMRI Pretrained Transformer Encoder for
Multi-Subject Brain Activity Decoding [54.17776744076334]
We propose fMRI-PTE, an innovative auto-encoder approach for fMRI pre-training.
Our approach involves transforming fMRI signals into unified 2D representations, ensuring consistency in dimensions and preserving brain activity patterns.
Our contributions encompass introducing fMRI-PTE, innovative data transformation, efficient training, a novel learning strategy, and the universal applicability of our approach.
arXiv Detail & Related papers (2023-11-01T07:24:22Z) - 3D Brainformer: 3D Fusion Transformer for Brain Tumor Segmentation [6.127298607534532]
Deep learning has recently emerged to improve brain tumor segmentation.
Transformers have been leveraged to address the limitations of convolutional networks.
We propose a 3D Transformer-based segmentation approach.
arXiv Detail & Related papers (2023-04-28T02:11:29Z) - Medical Transformer: Universal Brain Encoder for 3D MRI Analysis [1.6287500717172143]
Existing 3D-based methods have transferred the pre-trained models to downstream tasks.
They demand a massive amount of parameters to train the model for 3D medical imaging.
We propose a novel transfer learning framework, called Medical Transformer, that effectively models 3D volumetric images in the form of a sequence of 2D image slices.
arXiv Detail & Related papers (2021-04-28T08:34:21Z) - Hierarchical Amortized Training for Memory-efficient High Resolution 3D
GAN [52.851990439671475]
We propose a novel end-to-end GAN architecture that can generate high-resolution 3D images.
We achieve this goal by using different configurations between training and inference.
Experiments on 3D thorax CT and brain MRI demonstrate that our approach outperforms state of the art in image generation.
arXiv Detail & Related papers (2020-08-05T02:33:04Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.