MRN: Harnessing 2D Vision Foundation Models for Diagnosing Parkinson's Disease with Limited 3D MR Data
- URL: http://arxiv.org/abs/2509.17566v1
- Date: Mon, 22 Sep 2025 10:59:27 GMT
- Title: MRN: Harnessing 2D Vision Foundation Models for Diagnosing Parkinson's Disease with Limited 3D MR Data
- Authors: Ding Shaodong, Liu Ziyang, Zhou Yijun, Liu Tao,
- Abstract summary: Current clinical practice often relies on diagnostic biomarkers in QSM and NM-MRI images. We address these challenges by leveraging 2D vision foundation models (VFMs). Our approach achieved first place in the MICCAI 2025 PDCADxFoundation challenge, with an accuracy of 86.0% trained on a dataset of only 300 labeled QSM and NM-MRI scans.
- Score: 0.6183104361749774
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The automatic diagnosis of Parkinson's disease is in high clinical demand due to its prevalence and the importance of targeted treatment. Current clinical practice often relies on diagnostic biomarkers in QSM and NM-MRI images. However, the lack of large, high-quality datasets makes training diagnostic models from scratch prone to overfitting. Adapting pre-trained 3D medical models is also challenging, as the diversity of medical imaging leads to mismatches in voxel spacing and modality between pre-training and fine-tuning data. In this paper, we address these challenges by leveraging 2D vision foundation models (VFMs). Specifically, we crop multiple key ROIs from NM and QSM images, process each ROI through separate branches to compress the ROI into a token, and then combine these tokens into a unified patient representation for classification. Within each branch, we use 2D VFMs to encode axial slices of the 3D ROI volume and fuse them into the ROI token, guided by an auxiliary segmentation head that steers the feature extraction toward specific brain nuclei. Additionally, we introduce multi-ROI supervised contrastive learning, which improves diagnostic performance by pulling together representations of patients from the same class while pushing away those from different classes. Our approach achieved first place in the MICCAI 2025 PDCADxFoundation challenge, with an accuracy of 86.0% trained on a dataset of only 300 labeled QSM and NM-MRI scans, outperforming the second-place method by 5.5%. These results highlight the potential of 2D VFMs for clinical analysis of 3D MR images.
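The contrastive objective mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact formulation, so the code below assumes the standard supervised contrastive loss applied to one embedding per patient; the function names, the temperature value, and the toy embeddings are all hypothetical and are not taken from the paper. How the loss is distributed across ROIs in the actual method is not specified here.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length (a zero vector is returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]


def supcon_loss(embeddings, labels, temperature=0.1):
    """Supervised contrastive loss, averaged over anchors with >= 1 positive.

    embeddings: list of per-patient feature vectors (assumed layout).
    labels:     list of class labels, e.g. 0 = control, 1 = PD (assumed).
    """
    z = [l2_normalize(v) for v in embeddings]
    n = len(z)
    total, counted = 0.0, 0
    for i in range(n):
        # Positives share the anchor's label; the anchor itself is excluded.
        positives = [j for j in range(n) if j != i and labels[j] == labels[i]]
        if not positives:
            continue
        # Temperature-scaled cosine similarity to every other sample.
        logits = {
            j: sum(a * b for a, b in zip(z[i], z[j])) / temperature
            for j in range(n)
            if j != i
        }
        # Numerically stable log-sum-exp for the softmax denominator.
        peak = max(logits.values())
        log_denom = peak + math.log(sum(math.exp(s - peak) for s in logits.values()))
        # Mean negative log-probability of picking a positive for this anchor.
        total += -sum(logits[j] - log_denom for j in positives) / len(positives)
        counted += 1
    return total / counted if counted else 0.0


# Toy example: two tight clusters of 2D embeddings.
toy = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
aligned = supcon_loss(toy, [0, 0, 1, 1])   # labels agree with the clusters
shuffled = supcon_loss(toy, [0, 1, 0, 1])  # labels cut across the clusters
print(aligned < shuffled)
```

The loss is small when same-class patients already sit close together and large when they do not, which is the "pull together, push away" behaviour the abstract describes.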
Related papers
- Multimodal Visual Surrogate Compression for Alzheimer's Disease Classification [69.87877580725768]
Multimodal Visual Surrogate Compression (MVSC) learns to compress and adapt large 3D sMRI volumes into compact 2D features. MVSC has two key components: a Volume Context that captures global cross-slice context under textual guidance, and an Adaptive Slice Fusion module that aggregates slice-level information in a text-enhanced, patch-wise manner.
arXiv Detail & Related papers (2026-01-29T13:05:46Z) - Advancing Brain Tumor Segmentation via Attention-based 3D U-Net Architecture and Digital Image Processing [0.0]
This study aims to enhance the performance of brain tumor segmentation, ultimately improving the reliability of diagnosis. The proposed model is thoroughly evaluated and assessed on the BraTS 2020 dataset using various performance metrics to accomplish this goal.
arXiv Detail & Related papers (2025-10-21T22:11:19Z) - Building a General SimCLR Self-Supervised Foundation Model Across Neurological Diseases to Advance 3D Brain MRI Diagnoses [2.4836875944302634]
We present a general, high-resolution SimCLR-based SSL foundation model for 3D brain structural MRI. Our model still achieves superior performance when fine-tuned using only 20% of labeled training samples for predicting Alzheimer's disease.
arXiv Detail & Related papers (2025-09-12T18:05:08Z) - Data-Efficient Fine-Tuning of Vision-Language Models for Diagnosis of Alzheimer's Disease [3.46857682956989]
Medical vision-language models (Med-VLMs) have shown impressive results in tasks such as report generation and visual question answering. Most existing models are typically trained from scratch or fine-tuned on large-scale 2D image-text pairs. We propose a data-efficient fine-tuning pipeline to adapt 3D CT-based Med-VLMs for 3D MRI.
arXiv Detail & Related papers (2025-09-09T11:36:21Z) - M3Ret: Unleashing Zero-shot Multimodal Medical Image Retrieval via Self-Supervision [24.846428105192405]
We train M3Ret, a unified visual encoder, without any modality-specific customization. It successfully learns transferable representations using both generative (MAE) and contrastive (SimDINO) self-supervised learning (SSL) paradigms. Our approach sets a new state-of-the-art in zero-shot image-to-image retrieval across all individual modalities, surpassing strong baselines such as DINOv3 and the text-supervised BMC-CLIP.
arXiv Detail & Related papers (2025-09-01T10:59:39Z) - Abnormality-Driven Representation Learning for Radiology Imaging [0.8321462983924758]
We introduce lesion-enhanced contrastive learning (LeCL), a novel approach to obtain visual representations driven by abnormalities in 2D axial slices across different locations of the CT scans.
We evaluate our approach across three clinical tasks: tumor lesion location, lung disease detection, and patient staging, benchmarking against four state-of-the-art foundation models.
arXiv Detail & Related papers (2024-11-25T13:53:26Z) - 2D and 3D Deep Learning Models for MRI-based Parkinson's Disease Classification: A Comparative Analysis of Convolutional Kolmogorov-Arnold Networks, Convolutional Neural Networks, and Graph Convolutional Networks [0.0]
This study applies Convolutional Kolmogorov-Arnold Networks (ConvKANs) to Parkinson's Disease diagnosis.
ConvKANs, which integrate learnable activation functions into convolutional layers, are used for PD classification from structural MRI.
The first 3D implementation of ConvKANs for medical imaging is presented, comparing their performance to Convolutional Neural Networks (CNNs) and Graph Convolutional Networks (GCNs).
These findings highlight ConvKANs' potential for PD detection, emphasize the importance of 3D analysis in capturing subtle brain changes, and underscore cross-dataset generalization challenges.
arXiv Detail & Related papers (2024-07-24T16:04:18Z) - Multi-modal Masked Siamese Network Improves Chest X-Ray Representation Learning [46.674521557701816]
We propose to incorporate EHR data during self-supervised pretraining with a Masked Siamese Network (MSN) to enhance the quality of chest X-ray representations.
Our work highlights the potential of EHR-enhanced self-supervised pre-training for medical imaging.
arXiv Detail & Related papers (2024-07-05T12:04:12Z) - Super-resolution of biomedical volumes with 2D supervision [84.5255884646906]
Masked slice diffusion for super-resolution (SliceR) exploits the inherent equivalence in the data-generating distribution across all spatial dimensions of biological specimens.
We focus on the application of SliceR to stimulated Raman histology (SRH), characterized by its rapid acquisition of high-resolution 2D images but slow and costly optical z-sectioning.
arXiv Detail & Related papers (2024-04-15T02:41:55Z) - SDR-Former: A Siamese Dual-Resolution Transformer for Liver Lesion Classification Using 3D Multi-Phase Imaging [59.78761085714715]
This study proposes a novel Siamese Dual-Resolution Transformer (SDR-Former) framework for liver lesion classification.
The proposed framework has been validated through comprehensive experiments on two clinical datasets.
To support the scientific community, we are releasing our extensive multi-phase MR dataset for liver lesion analysis to the public.
arXiv Detail & Related papers (2024-02-27T06:32:56Z) - Domain Transfer Through Image-to-Image Translation for Uncertainty-Aware Prostate Cancer Classification [42.75911994044675]
We present a novel approach for unpaired image-to-image translation of prostate MRIs and an uncertainty-aware training approach for classifying clinically significant PCa.
Our approach involves a novel pipeline for translating unpaired 3.0T multi-parametric prostate MRIs to 1.5T, thereby augmenting the available training data.
Our experiments demonstrate that the proposed method significantly improves the Area Under ROC Curve (AUC) by over 20% compared to the previous work.
arXiv Detail & Related papers (2023-07-02T05:26:54Z) - LVM-Med: Learning Large-Scale Self-Supervised Vision Models for Medical
Imaging via Second-order Graph Matching [59.01894976615714]
We introduce LVM-Med, the first family of deep networks trained on large-scale medical datasets.
We have collected approximately 1.3 million medical images from 55 publicly available datasets.
LVM-Med empirically outperforms a number of state-of-the-art supervised, self-supervised, and foundation models.
arXiv Detail & Related papers (2023-06-20T22:21:34Z) - Slice-level Detection of Intracranial Hemorrhage on CT Using Deep
Descriptors of Adjacent Slices [0.31317409221921133]
We propose a new strategy to train slice-level classifiers on CT scans based on the descriptors of the adjacent slices along the axis.
We obtain a single model in the top 4% best-performing solutions of the RSNA Intracranial Hemorrhage dataset challenge.
The proposed method is general and can be applied to other 3D medical diagnosis tasks such as MRI imaging.
arXiv Detail & Related papers (2022-08-05T23:20:37Z) - Fader Networks for domain adaptation on fMRI: ABIDE-II study [68.5481471934606]
We use 3D convolutional autoencoders to build a domain-irrelevant latent-space image representation and demonstrate that this method outperforms existing approaches on ABIDE data.
arXiv Detail & Related papers (2020-10-14T16:50:50Z)