Are Vision Foundation Models Ready for Out-of-the-Box Medical Image Registration?
- URL: http://arxiv.org/abs/2507.11569v1
- Date: Tue, 15 Jul 2025 00:17:14 GMT
- Title: Are Vision Foundation Models Ready for Out-of-the-Box Medical Image Registration?
- Authors: Hanxue Gu, Yaqian Chen, Nicholas Konz, Qihang Li, Maciej A. Mazurowski
- Abstract summary: Foundation models, pre-trained on large image datasets, have recently shown potential for zero-shot image registration. Breast MRI registration is particularly difficult due to significant anatomical variation between patients. Further work is needed to understand how domain-specific training influences registration and to explore strategies that improve both global alignment and fine structure accuracy.
- Score: 2.2269713828088054
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Foundation models, pre-trained on large image datasets and capable of capturing rich feature representations, have recently shown potential for zero-shot image registration. However, their performance has mostly been tested in the context of rigid or less complex structures, such as the brain or abdominal organs, and it remains unclear whether these models can handle more challenging, deformable anatomy. Breast MRI registration is particularly difficult due to significant anatomical variation between patients, deformation caused by patient positioning, and the thin, complex internal structure of fibroglandular tissue, where accurate alignment is crucial. Whether foundation model-based registration algorithms can address this level of complexity remains an open question. In this study, we provide a comprehensive evaluation of foundation model-based registration algorithms for breast MRI. We assess five pre-trained encoders (DINO-v2, SAM, MedSAM, SSLSAM, and MedCLIP) across four key breast registration tasks that capture variations across years and dates, sequences, modalities, and patient disease status (lesion versus no lesion). Our results show that foundation model-based algorithms such as SAM outperform traditional registration baselines for overall breast alignment, especially under large domain shifts, but struggle to capture the fine details of fibroglandular tissue. Interestingly, additional pre-training or fine-tuning on medical or breast-specific images, as in MedSAM and SSLSAM, does not improve registration performance and may even decrease it in some cases. Further work is needed to understand how domain-specific training influences registration and to explore targeted strategies that improve both global alignment and fine structure accuracy. We also publicly release our code on GitHub at https://github.com/mazurowski-lab/Foundation-based-reg.
Related papers
- Beyond the LUMIR challenge: The pathway to foundational registration models [25.05315856123745]
The Large-scale Unsupervised Brain MRI Image Registration (LUMIR) challenge is a next-generation benchmark designed to assess and advance unsupervised brain MRI registration. LUMIR provides over 4,000 preprocessed T1-weighted brain MRIs for training without any label maps, encouraging biologically plausible deformation modeling. A total of 1,158 subjects and over 4,000 image pairs were included for evaluation.
arXiv Detail & Related papers (2025-05-30T03:07:58Z) - Improving Generalization of Medical Image Registration Foundation Model [12.144724550118756]
This paper incorporates Sharpness-Aware Minimization into foundation models to enhance generalization and robustness in medical image registration. Experimental results show that foundation models integrated with SAM achieve significant improvements in cross-dataset registration performance.
arXiv Detail & Related papers (2025-05-10T06:14:09Z) - Medical Image Registration Meets Vision Foundation Model: Prototype Learning and Contour Awareness [11.671950446844356]
Existing deformable registration methods rely solely on intensity-based similarity metrics, lacking explicit anatomical knowledge. We propose a novel SAM-assisted registration framework incorporating prototype learning and contour awareness. Our framework significantly outperforms existing methods across multiple datasets.
arXiv Detail & Related papers (2025-02-17T04:54:47Z) - Class Attention to Regions of Lesion for Imbalanced Medical Image Recognition [59.28732531600606]
We propose a framework named Class Attention to REgions of the lesion (CARE) to handle data imbalance issues.
The CARE framework needs bounding boxes to represent the lesion regions of rare diseases.
Results show that the CARE variants with automated bounding box generation are comparable to the original CARE framework.
arXiv Detail & Related papers (2023-07-19T15:19:02Z) - LVM-Med: Learning Large-Scale Self-Supervised Vision Models for Medical Imaging via Second-order Graph Matching [59.01894976615714]
We introduce LVM-Med, the first family of deep networks trained on large-scale medical datasets.
We have collected approximately 1.3 million medical images from 55 publicly available datasets.
LVM-Med empirically outperforms a number of state-of-the-art supervised, self-supervised, and foundation models.
arXiv Detail & Related papers (2023-06-20T22:21:34Z) - A Deep Discontinuity-Preserving Image Registration Network [73.03885837923599]
Most deep learning-based registration methods assume that the desired deformation fields are globally smooth and continuous.
We propose a weakly-supervised Deep Discontinuity-preserving Image Registration network (DDIR) to obtain better registration performance and realistic deformation fields.
We demonstrate that our method achieves significant improvements in registration accuracy and predicts more realistic deformations, in registration experiments on cardiac magnetic resonance (MR) images.
arXiv Detail & Related papers (2021-07-09T13:35:59Z) - Data-driven generation of plausible tissue geometries for realistic photoacoustic image synthesis [53.65837038435433]
Photoacoustic tomography (PAT) has the potential to recover morphological and functional tissue properties.
We propose a novel approach to PAT data simulation, which we refer to as "learning to simulate".
We leverage the concept of Generative Adversarial Networks (GANs) trained on semantically annotated medical imaging data to generate plausible tissue geometries.
arXiv Detail & Related papers (2021-03-29T11:30:18Z) - Multi-institutional Collaborations for Improving Deep Learning-based Magnetic Resonance Image Reconstruction Using Federated Learning [62.17532253489087]
Deep learning methods have been shown to produce superior performance on MR image reconstruction.
These methods require large amounts of data, which are difficult to collect and share due to the high cost of acquisition and medical data privacy regulations.
We propose a federated learning (FL) based solution in which we take advantage of the MR data available at different institutions while preserving patients' privacy.
arXiv Detail & Related papers (2021-03-03T03:04:40Z) - Few-shot Medical Image Segmentation using a Global Correlation Network with Discriminative Embedding [60.89561661441736]
We propose a novel method for few-shot medical image segmentation.
We construct our few-shot image segmentor using a deep convolutional network trained episodically.
We enhance discriminability of deep embedding to encourage clustering of the feature domains of the same class.
arXiv Detail & Related papers (2020-12-10T04:01:07Z) - Learning Deformable Registration of Medical Images with Anatomical Constraints [4.397224870979238]
Deformable image registration is a fundamental problem in the field of medical image analysis.
We learn global non-linear representations of image anatomy using segmentation masks, and employ them to constrain the registration process.
Our experiments show that the proposed anatomically constrained registration model produces more realistic and accurate results than state-of-the-art methods.
arXiv Detail & Related papers (2020-01-20T17:44:11Z)
This list is automatically generated from the titles and abstracts of the papers in this site.