SAMIR, an efficient registration framework via robust feature learning from SAM
- URL: http://arxiv.org/abs/2509.13629v1
- Date: Wed, 17 Sep 2025 01:56:35 GMT
- Title: SAMIR, an efficient registration framework via robust feature learning from SAM
- Authors: Yue He, Min Liu, Qinghao Liu, Jiazheng Wang, Yaonan Wang, Hang Zhang, Xiang Chen,
- Abstract summary: This paper introduces SAMIR, an efficient medical image registration framework. SAM is pretrained on large-scale natural image datasets and can learn robust, general-purpose visual representations. We show that SAMIR significantly outperforms state-of-the-art methods on benchmark datasets for both intra-subject cardiac image registration and inter-subject abdomen CT image registration.
- Score: 40.09295562721889
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Image registration is a fundamental task in medical image analysis. Deformations are often closely related to the morphological characteristics of tissues, making accurate feature extraction crucial. Recent weakly supervised methods improve registration by incorporating anatomical priors such as segmentation masks or landmarks, either as inputs or in the loss function. However, such weak labels are often not readily available, limiting their practical use. Motivated by the strong representation learning ability of visual foundation models, this paper introduces SAMIR, an efficient medical image registration framework that utilizes the Segment Anything Model (SAM) to enhance feature extraction. SAM is pretrained on large-scale natural image datasets and can learn robust, general-purpose visual representations. Rather than using raw input images, we design a task-specific adaptation pipeline using SAM's image encoder to extract structure-aware feature embeddings, enabling more accurate modeling of anatomical consistency and deformation patterns. We further design a lightweight 3D head to refine features within the embedding space, adapting to local deformations in medical images. Additionally, we introduce a Hierarchical Feature Consistency Loss to guide coarse-to-fine feature matching and improve anatomical alignment. Extensive experiments demonstrate that SAMIR significantly outperforms state-of-the-art methods on benchmark datasets for both intra-subject cardiac image registration and inter-subject abdomen CT image registration, achieving performance improvements of 2.68% on ACDC and 6.44% on the abdomen dataset. The source code will be publicly available on GitHub following the acceptance of this paper.
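The abstract describes a Hierarchical Feature Consistency Loss that guides coarse-to-fine feature matching between the warped moving image and the fixed image. The paper's exact formulation is not given here, but one common way to realize such a loss is a weighted sum of per-level feature dissimilarities; the sketch below assumes cosine dissimilarity and illustrative names (`feats_warped`, `feats_fixed`, `level_weights`), none of which come from the paper.

```python
# Hypothetical sketch of a hierarchical feature consistency loss.
# Assumes lists of per-level feature maps of shape (C, D, H, W) extracted
# from the warped moving image and the fixed image, e.g. by SAM's encoder
# plus a lightweight 3D head. Not the authors' actual implementation.
import numpy as np

def cosine_dissimilarity(a, b, eps=1e-8):
    """Per-voxel cosine dissimilarity between two feature maps (C, D, H, W)."""
    num = np.sum(a * b, axis=0)
    den = np.linalg.norm(a, axis=0) * np.linalg.norm(b, axis=0) + eps
    return 1.0 - num / den

def hierarchical_feature_consistency_loss(feats_warped, feats_fixed, level_weights):
    """Weighted sum of mean feature dissimilarity over coarse-to-fine levels."""
    total = 0.0
    for fw, ff, w in zip(feats_warped, feats_fixed, level_weights):
        total += w * cosine_dissimilarity(fw, ff).mean()
    return total
```

Weighting coarse levels more heavily encourages global anatomical alignment first, letting finer levels refine local deformations.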
Related papers
- Improving Generalization of Medical Image Registration Foundation Model [12.144724550118756]
This paper incorporates Sharpness-Aware Minimization (SAM) into foundation models to enhance generalization and robustness in medical image registration. Experimental results show that foundation models integrated with Sharpness-Aware Minimization achieve significant improvements in cross-dataset registration performance.
arXiv Detail & Related papers (2025-05-10T06:14:09Z)
- Adapting a Segmentation Foundation Model for Medical Image Classification [13.711279542090043]
We introduce a new framework to adapt the Segment Anything Model (SAM) for medical image classification. First, we utilize the SAM image encoder as a feature extractor to capture segmentation-based features. Next, we propose a novel Spatially Localized Channel Attention (SLCA) mechanism to compute spatially localized attention weights for the feature maps.
arXiv Detail & Related papers (2025-05-09T17:51:51Z)
- PathSegDiff: Pathology Segmentation using Diffusion model representations [63.20694440934692]
We propose PathSegDiff, a novel approach for histopathology image segmentation that leverages Latent Diffusion Models (LDMs) as pre-trained feature extractors. Our method utilizes a pathology-specific LDM, guided by a self-supervised encoder, to extract rich semantic information from H&E-stained histopathology images. Our experiments demonstrate significant improvements over traditional methods on the BCSS and GlaS datasets.
arXiv Detail & Related papers (2025-04-09T14:58:21Z)
- IMPACT: A Generic Semantic Loss for Multimodal Medical Image Registration [0.46904601975060667]
IMPACT (Image Metric with Pretrained model-Agnostic Comparison for Transmodality registration) is a novel similarity metric designed for robust multimodal image registration. It defines a semantic similarity measure based on the comparison of deep features extracted from large-scale pretrained segmentation models. It was evaluated on five challenging 3D registration tasks involving thoracic CT/CBCT and pelvic MR/CT datasets.
arXiv Detail & Related papers (2025-03-31T14:08:21Z)
- Medical Image Registration Meets Vision Foundation Model: Prototype Learning and Contour Awareness [11.671950446844356]
Existing deformable registration methods rely solely on intensity-based similarity metrics, lacking explicit anatomical knowledge. We propose a novel SAM-assisted registration framework incorporating prototype learning and contour awareness. Our framework significantly outperforms existing methods across multiple datasets.
arXiv Detail & Related papers (2025-02-17T04:54:47Z)
- LDM-Morph: Latent diffusion model guided deformable image registration [2.8195553455247317]
We propose LDM-Morph, an unsupervised deformable registration algorithm for medical image registration.
LDM-Morph integrates features extracted from the latent diffusion model (LDM) to enrich the semantic information.
Extensive experiments on four public 2D cardiac image datasets show that the proposed LDM-Morph framework outperforms existing state-of-the-art CNN- and Transformer-based registration methods.
arXiv Detail & Related papers (2024-11-23T03:04:36Z)
- MA-SAM: Modality-agnostic SAM Adaptation for 3D Medical Image Segmentation [58.53672866662472]
We introduce a modality-agnostic SAM adaptation framework, named MA-SAM.
Our method is rooted in a parameter-efficient fine-tuning strategy that updates only a small portion of weight increments.
By injecting a series of 3D adapters into the transformer blocks of the image encoder, our method enables the pre-trained 2D backbone to extract third-dimensional information from input data.
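The 3D-adapter idea above is typically realized as a small bottleneck module with a residual connection, inserted into each transformer block of the frozen 2D encoder. The sketch below is only an illustration of that pattern; all names, shapes, and the depth-mixing step are assumptions, not MA-SAM's actual implementation.

```python
# Hypothetical sketch of a bottleneck adapter that lets a pre-trained 2D
# backbone mix information along the third (depth) dimension.
import numpy as np

class Adapter3D:
    """Bottleneck adapter: down-project, mix along depth, up-project, add residual."""
    def __init__(self, dim, bottleneck, rng):
        self.w_down = rng.standard_normal((dim, bottleneck)) * 0.02
        self.w_up = np.zeros((bottleneck, dim))  # zero-init: adapter starts as identity

    def __call__(self, x):
        # x: (depth, tokens, dim) -- per-slice token features from the 2D backbone
        h = np.maximum(x @ self.w_down, 0.0)  # down-projection + ReLU
        # Running mean over depth as a cheap stand-in for cross-slice mixing
        # (real adapters would use learned 3D convolutions here).
        h = np.cumsum(h, axis=0) / np.arange(1, x.shape[0] + 1)[:, None, None]
        return x + h @ self.w_up              # residual connection
```

Zero-initializing the up-projection means the adapter initially leaves the pre-trained features untouched, so fine-tuning starts from the backbone's original behavior.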
arXiv Detail & Related papers (2023-09-16T02:41:53Z)
- Attentive Symmetric Autoencoder for Brain MRI Segmentation [56.02577247523737]
We propose a novel Attentive Symmetric Auto-encoder based on Vision Transformer (ViT) for 3D brain MRI segmentation tasks.
In the pre-training stage, the proposed auto-encoder pays more attention to reconstructing informative patches according to gradient metrics.
Experimental results show that our proposed attentive symmetric auto-encoder outperforms the state-of-the-art self-supervised learning methods and medical image segmentation models.
arXiv Detail & Related papers (2022-09-19T09:43:19Z)
- Learning Deformable Image Registration from Optimization: Perspective, Modules, Bilevel Training and Beyond [62.730497582218284]
We develop a new deep learning based framework to optimize a diffeomorphic model via multi-scale propagation.
We conduct two groups of image registration experiments on 3D volume datasets including image-to-atlas registration on brain MRI data and image-to-image registration on liver CT data.
arXiv Detail & Related papers (2020-04-30T03:23:45Z)
- Pathological Retinal Region Segmentation From OCT Images Using Geometric Relation Based Augmentation [84.7571086566595]
We propose improvements over previous GAN-based medical image synthesis methods by jointly encoding the intrinsic relationship of geometry and shape.
The proposed method outperforms state-of-the-art segmentation methods on the public RETOUCH dataset, which contains images captured with different acquisition procedures.
arXiv Detail & Related papers (2020-03-31T11:50:43Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.