Register Anything: Estimating "Corresponding Prompts" for Segment Anything Model
- URL: http://arxiv.org/abs/2508.01697v1
- Date: Sun, 03 Aug 2025 10:00:44 GMT
- Title: Register Anything: Estimating "Corresponding Prompts" for Segment Anything Model
- Authors: Shiqi Huang, Tingfa Xu, Wen Yan, Dean Barratt, Yipeng Hu,
- Abstract summary: We present a novel registration algorithm that identifies multiple paired corresponding ROIs by marginalizing the inverted Prompt X across both prompt and spatial dimensions.<n>Based on metrics including Dice and target registration errors on anatomical structures, the proposed registration outperforms both intensity-based iterative algorithms and learning-based DDF-predicting networks.
- Score: 10.548119506549998
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Establishing pixel/voxel-level or region-level correspondences is the core challenge in image registration. The latter, also known as region-based correspondence representation, leverages paired regions of interest (ROIs) to enable regional matching while preserving fine-grained capability at pixel/voxel level. Traditionally, this representation is implemented via two steps: segmenting ROIs in each image then matching them between the two images. In this paper, we simplify this into one step by directly "searching for corresponding prompts", using extensively pre-trained segmentation models (e.g., SAM) for a training-free registration approach, PromptReg. Firstly, we introduce the "corresponding prompt problem", which aims to identify a corresponding Prompt Y in Image Y for any given visual Prompt X in Image X, such that the two respectively prompt-conditioned segmentations are a pair of corresponding ROIs from the two images. Secondly, we present an "inverse prompt" solution that generates primary and optionally auxiliary prompts, inverting Prompt X into the prompt space of Image Y. Thirdly, we propose a novel registration algorithm that identifies multiple paired corresponding ROIs by marginalizing the inverted Prompt X across both prompt and spatial dimensions. Comprehensive experiments are conducted on five applications of registering 3D prostate MR, 3D abdomen MR, 3D lung CT, 2D histopathology and, as a non-medical example, 2D aerial images. Based on metrics including Dice and target registration errors on anatomical structures, the proposed registration outperforms both intensity-based iterative algorithms and learning-based DDF-predicting networks, even yielding competitive performance with weakly-supervised approaches that require fully-segmented training data.
Related papers
- Robust and Accurate Multi-view 2D/3D Image Registration with Differentiable X-ray Rendering and Dual Cross-view Constraints [45.57808049168089]
We propose a novel multi-view 2D/3D rigid registration approach comprising two stages.<n>In the first stage, a combined loss function is designed, incorporating both the differences between predicted and ground-truth poses.<n>In the second stage, test-time optimization is performed to refine the estimated poses from the coarse stage.
arXiv Detail & Related papers (2025-06-27T12:57:58Z) - Tell2Reg: Establishing spatial correspondence between images by the same language prompts [7.064676360230362]
We show that a corresponding region pair can be predicted by the same language prompt on two different images.<n>This enables a fully automated and training-free registration algorithm.<n>Tell2Reg is training-free, eliminating the need for costly and time-consuming data curation and labelling.
arXiv Detail & Related papers (2025-02-05T12:25:02Z) - SketchYourSeg: Mask-Free Subjective Image Segmentation via Freehand Sketches [116.1810651297801]
SketchYourSeg establishes freehand sketches as a powerful query modality for subjective image segmentation.<n>Our evaluations demonstrate superior performance over existing approaches across diverse benchmarks.
arXiv Detail & Related papers (2025-01-27T13:07:51Z) - SAMReg: SAM-enabled Image Registration with ROI-based Correspondence [12.163299991979574]
This paper describes a new spatial correspondence representation based on paired regions-of-interest (ROIs) for medical image registration.
We develop a new registration algorithm SAMReg, which does not require any training (or training data), gradient-based fine-tuning or prompt engineering.
The proposed methods outperform both intensity-based iterative algorithms and DDF-predicting learning-based networks across tested metrics.
arXiv Detail & Related papers (2024-10-17T23:23:48Z) - One registration is worth two segmentations [12.163299991979574]
The goal of image registration is to establish spatial correspondence between two or more images.
We propose an alternative but more intuitive correspondence representation: a set of corresponding regions-of-interest (ROI) pairs.
We experimentally show that the proposed SAMReg is capable of segmenting and matching multiple ROI pairs.
arXiv Detail & Related papers (2024-05-17T16:14:32Z) - Unify, Align and Refine: Multi-Level Semantic Alignment for Radiology
Report Generation [48.723504098917324]
We propose an Unify, Align and then Refine (UAR) approach to learn multi-level cross-modal alignments.
We introduce three novel modules: Latent Space Unifier, Cross-modal Representation Aligner and Text-to-Image Refiner.
Experiments and analyses on IU-Xray and MIMIC-CXR benchmark datasets demonstrate the superiority of our UAR against varied state-of-the-art methods.
arXiv Detail & Related papers (2023-03-28T12:42:12Z) - Image Understands Point Cloud: Weakly Supervised 3D Semantic
Segmentation via Association Learning [59.64695628433855]
We propose a novel cross-modality weakly supervised method for 3D segmentation, incorporating complementary information from unlabeled images.
Basically, we design a dual-branch network equipped with an active labeling strategy, to maximize the power of tiny parts of labels.
Our method even outperforms the state-of-the-art fully supervised competitors with less than 1% actively selected annotations.
arXiv Detail & Related papers (2022-09-16T07:59:04Z) - CorrI2P: Deep Image-to-Point Cloud Registration via Dense Correspondence [51.91791056908387]
We propose the first feature-based dense correspondence framework for addressing the image-to-point cloud registration problem, dubbed CorrI2P.
Specifically, given a pair of a 2D image before a 3D point cloud, we first transform them into high-dimensional feature space feed the features into a symmetric overlapping region to determine the region where the image point cloud overlap.
arXiv Detail & Related papers (2022-07-12T11:49:31Z) - Two-Stream Graph Convolutional Network for Intra-oral Scanner Image
Segmentation [133.02190910009384]
We propose a two-stream graph convolutional network (i.e., TSGCN) to handle inter-view confusion between different raw attributes.
Our TSGCN significantly outperforms state-of-the-art methods in 3D tooth (surface) segmentation.
arXiv Detail & Related papers (2022-04-19T10:41:09Z) - JSSR: A Joint Synthesis, Segmentation, and Registration System for 3D
Multi-Modal Image Alignment of Large-scale Pathological CT Scans [27.180136688977512]
We propose a novel multi-task learning system, JSSR, based on an end-to-end 3D convolutional neural network.
The system is optimized to satisfy the implicit constraints between different tasks in an unsupervised manner.
It consistently outperforms conventional state-of-the-art multi-modal registration methods.
arXiv Detail & Related papers (2020-05-25T16:30:02Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.