Guiding Registration with Emergent Similarity from Pre-Trained Diffusion Models
- URL: http://arxiv.org/abs/2506.02419v1
- Date: Tue, 03 Jun 2025 04:07:04 GMT
- Title: Guiding Registration with Emergent Similarity from Pre-Trained Diffusion Models
- Authors: Nurislam Tursynbek, Hastings Greer, Basar Demir, Marc Niethammer
- Abstract summary: We find that off-the-shelf diffusion models, trained exclusively to generate natural RGB images, can identify semantically meaningful correspondences in medical images. We propose to leverage diffusion model features as a similarity measure to guide deformable image registration networks.
- Score: 16.2208278847136
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Diffusion models, while trained for image generation, have emerged as powerful foundational feature extractors for downstream tasks. We find that off-the-shelf diffusion models, trained exclusively to generate natural RGB images, can identify semantically meaningful correspondences in medical images. Building on this observation, we propose to leverage diffusion model features as a similarity measure to guide deformable image registration networks. We show that common intensity-based similarity losses often fail in challenging scenarios, such as when certain anatomies are visible in one image but absent in another, leading to anatomically inaccurate alignments. In contrast, our method identifies true semantic correspondences, aligning meaningful structures while disregarding those not present across images. We demonstrate superior performance of our approach on two tasks: multimodal 2D registration (DXA to X-Ray) and monomodal 3D registration (brain-extracted to non-brain-extracted MRI). Code: https://github.com/uncbiag/dgir
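As a rough illustration of the idea, a feature-space similarity loss can be sketched as follows. This is a minimal, hypothetical NumPy sketch: it assumes per-pixel feature maps have already been extracted from an intermediate layer of a pre-trained diffusion U-Net (that extraction step, and the registration network itself, are not shown), and it uses negative mean cosine similarity as the loss — one plausible choice, not necessarily the exact formulation in the paper.

```python
import numpy as np

def diffusion_feature_similarity(feat_fixed, feat_warped, eps=1e-8):
    """Negative mean per-pixel cosine similarity between two feature maps.

    feat_fixed, feat_warped : arrays of shape (C, H, W), e.g. features from
    an intermediate layer of a pre-trained diffusion U-Net (extraction not
    shown). Lower values mean better semantic agreement.
    """
    # Normalize the C-dimensional feature vector at each spatial location.
    n1 = feat_fixed / (np.linalg.norm(feat_fixed, axis=0, keepdims=True) + eps)
    n2 = feat_warped / (np.linalg.norm(feat_warped, axis=0, keepdims=True) + eps)
    # Cosine similarity per pixel, averaged over the image; negated as a loss.
    return -float((n1 * n2).sum(axis=0).mean())
```

In a registration network, this loss would replace an intensity-based term such as mean squared error, so that anatomies absent from one image contribute little to the gradient.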
Related papers
- Reference-Guided Diffusion Inpainting For Multimodal Counterfactual Generation [55.2480439325792]
Safety-critical applications, such as autonomous driving and medical image analysis, require extensive multimodal data for rigorous testing. This work introduces two novel methods for synthetic data generation in autonomous driving and medical image analysis, namely MObI and AnydoorMed, respectively.
arXiv Detail & Related papers (2025-07-30T19:43:47Z) - Latent Drifting in Diffusion Models for Counterfactual Medical Image Synthesis [55.959002385347645]
Latent Drifting enables diffusion models to be conditioned on medical images for the complex task of counterfactual image generation. We evaluate our method on three public longitudinal benchmark datasets of brain MRI and chest X-rays for counterfactual image generation.
arXiv Detail & Related papers (2024-12-30T01:59:34Z) - LDM-Morph: Latent diffusion model guided deformable image registration [2.8195553455247317]
We propose LDM-Morph, an unsupervised deformable registration algorithm for medical images.
LDM-Morph integrates features extracted from a latent diffusion model (LDM) to enrich the semantic information.
Extensive experiments on four public 2D cardiac image datasets show that the proposed LDM-Morph framework outperforms existing state-of-the-art CNN- and Transformer-based registration methods.
arXiv Detail & Related papers (2024-11-23T03:04:36Z) - Unsupervised Modality Adaptation with Text-to-Image Diffusion Models for Semantic Segmentation [54.96563068182733]
We propose Modality Adaptation with text-to-image Diffusion Models (MADM) for the semantic segmentation task.
MADM utilizes text-to-image diffusion models pre-trained on extensive image-text pairs to enhance the model's cross-modality capabilities.
We show that MADM achieves state-of-the-art adaptation performance across various modality tasks, including images to depth, infrared, and event modalities.
arXiv Detail & Related papers (2024-10-29T03:49:40Z) - MsMorph: An Unsupervised pyramid learning network for brain image registration [4.000367245594772]
MsMorph is an image registration framework aimed at mimicking the manual process of registering image pairs.
It decodes semantic information at different scales and continuously compensates for the predicted deformation field.
The proposed method simulates the manual approach to registration, focusing on different regions of the image pairs and their neighborhoods.
arXiv Detail & Related papers (2024-10-23T19:20:57Z) - FreeSeg-Diff: Training-Free Open-Vocabulary Segmentation with Diffusion Models [56.71672127740099]
We focus on the task of image segmentation, which is traditionally solved by training models on closed-vocabulary datasets.
We leverage different and relatively small-sized, open-source foundation models for zero-shot open-vocabulary segmentation.
Our approach (dubbed FreeSeg-Diff), which does not rely on any training, outperforms many training-based approaches on both Pascal VOC and COCO datasets.
arXiv Detail & Related papers (2024-03-29T10:38:25Z) - ContourDiff: Unpaired Image-to-Image Translation with Structural Consistency for Medical Imaging [14.487188068402178]
We introduce a novel metric to quantify the structural bias between domains which must be considered for proper translation.
We then propose ContourDiff, a novel image-to-image translation algorithm that leverages domain-invariant anatomical contour representations.
We evaluate our method on challenging lumbar spine and hip-and-thigh CT-to-MRI translation tasks.
arXiv Detail & Related papers (2024-03-16T03:33:52Z) - Introducing Shape Prior Module in Diffusion Model for Medical Image Segmentation [7.7545714516743045]
We propose an end-to-end framework called VerseDiff-UNet, which leverages the denoising diffusion probabilistic model (DDPM).
Our approach integrates the diffusion model into a standard U-shaped architecture.
We evaluate our method on a single dataset of spine images acquired through X-ray imaging.
arXiv Detail & Related papers (2023-09-12T03:05:00Z) - Modality Cycles with Masked Conditional Diffusion for Unsupervised Anomaly Segmentation in MRI [2.5847188023177403]
Unsupervised anomaly segmentation aims to detect patterns that are distinct from any patterns processed during training.
This paper introduces Masked Modality Cycles with Conditional Diffusion (MMCCD), a method that enables segmentation of anomalies across diverse patterns in multimodal MRI.
We show that our method compares favorably to previous unsupervised approaches based on image reconstruction and denoising with autoencoders and diffusion models.
arXiv Detail & Related papers (2023-08-30T17:16:02Z) - Diffusion Models for Counterfactual Generation and Anomaly Detection in Brain Images [39.94162291765236]
We present a weakly supervised method to generate a healthy version of a diseased image and then use it to obtain a pixel-wise anomaly map.
We employ a diffusion model trained on healthy samples and combine the Denoising Diffusion Probabilistic Model (DDPM) and the Denoising Diffusion Implicit Model (DDIM) at each step of the sampling process.
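For background on the DDIM sampling mentioned here, one deterministic DDIM update step (eta = 0) can be sketched as below. This is a generic, hypothetical NumPy sketch of the standard DDIM update, not the paper's exact DDPM/DDIM combination, which the summary does not detail.

```python
import numpy as np

def ddim_step(x_t, eps_pred, alpha_bar_t, alpha_bar_prev):
    """One deterministic DDIM update (eta = 0).

    x_t            : current noisy sample at step t
    eps_pred       : noise predicted by the denoising network at step t
    alpha_bar_t    : cumulative noise-schedule product at step t
    alpha_bar_prev : cumulative product at the previous (less noisy) step
    """
    # Estimate the clean sample implied by the predicted noise.
    x0_pred = (x_t - np.sqrt(1.0 - alpha_bar_t) * eps_pred) / np.sqrt(alpha_bar_t)
    # Deterministically re-noise the estimate to the previous noise level.
    return np.sqrt(alpha_bar_prev) * x0_pred + np.sqrt(1.0 - alpha_bar_prev) * eps_pred
```

If the network's noise prediction were exact, this step maps a sample at noise level alpha_bar_t onto the corresponding sample at alpha_bar_prev without adding fresh randomness, which is what makes DDIM sampling deterministic.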
arXiv Detail & Related papers (2023-08-03T21:56:50Z) - SinDiffusion: Learning a Diffusion Model from a Single Natural Image [159.4285444680301]
We present SinDiffusion, leveraging denoising diffusion models to capture the internal distribution of patches from a single natural image.
It is based on two core designs. First, SinDiffusion is trained with a single model at a single scale instead of multiple models with progressive growing of scales.
Second, we identify that a patch-level receptive field of the diffusion network is crucial and effective for capturing the image's patch statistics.
arXiv Detail & Related papers (2022-11-22T18:00:03Z) - DiffuseMorph: Unsupervised Deformable Image Registration Along Continuous Trajectory Using Diffusion Models [31.826844124173984]
We present DiffuseMorph, a novel diffusion-model-based approach to probabilistic image registration.
Our model learns the score function of the deformation between moving and fixed images.
Our method can provide flexible and accurate deformation with a capability of topology preservation.
arXiv Detail & Related papers (2021-12-09T08:41:23Z) - Cross-Modal Contrastive Learning for Abnormality Classification and Localization in Chest X-rays with Radiomics using a Feedback Loop [63.81818077092879]
We propose an end-to-end semi-supervised cross-modal contrastive learning framework for medical images.
We first apply an image encoder to classify the chest X-rays and to generate the image features.
The radiomic features are then passed through another dedicated encoder to act as the positive sample for the image features generated from the same chest X-ray.
arXiv Detail & Related papers (2021-04-11T09:16:29Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.