Modality Translation and Registration of MR and Ultrasound Images Using Diffusion Models
- URL: http://arxiv.org/abs/2506.01025v1
- Date: Sun, 01 Jun 2025 14:10:06 GMT
- Title: Modality Translation and Registration of MR and Ultrasound Images Using Diffusion Models
- Authors: Xudong Ma, Nantheera Anantrasirichai, Stefanos Bolomytis, Alin Achim
- Abstract summary: Multimodal MR-US registration is critical for prostate cancer diagnosis. Existing methods fail to align critical boundaries while being overly sensitive to irrelevant details. We propose an anatomically coherent modality translation network based on a hierarchical feature disentanglement design.
- Score: 7.512221808783586
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Multimodal MR-US registration is critical for prostate cancer diagnosis. However, this task remains challenging due to significant modality discrepancies. Existing methods often fail to align critical boundaries while being overly sensitive to irrelevant details. To address this, we propose an anatomically coherent modality translation (ACMT) network based on a hierarchical feature disentanglement design. We leverage shallow-layer features for texture consistency and deep-layer features for boundary preservation. Unlike conventional modality translation methods that convert one modality into another, our ACMT introduces the customized design of an intermediate pseudo modality. Both MR and US images are translated toward this intermediate domain, effectively addressing the bottlenecks faced by traditional translation methods in the downstream registration task. Experiments demonstrate that our method mitigates modality-specific discrepancies while preserving crucial anatomical boundaries for accurate registration. Quantitative evaluations show superior modality similarity compared to state-of-the-art modality translation methods. Furthermore, downstream registration experiments confirm that our translated images achieve the best alignment performance, highlighting the robustness of our framework for multi-modal prostate image registration.
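The abstract reports "modality similarity" between translated images without naming the metric. As a rough illustration only, the sketch below scores how alike two images are once both have been mapped to a shared intermediate domain, using global normalized cross-correlation (NCC). NCC is an assumption chosen for simplicity, not necessarily the paper's measure, and the names `normalized_cross_correlation`, `pseudo_from_mr` and `pseudo_from_us` are hypothetical stand-ins filled with synthetic data.

```python
import math
import random


def normalized_cross_correlation(a, b):
    """Global NCC of two equal-length pixel sequences; lies in [-1, 1]."""
    if len(a) != len(b) or not a:
        raise ValueError("inputs must be non-empty and the same length")
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    # Center both images so the score ignores global brightness offsets.
    da = [x - mean_a for x in a]
    db = [y - mean_b for y in b]
    denom = math.sqrt(sum(x * x for x in da) * sum(y * y for y in db))
    if denom == 0.0:
        raise ValueError("NCC is undefined for a constant image")
    return sum(x * y for x, y in zip(da, db)) / denom


# Synthetic stand-ins (assumption): flattened 64x64 "pseudo-modality" images,
# one imagined as translated from MR and one from US, differing by mild noise.
random.seed(0)
pseudo_from_mr = [random.random() for _ in range(64 * 64)]
pseudo_from_us = [p + random.gauss(0.0, 0.05) for p in pseudo_from_mr]

score = normalized_cross_correlation(pseudo_from_mr, pseudo_from_us)
print(f"NCC between the two stand-in pseudo-modality images: {score:.3f}")
```

A score near 1 means the two translated images share intensity structure, which is the property a downstream mono-modal registration loss would rely on. Real evaluations would use actual translated MR and US images rather than this synthetic pair.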
Related papers
- Human-Guided Shade Artifact Suppression in CBCT-to-MDCT Translation via Schrödinger Bridge with Conditional Diffusion [1.5869861104370917]
We present a novel framework for CBCT-to-MDCT translation, grounded in the Schrödinger Bridge (SB) formulation. Our approach explicitly enforces boundary consistency between CBCT inputs and pseudo targets, ensuring both anatomical fidelity and perceptual controllability.
arXiv Detail & Related papers (2025-07-15T06:44:53Z) - MR2US-Pro: Prostate MR to Ultrasound Image Translation and Registration Based on Diffusion Models [7.512221808783586]
We present a novel framework that addresses the challenges through a two-stage process: TRUS 3D reconstruction followed by cross-modal registration. We propose a totally probe-location-independent approach that leverages the natural correlation between sagittal and transverse TRUS views. For the registration stage, we introduce an unsupervised diffusion-based framework guided by modality translation.
arXiv Detail & Related papers (2025-05-31T14:55:03Z) - Multimodal LLM-Guided Semantic Correction in Text-to-Image Diffusion [52.315729095824906]
MLLM Semantic-Corrected Ping-Pong-Ahead Diffusion (PPAD) is a novel framework that introduces a Multimodal Large Language Model (MLLM) as a semantic observer during inference. It performs real-time analysis on intermediate generations, identifies latent semantic inconsistencies, and translates feedback into controllable signals that actively guide the remaining denoising steps. Extensive experiments demonstrate PPAD's significant improvements.
arXiv Detail & Related papers (2025-05-26T14:42:35Z) - Unsupervised Modality Adaptation with Text-to-Image Diffusion Models for Semantic Segmentation [54.96563068182733]
We propose Modality Adaptation with text-to-image Diffusion Models (MADM) for the semantic segmentation task.
MADM utilizes text-to-image diffusion models pre-trained on extensive image-text pairs to enhance the model's cross-modality capabilities.
We show that MADM achieves state-of-the-art adaptation performance across various modality tasks, including images to depth, infrared, and event modalities.
arXiv Detail & Related papers (2024-10-29T03:49:40Z) - CriDiff: Criss-cross Injection Diffusion Framework via Generative Pre-train for Prostate Segmentation [60.61972883059688]
CriDiff is a two-stage feature injecting framework with a Crisscross Injection Strategy (CIS) and a Generative Pre-train (GP) approach for prostate segmentation.
To effectively learn multi-level edge features and non-edge features, we propose two parallel conditioners in the CIS.
The GP approach eases the inconsistency between the image features and the diffusion model without adding additional parameters.
arXiv Detail & Related papers (2024-06-20T10:46:50Z) - Cascaded Multi-path Shortcut Diffusion Model for Medical Image Translation [26.67518950976257]
We propose a Cascade Multi-path Shortcut Diffusion Model (CMDM) for high-quality medical image translation and uncertainty estimation.
Our experimental results found that CMDM can produce high-quality translations comparable to state-of-the-art methods.
arXiv Detail & Related papers (2024-04-06T03:02:47Z) - ContourDiff: Unpaired Image-to-Image Translation with Structural Consistency for Medical Imaging [14.487188068402178]
We introduce a novel metric to quantify the structural bias between domains which must be considered for proper translation.
We then propose ContourDiff, a novel image-to-image translation algorithm that leverages domain-invariant anatomical contour representations.
We evaluate our method on challenging lumbar spine and hip-and-thigh CT-to-MRI translation tasks.
arXiv Detail & Related papers (2024-03-16T03:33:52Z) - Negligible effect of brain MRI data preprocessing for tumor segmentation [36.89606202543839]
We conduct experiments on three publicly available datasets and evaluate the effect of different preprocessing steps in deep neural networks.
Our results demonstrate that most popular standardization steps add no value to the network performance.
We suggest that image intensity normalization approaches do not contribute to model accuracy because of the reduction of signal variance with image standardization.
arXiv Detail & Related papers (2022-04-11T17:29:36Z) - Marginal Contrastive Correspondence for Guided Image Generation [58.0605433671196]
Exemplar-based image translation establishes dense correspondences between a conditional input and an exemplar from two different domains.
Existing work builds the cross-domain correspondences implicitly by minimizing feature-wise distances across the two domains.
We design a Marginal Contrastive Learning Network (MCL-Net) that explores contrastive learning to learn domain-invariant features for realistic exemplar-based image translation.
arXiv Detail & Related papers (2022-04-01T13:55:44Z) - A Deep Discontinuity-Preserving Image Registration Network [73.03885837923599]
Most deep learning-based registration methods assume that the desired deformation fields are globally smooth and continuous.
We propose a weakly-supervised Deep Discontinuity-preserving Image Registration network (DDIR) to obtain better registration performance and realistic deformation fields.
We demonstrate that our method achieves significant improvements in registration accuracy and predicts more realistic deformations, in registration experiments on cardiac magnetic resonance (MR) images.
arXiv Detail & Related papers (2021-07-09T13:35:59Z) - FetReg: Placental Vessel Segmentation and Registration in Fetoscopy Challenge Dataset [57.30136148318641]
Fetoscopy laser photocoagulation is a widely used procedure for the treatment of Twin-to-Twin Transfusion Syndrome (TTTS).
This may lead to increased procedural time and incomplete ablation, resulting in persistent TTTS.
Computer-assisted intervention may help overcome these challenges by expanding the fetoscopic field of view through video mosaicking and providing better visualization of the vessel network.
We present a large-scale multi-centre dataset for the development of generalized and robust semantic segmentation and video mosaicking algorithms for the fetal environment with a focus on creating drift-free mosaics from long duration fetoscopy videos.
arXiv Detail & Related papers (2021-06-10T17:14:27Z) - Unsupervised Multimodal Image Registration with Adaptative Gradient Guidance [23.461130560414805]
Unsupervised learning-based methods have demonstrated promising accuracy and efficiency in deformable image registration.
The estimated deformation fields of the existing methods fully rely on the to-be-registered image pair.
We propose a novel multimodal registration framework, which leverages the deformation fields estimated from both.
arXiv Detail & Related papers (2020-11-12T05:47:20Z)
This list is automatically generated from the titles and abstracts of the papers in this site.