Related papers: 3D Wavelet Latent Diffusion Model for Whole-Body MR-to-CT Modality Translation

URL: http://arxiv.org/abs/2507.11557v1
Date: Mon, 14 Jul 2025 06:17:05 GMT
Title: 3D Wavelet Latent Diffusion Model for Whole-Body MR-to-CT Modality Translation
Authors: Jiaxu Zheng, Meiman He, Xuhui Tang, Xiong Wang, Tuoyu Cao, Tianyi Zeng, Lichi Zhang, Chenyu You
Abstract summary: Existing MR-to-CT methods for whole-body imaging often suffer from poor spatial alignment between the generated CT and input MR images. We present a novel 3D Wavelet Latent Diffusion Model (3D-WLDM) that addresses these limitations. By incorporating a Wavelet Residual Module into the encoder-decoder architecture, we enhance the capture and reconstruction of fine-scale features across image and latent spaces.
Score: 13.252652406393205
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Magnetic Resonance (MR) imaging plays an essential role in contemporary clinical diagnostics. It is increasingly integrated into advanced therapeutic workflows, such as hybrid Positron Emission Tomography/Magnetic Resonance (PET/MR) imaging and MR-only radiation therapy. These integrated approaches are critically dependent on accurate estimation of radiation attenuation, which is typically facilitated by synthesizing Computed Tomography (CT) images from MR scans to generate attenuation maps. However, existing MR-to-CT synthesis methods for whole-body imaging often suffer from poor spatial alignment between the generated CT and input MR images, and insufficient image quality for reliable use in downstream clinical tasks. In this paper, we present a novel 3D Wavelet Latent Diffusion Model (3D-WLDM) that addresses these limitations by performing modality translation in a learned latent space. By incorporating a Wavelet Residual Module into the encoder-decoder architecture, we enhance the capture and reconstruction of fine-scale features across image and latent spaces. To preserve anatomical integrity during the diffusion process, we disentangle structural and modality-specific characteristics and anchor the structural component to prevent warping. We also introduce a Dual Skip Connection Attention mechanism within the diffusion model, enabling the generation of high-resolution CT images with improved representation of bony structures and soft-tissue contrast.

Related papers

Latent Space Consistency for Sparse-View CT Reconstruction [10.057432803124167]
The Latent Diffusion Model (LDM) has demonstrated promising potential in the domain of 3D CT reconstruction. Cross-modal feature contrastive learning is used to efficiently extract latent 3D information from 2D X-ray images. Results indicate that CLS-DM outperforms classical and state-of-the-art generative models in terms of standard voxel-level metrics.
arXiv Detail & Related papers (2025-07-15T10:02:19Z)
JSover: Joint Spectrum Estimation and Multi-Material Decomposition from Single-Energy CT Projections [45.14515691206885]
Multi-material decomposition (MMD) enables quantitative reconstruction of tissue compositions in the human body. Traditional MMD typically requires spectral CT scanners and pre-measured X-ray energy spectra, significantly limiting clinical applicability. This paper proposes JSover, a fundamentally reformulated one-step SEMMD framework that jointly reconstructs multi-material compositions and estimates the energy spectrum directly from SECT projections.
arXiv Detail & Related papers (2025-05-12T23:32:21Z)
ZECO: ZeroFusion Guided 3D MRI Conditional Generation [11.645873358288648]
ZECO is a ZeroFusion guided 3D MRI conditional generation framework. It extracts, compresses, and generates high-fidelity MRI images with corresponding 3D segmentation masks. ZECO outperforms state-of-the-art models in both quantitative and qualitative evaluations on Brain MRI datasets.
arXiv Detail & Related papers (2025-03-24T00:04:52Z)
3D MedDiffusion: A 3D Medical Diffusion Model for Controllable and High-quality Medical Image Generation [47.701856217173244]
We present the 3D Medical Diffusion (3D MedDiffusion) model for controllable, high-quality 3D medical image generation. 3D MedDiffusion incorporates a novel, highly efficient Patch-Volume Autoencoder that compresses medical images into latent space through patch-wise encoding. We show that 3D MedDiffusion surpasses state-of-the-art methods in generative quality and exhibits strong generalizability across tasks such as sparse-view CT reconstruction, fast MRI reconstruction, and data augmentation.
arXiv Detail & Related papers (2024-12-17T16:25:40Z)
Two-Stage Approach for Brain MR Image Synthesis: 2D Image Synthesis and 3D Refinement [1.5683566370372715]
It is crucial to synthesize the missing MR images that reflect the unique characteristics of the absent modality with precise tumor representation. We propose a two-stage approach that first synthesizes MR images from 2D slices using a novel intensity encoding method and then refines the synthesized MRI.
arXiv Detail & Related papers (2024-10-14T08:21:08Z)
FCDM: A Physics-Guided Bidirectional Frequency Aware Convolution and Diffusion-Based Model for Sinogram Inpainting [14.043383277622874]
We propose FCDM, a physics-guided, frequency-aware sinogram inpainting framework. It integrates bidirectional frequency-domain convolutions to disentangle overlapping features while enforcing total absorption and frequency-domain consistency via a physics-informed loss. Experiments on synthetic and real-world datasets show that FCDM outperforms existing methods, achieving SSIM over 0.95 and PSNR above 30 dB, with up to 33% and 29% improvements over baselines.
arXiv Detail & Related papers (2024-08-26T12:31:38Z)
Unpaired Volumetric Harmonization of Brain MRI with Conditional Latent Diffusion [13.563413478006954]
We propose a novel 3D MRI Harmonization framework through Conditional Latent Diffusion (HCLD). It comprises a generalizable 3D autoencoder that encodes and decodes MRIs through a 4D latent space. HCLD learns the latent distribution and generates harmonized MRIs with anatomical information from source MRIs while conditioned on target image style.
arXiv Detail & Related papers (2024-08-18T00:13:48Z)
SDR-Former: A Siamese Dual-Resolution Transformer for Liver Lesion Classification Using 3D Multi-Phase Imaging [59.78761085714715]
This study proposes a novel Siamese Dual-Resolution Transformer (SDR-Former) framework for liver lesion classification. The proposed framework has been validated through comprehensive experiments on two clinical datasets. To support the scientific community, we are releasing our extensive multi-phase MR dataset for liver lesion analysis to the public.
arXiv Detail & Related papers (2024-02-27T06:32:56Z)
Volumetric Reconstruction Resolves Off-Resonance Artifacts in Static and Dynamic PROPELLER MRI [76.60362295758596]
Off-resonance artifacts in magnetic resonance imaging (MRI) are visual distortions that occur when the actual resonant frequencies of spins within the imaging volume differ from the expected frequencies used to encode spatial information. We propose to resolve these artifacts by lifting the 2D MRI reconstruction problem to 3D, introducing an additional "spectral" dimension to model this off-resonance.
arXiv Detail & Related papers (2023-11-22T05:44:51Z)
X-Ray2EM: Uncertainty-Aware Cross-Modality Image Reconstruction from X-Ray to Electron Microscopy in Connectomics [55.6985304397137]
We propose an uncertainty-aware 3D reconstruction model that translates X-ray images to EM-like images with enhanced membrane segmentation quality. This shows its potential for developing simpler, faster, and more accurate X-ray based connectomics pipelines.
arXiv Detail & Related papers (2023-03-02T00:52:41Z)
Negligible effect of brain MRI data preprocessing for tumor segmentation [36.89606202543839]
We conduct experiments on three publicly available datasets and evaluate the effect of different preprocessing steps in deep neural networks. Our results demonstrate that most popular standardization steps add no value to the network performance. We suggest that image intensity normalization approaches do not contribute to model accuracy because of the reduction of signal variance with image standardization.
arXiv Detail & Related papers (2022-04-11T17:29:36Z)
Multi-modal Aggregation Network for Fast MR Imaging [85.25000133194762]
We propose a novel Multi-modal Aggregation Network, named MANet, which is capable of discovering complementary representations from a fully sampled auxiliary modality. In our MANet, the representations from the fully sampled auxiliary and undersampled target modalities are learned independently through a specific network. Our MANet follows a hybrid domain learning framework, which allows it to simultaneously recover the frequency signal in the $k$-space domain.
arXiv Detail & Related papers (2021-10-15T13:16:59Z)
Frequency-Supervised MR-to-CT Image Synthesis [23.47506325756089]
This paper strives to generate a synthetic computed tomography (CT) image from a magnetic resonance (MR) image. We find that all existing approaches share a common limitation: reconstruction breaks down in and around the high-frequency parts of CT images. We introduce frequency-supervised deep networks to explicitly enhance high-frequency MR-to-CT image reconstruction.
arXiv Detail & Related papers (2021-07-19T15:18:36Z)
Tattoo tomography: Freehand 3D photoacoustic image reconstruction with an optical pattern [49.240017254888336]
Photoacoustic tomography (PAT) is a novel imaging technique that can resolve both morphological and functional tissue properties. A current drawback is the limited field-of-view provided by the conventionally applied 2D probes. We present a novel approach to 3D reconstruction of PAT data that does not require an external tracking system.
arXiv Detail & Related papers (2020-11-10T09:27:56Z)
Multifold Acceleration of Diffusion MRI via Slice-Interleaved Diffusion Encoding (SIDE) [50.65891535040752]
We propose a diffusion encoding scheme, called Slice-Interleaved Diffusion Encoding (SIDE), that interleaves each diffusion-weighted (DW) image volume with slices encoded with different diffusion gradients. We also present a method based on deep learning for effective reconstruction of DW images from the highly slice-undersampled data.
arXiv Detail & Related papers (2020-02-25T14:48:17Z)

This list is automatically generated from the titles and abstracts of the papers on this site.