Mono-Modalizing Extremely Heterogeneous Multi-Modal Medical Image Registration
- URL: http://arxiv.org/abs/2506.15596v2
- Date: Mon, 30 Jun 2025 07:41:55 GMT
- Title: Mono-Modalizing Extremely Heterogeneous Multi-Modal Medical Image Registration
- Authors: Kyobin Choo, Hyunkyung Han, Jinyeong Kim, Chanyong Yoon, Seong Jae Hwang,
- Abstract summary: M2M-Reg is a novel framework that trains multi-modal DIR models using only mono-modal similarity.<n>M2M-Reg achieves up to 2x higher than prior methods for PET-MRI and FA-MRI registration.
- Score: 3.1923251959845214
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In clinical practice, imaging modalities with functional characteristics, such as positron emission tomography (PET) and fractional anisotropy (FA), are often aligned with a structural reference (e.g., MRI, CT) for accurate interpretation or group analysis, necessitating multi-modal deformable image registration (DIR). However, due to the extreme heterogeneity of these modalities compared to standard structural scans, conventional unsupervised DIR methods struggle to learn reliable spatial mappings and often distort images. We find that the similarity metrics guiding these models fail to capture alignment between highly disparate modalities. To address this, we propose M2M-Reg (Multi-to-Mono Registration), a novel framework that trains multi-modal DIR models using only mono-modal similarity while preserving the established architectural paradigm for seamless integration into existing models. We also introduce GradCyCon, a regularizer that leverages M2M-Reg's cyclic training scheme to promote diffeomorphism. Furthermore, our framework naturally extends to a semi-supervised setting, integrating pre-aligned and unaligned pairs only, without requiring ground-truth transformations or segmentation masks. Experiments on the Alzheimer's Disease Neuroimaging Initiative (ADNI) dataset demonstrate that M2M-Reg achieves up to 2x higher DSC than prior methods for PET-MRI and FA-MRI registration, highlighting its effectiveness in handling highly heterogeneous multi-modal DIR. Our code is available at https://github.com/MICV-yonsei/M2M-Reg.
Related papers
- Multimodal Visual Surrogate Compression for Alzheimer's Disease Classification [69.87877580725768]
Multimodal Visual Surrogate Compression (MVSC) learns to compress and adapt large 3D sMRI volumes into compact 2D features.<n>MVSC has two key components: a Volume Context that captures global cross-slice context under textual guidance, and an Adaptive Slice Fusion module that aggregates slice-level information in a text-enhanced, patch-wise manner.
arXiv Detail & Related papers (2026-01-29T13:05:46Z) - Bridging Modalities: Joint Synthesis and Registration Framework for Aligning Diffusion MRI with T1-Weighted Images [8.802336586929613]
This paper proposes an unsupervised registration framework based on a generative registration network.<n>It transforms the original multimodal registration problem between b0 and T1w images into a unimodal registration task between a generated image and the real T1w image.<n> Experiments conducted on two independent datasets demonstrate that the proposed method outperforms several state-of-the-art approaches in multimodal registration tasks.
arXiv Detail & Related papers (2026-01-16T13:22:21Z) - Unified Multi-Site Multi-Sequence Brain MRI Harmonization Enriched by Biomedical Semantic Style [13.422551989566983]
We introduce MMH, a unified framework for multi-site multi-sequence brain MRI harmonization.<n>MMH operates in two stages: (1) a diffusion-based global harmonizer that maps MR images to a sequence-specific unified domain using style-agnostic gradient conditioning, and (2) a target-specific fine-tuner that adapts globally aligned images to desired target domains.<n> Evaluations on 4,163 T1- and T2-weighted MRIs demonstrate MMH's superior harmonization over state-of-the-art methods in image feature clustering, voxel-level comparison, tissue segmentation, and downstream age and site
arXiv Detail & Related papers (2026-01-13T03:47:23Z) - SLaM-DiMM: Shared Latent Modeling for Diffusion Based Missing Modality Synthesis in MRI [0.0]
Brain MRI scans are often found in four modalities, consisting of T1-weighted with and without contrast enhancement (T1ce and T1w), T2-weighted imaging (T2w) and Flair.<n>We propose SLaM-DiMM, a novel missing modality generation framework that harnesses the power of diffusion models to synthesize any of the four target MRI modalities from other available modalities.
arXiv Detail & Related papers (2025-09-19T14:27:35Z) - MDPG: Multi-domain Diffusion Prior Guidance for MRI Reconstruction [0.4893345190925178]
We propose Multi-domain Diffusion Prior Guidance (MDPG) to enhance data consistency in MRI reconstruction tasks.<n> Specifically, we first construct a Visual-Mamba-based backbone, which enables efficient encoding and reconstruction of under-sampled images.<n>A novel Latent Guided Attention (LGA) is proposed for efficient fusion in multi-level latent domains.
arXiv Detail & Related papers (2025-06-30T10:25:08Z) - PMT: Progressive Mean Teacher via Exploring Temporal Consistency for Semi-Supervised Medical Image Segmentation [51.509573838103854]
We propose a semi-supervised learning framework, termed Progressive Mean Teachers (PMT), for medical image segmentation.
Our PMT generates high-fidelity pseudo labels by learning robust and diverse features in the training process.
Experimental results on two datasets with different modalities, i.e., CT and MRI, demonstrate that our method outperforms the state-of-the-art medical image segmentation approaches.
arXiv Detail & Related papers (2024-09-08T15:02:25Z) - Large Language Models for Multimodal Deformable Image Registration [50.91473745610945]
We propose a novel coarse-to-fine MDIR framework,LLM-Morph, for aligning the deep features from different modal medical images.
Specifically, we first utilize a CNN encoder to extract deep visual features from cross-modal image pairs, then we use the first adapter to adjust these tokens, and use LoRA in pre-trained LLMs to fine-tune their weights.
Third, for the alignment of tokens, we utilize other four adapters to transform the LLM-encoded tokens into multi-scale visual features, generating multi-scale deformation fields and facilitating the coarse-to-fine MDIR task
arXiv Detail & Related papers (2024-08-20T09:58:30Z) - Enhanced Masked Image Modeling to Avoid Model Collapse on Multi-modal MRI Datasets [6.3467517115551875]
Masked image modeling (MIM) has shown promise in utilizing unlabeled data.<n>We analyze and address model collapse in two types: complete collapse and dimensional collapse.<n>We construct the enhanced MIM (E-MIM) with HMP and PBT module to avoid model collapse multi-modal MRI.
arXiv Detail & Related papers (2024-07-15T01:11:30Z) - NeuroPictor: Refining fMRI-to-Image Reconstruction via Multi-individual Pretraining and Multi-level Modulation [55.51412454263856]
This paper proposes to directly modulate the generation process of diffusion models using fMRI signals.
By training with about 67,000 fMRI-image pairs from various individuals, our model enjoys superior fMRI-to-image decoding capacity.
arXiv Detail & Related papers (2024-03-27T02:42:52Z) - DuDoUniNeXt: Dual-domain unified hybrid model for single and
multi-contrast undersampled MRI reconstruction [24.937435059755288]
We propose DuDoUniNeXt, a unified dual-domain MRI reconstruction network that can accommodate to scenarios involving absent, low-quality, and high-quality reference images.
Experimental results demonstrate that the proposed model surpasses state-of-the-art SC and MC models significantly.
arXiv Detail & Related papers (2024-03-08T12:26:48Z) - Explainable unsupervised multi-modal image registration using deep
networks [2.197364252030876]
MRI image registration aims to geometrically 'pair' diagnoses from different modalities, time points and slices.
In this work, we show that our DL model becomes fully explainable, setting the framework to generalise our approach on further medical imaging data.
arXiv Detail & Related papers (2023-08-03T19:13:48Z) - Model-Guided Multi-Contrast Deep Unfolding Network for MRI
Super-resolution Reconstruction [68.80715727288514]
We show how to unfold an iterative MGDUN algorithm into a novel model-guided deep unfolding network by taking the MRI observation matrix.
In this paper, we propose a novel Model-Guided interpretable Deep Unfolding Network (MGDUN) for medical image SR reconstruction.
arXiv Detail & Related papers (2022-09-15T03:58:30Z) - Transformer-empowered Multi-scale Contextual Matching and Aggregation
for Multi-contrast MRI Super-resolution [55.52779466954026]
Multi-contrast super-resolution (SR) reconstruction is promising to yield SR images with higher quality.
Existing methods lack effective mechanisms to match and fuse these features for better reconstruction.
We propose a novel network to address these problems by developing a set of innovative Transformer-empowered multi-scale contextual matching and aggregation techniques.
arXiv Detail & Related papers (2022-03-26T01:42:59Z) - Unsupervised Image Registration Towards Enhancing Performance and
Explainability in Cardiac And Brain Image Analysis [3.5718941645696485]
Inter- and intra-modality affine and non-rigid image registration is an essential medical image analysis process in clinical imaging.
We present an un-supervised deep learning registration methodology which can accurately model affine and non-rigid trans-formations.
Our methodology performs bi-directional cross-modality image synthesis to learn modality-invariant latent rep-resentations.
arXiv Detail & Related papers (2022-03-07T12:54:33Z) - Modality Completion via Gaussian Process Prior Variational Autoencoders
for Multi-Modal Glioma Segmentation [75.58395328700821]
We propose a novel model, Multi-modal Gaussian Process Prior Variational Autoencoder (MGP-VAE), to impute one or more missing sub-modalities for a patient scan.
MGP-VAE can leverage the Gaussian Process (GP) prior on the Variational Autoencoder (VAE) to utilize the subjects/patients and sub-modalities correlations.
We show the applicability of MGP-VAE on brain tumor segmentation where either, two, or three of four sub-modalities may be missing.
arXiv Detail & Related papers (2021-07-07T19:06:34Z) - Robust Image Reconstruction with Misaligned Structural Information [0.27074235008521236]
We propose a variational framework which jointly performs reconstruction and registration.
Our approach is the first to achieve this for different modalities and outranks established approaches in terms of accuracy of both reconstruction and registration.
arXiv Detail & Related papers (2020-04-01T17:21:25Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.