MAD: Modality Agnostic Distance Measure for Image Registration
- URL: http://arxiv.org/abs/2309.02875v1
- Date: Wed, 6 Sep 2023 09:59:58 GMT
- Title: MAD: Modality Agnostic Distance Measure for Image Registration
- Authors: Vasiliki Sideri-Lampretsa, Veronika A. Zimmer, Huaqi Qiu, Georgios
Kaissis, and Daniel Rueckert
- Abstract summary: Multi-modal image registration is a crucial pre-processing step in many medical applications.
We present Modality Agnostic Distance (MAD), a measure that uses random convolutions to learn the inherent geometry of the images.
We demonstrate that not only can MAD affinely register multi-modal images successfully, but it also has a larger capture range than traditional measures.
- Score: 14.558286801723293
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Multi-modal image registration is a crucial pre-processing step in many
medical applications. However, it is a challenging task due to the complex
intensity relationships between different imaging modalities, which can result
in large discrepancies in image appearance. The success of multi-modal image
registration, whether it is conventional or learning based, is predicated upon
the choice of an appropriate distance (or similarity) measure. Particularly,
deep learning registration algorithms lose accuracy or even fail completely
when attempting to register data from an "unseen" modality. In this work, we
present Modality Agnostic Distance (MAD), a deep image distance measure that
utilises random convolutions to learn the inherent geometry of the images while
being robust to large appearance changes. Random convolutions are
geometry-preserving modules which we use to simulate an infinite number of
synthetic modalities alleviating the need for aligned paired data during
training. We can therefore train MAD on a mono-modal dataset and successfully
apply it to a multi-modal dataset. We demonstrate that not only can MAD
affinely register multi-modal images successfully, but it also has a larger
capture range than traditional measures such as Mutual Information and
Normalised Gradient Fields.
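The core idea in the abstract, simulating synthetic modalities by passing a mono-modal image through random convolutions, can be sketched in a few lines. This is a minimal illustrative sketch, not the authors' implementation: the kernel size, kernel distribution, and [0, 1] rescaling are assumptions made here for the example.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

rng = np.random.default_rng(0)

def random_convolution(image: np.ndarray, kernel_size: int = 3) -> np.ndarray:
    """Convolve with a randomly sampled kernel (hypothetical setup).

    The convolution is geometry-preserving: object boundaries stay where
    they are, while intensities change, mimicking a different modality.
    """
    k = kernel_size
    kernel = rng.standard_normal((k, k))          # assumed kernel distribution
    padded = np.pad(image, k // 2, mode="reflect")
    windows = sliding_window_view(padded, (k, k))  # (H, W, k, k) patches
    out = np.einsum("ijkl,kl->ij", windows, kernel)
    # rescale to [0, 1] so each synthetic "modality" has a comparable range
    return (out - out.min()) / (np.ptp(out) + 1e-8)

# toy mono-modal image: a bright square on a dark background
img = np.zeros((32, 32))
img[8:24, 8:24] = 1.0
synthetic = random_convolution(img)  # same geometry, new appearance
```

Sampling a fresh kernel at every call yields an effectively unlimited stream of synthetic modalities, which is what removes the need for aligned paired multi-modal data during training.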
Related papers
- MM1: Methods, Analysis & Insights from Multimodal LLM Pre-training [103.72844619581811]
We build performant Multimodal Large Language Models (MLLMs)
In particular, we study the importance of various architecture components and data choices.
We demonstrate the importance, for large-scale multimodal pre-training, of using a careful mix of image-caption, interleaved image-text, and text-only data.
arXiv Detail & Related papers (2024-03-14T17:51:32Z) - Learning Multimodal Data Augmentation in Feature Space [65.54623807628536]
LeMDA is an easy-to-use method that automatically learns to jointly augment multimodal data in feature space.
We show that LeMDA can profoundly improve the performance of multimodal deep learning architectures.
arXiv Detail & Related papers (2022-12-29T20:39:36Z) - Multi-scale Transformer Network with Edge-aware Pre-training for
Cross-Modality MR Image Synthesis [52.41439725865149]
Cross-modality magnetic resonance (MR) image synthesis can be used to generate missing modalities from given ones.
Existing (supervised learning) methods often require a large number of paired multi-modal data to train an effective synthesis model.
We propose a Multi-scale Transformer Network (MT-Net) with edge-aware pre-training for cross-modality MR image synthesis.
arXiv Detail & Related papers (2022-12-02T11:40:40Z) - Unsupervised Multi-Modal Medical Image Registration via
Discriminator-Free Image-to-Image Translation [4.43142018105102]
We propose a novel translation-based unsupervised deformable image registration approach to convert the multi-modal registration problem to a mono-modal one.
Our approach incorporates a discriminator-free translation network to facilitate the training of the registration network and a patchwise contrastive loss to encourage the translation network to preserve object shapes.
arXiv Detail & Related papers (2022-04-28T17:18:21Z) - Multi-modal unsupervised brain image registration using edge maps [7.49320945341034]
We propose a simple yet effective unsupervised deep learning-based multi-modal image registration approach.
The intuition behind this is that image locations with a strong gradient are assumed to denote a transition of tissues.
We evaluate our approach in the context of registering multi-modal (T1w to T2w) magnetic resonance (MR) brain images of different subjects using three different loss functions.
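The gradient-based intuition above can be sketched concretely: an edge map built from the gradient magnitude highlights tissue transitions and is largely invariant to the intensity mapping of the modality. This is an illustrative sketch under assumptions made here (finite-difference gradients, max-normalisation), not that paper's exact method.

```python
import numpy as np

def edge_map(image: np.ndarray) -> np.ndarray:
    """Finite-difference gradient magnitude, normalised to [0, 1]."""
    gy, gx = np.gradient(image.astype(float))
    mag = np.sqrt(gx ** 2 + gy ** 2)
    return mag / (mag.max() + 1e-8)

# two toy "modalities" of the same anatomy: contrast-inverted images
t1 = np.zeros((32, 32))
t1[10:22, 10:22] = 1.0
t2 = 1.0 - t1  # inverted contrast, identical geometry

# the edge maps agree even though the intensities are opposite,
# so a mono-modal measure applied to them can drive registration
e1, e2 = edge_map(t1), edge_map(t2)
```

Because gradient magnitude discards the sign of the intensity change, contrast-inverted modalities such as the toy `t1`/`t2` pair produce the same edge map.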
arXiv Detail & Related papers (2022-02-09T15:50:14Z) - StEP: Style-based Encoder Pre-training for Multi-modal Image Synthesis [68.3787368024951]
We propose a novel approach for multi-modal Image-to-image (I2I) translation.
We learn a latent embedding, jointly with the generator, that models the variability of the output domain.
Specifically, we pre-train a generic style encoder using a novel proxy task to learn an embedding of images, from arbitrary domains, into a low-dimensional style latent space.
arXiv Detail & Related papers (2021-04-14T19:58:24Z) - Deep Group-wise Variational Diffeomorphic Image Registration [3.0022455491411653]
We propose to extend current learning-based image registration to allow simultaneous registration of multiple images.
We present a general mathematical framework that enables both registration of multiple images to their viscous geodesic average and registration in which any of the available images can be used as a fixed image.
arXiv Detail & Related papers (2020-10-01T07:37:28Z) - Adaptive Context-Aware Multi-Modal Network for Depth Completion [107.15344488719322]
We propose to adopt the graph propagation to capture the observed spatial contexts.
We then apply the attention mechanism on the propagation, which encourages the network to model the contextual information adaptively.
Finally, we introduce the symmetric gated fusion strategy to exploit the extracted multi-modal features effectively.
Our model, named Adaptive Context-Aware Multi-Modal Network (ACMNet), achieves the state-of-the-art performance on two benchmarks.
arXiv Detail & Related papers (2020-08-25T06:00:06Z) - CoMIR: Contrastive Multimodal Image Representation for Registration [4.543268895439618]
We propose contrastive coding to learn shared, dense image representations, referred to as CoMIRs (Contrastive Multimodal Image Representations).
CoMIRs enable the registration of multimodal images where existing registration methods often fail due to a lack of sufficiently similar image structures.
arXiv Detail & Related papers (2020-06-11T10:51:33Z) - Learning Deformable Image Registration from Optimization: Perspective,
Modules, Bilevel Training and Beyond [62.730497582218284]
We develop a new deep learning based framework to optimize a diffeomorphic model via multi-scale propagation.
We conduct two groups of image registration experiments on 3D volume datasets including image-to-atlas registration on brain MRI data and image-to-image registration on liver CT data.
arXiv Detail & Related papers (2020-04-30T03:23:45Z) - Unsupervised Multi-Modal Image Registration via Geometry Preserving
Image-to-Image Translation [43.060971647266236]
We train an image-to-image translation network on the two input modalities.
This learned translation allows training the registration network using simple and reliable mono-modality metrics.
Compared to state-of-the-art multi-modal methods our presented method is unsupervised, requiring no pairs of aligned modalities for training, and can be adapted to any pair of modalities.
arXiv Detail & Related papers (2020-03-18T07:21:09Z)
This list is automatically generated from the titles and abstracts of the papers in this site.