Unsupervised Multi-Modal Image Registration via Geometry Preserving
Image-to-Image Translation
- URL: http://arxiv.org/abs/2003.08073v1
- Date: Wed, 18 Mar 2020 07:21:09 GMT
- Title: Unsupervised Multi-Modal Image Registration via Geometry Preserving
Image-to-Image Translation
- Authors: Moab Arar, Yiftach Ginger, Dov Danon, Ilya Leizerson, Amit Bermano,
Daniel Cohen-Or
- Abstract summary: We train an image-to-image translation network on the two input modalities.
This learned translation allows training the registration network using simple and reliable mono-modality metrics.
Compared to state-of-the-art multi-modal methods, our method is unsupervised, requiring no pairs of aligned modalities for training, and can be adapted to any pair of modalities.
- Score: 43.060971647266236
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Many applications, such as autonomous driving, heavily rely on multi-modal
data where spatial alignment between the modalities is required. Most
multi-modal registration methods struggle to compute the spatial correspondence
between the images using prevalent cross-modality similarity measures. In this
work, we bypass the difficulties of developing cross-modality similarity
measures, by training an image-to-image translation network on the two input
modalities. This learned translation allows training the registration network
using simple and reliable mono-modality metrics. We perform multi-modal
registration using two networks - a spatial transformation network and a
translation network. We show that by encouraging our translation network to be
geometry preserving, we manage to train an accurate spatial transformation
network. Compared to state-of-the-art multi-modal methods, our method
is unsupervised, requiring no pairs of aligned modalities for training, and can
be adapted to any pair of modalities. We evaluate our method quantitatively and
qualitatively on commercial datasets, showing that it performs well on several
modalities and achieves accurate alignment.
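The abstract outlines the training scheme only at a high level; below is a minimal PyTorch-style sketch of the two-network setup. The tiny architectures, the warp-then-translate ordering, and the single L1 objective are illustrative assumptions, not the authors' exact design: a translation network T maps images into the other modality's appearance, a spatial transformation network R predicts the alignment, and a mono-modality L1 loss supervises both.

```python
import torch
import torch.nn.functional as F
from torch import nn

class TranslationNet(nn.Module):
    """Toy image-to-image translator (stand-in for the paper's generator)."""
    def __init__(self, ch=3):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(ch, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, ch, 3, padding=1),
        )

    def forward(self, x):
        return self.net(x)

class AffineSTN(nn.Module):
    """Spatial transformation network predicting a 2x3 affine warp."""
    def __init__(self, ch=6):
        super().__init__()
        self.loc = nn.Sequential(
            nn.Conv2d(ch, 16, 7, stride=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(16, 6),
        )
        # Start at the identity transform so training is stable.
        self.loc[-1].weight.data.zero_()
        self.loc[-1].bias.data.copy_(torch.tensor([1., 0., 0., 0., 1., 0.]))

    def forward(self, a, b):
        theta = self.loc(torch.cat([a, b], dim=1)).view(-1, 2, 3)
        grid = F.affine_grid(theta, a.size(), align_corners=False)
        return F.grid_sample(a, grid, align_corners=False)

T, R = TranslationNet(), AffineSTN()
opt = torch.optim.Adam(list(T.parameters()) + list(R.parameters()), lr=1e-4)

a = torch.rand(2, 3, 64, 64)  # modality A
b = torch.rand(2, 3, 64, 64)  # unaligned modality B

warped_a = R(a, b)            # spatially align A to B
fake_b = T(warped_a)          # translate the aligned image into B's appearance
loss = F.l1_loss(fake_b, b)   # simple, reliable mono-modality metric
# A geometry-preserving constraint on T (omitted here) keeps the translator
# from absorbing the misalignment that R should correct.
opt.zero_grad(); loss.backward(); opt.step()
```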
Related papers
- Cross-domain and Cross-dimension Learning for Image-to-Graph
Transformers [50.576354045312115]
Direct image-to-graph transformation is a challenging task that combines object detection and relationship prediction in a single model.
We introduce a set of methods enabling cross-domain and cross-dimension transfer learning for image-to-graph transformers.
We demonstrate our method's utility in cross-domain and cross-dimension experiments, where we pretrain our models on 2D satellite images before applying them to vastly different target domains in 2D and 3D.
arXiv Detail & Related papers (2024-03-11T10:48:56Z)
- MAD: Modality Agnostic Distance Measure for Image Registration [14.558286801723293]
Multi-modal image registration is a crucial pre-processing step in many medical applications.
We present Modality Agnostic Distance (MAD), a measure that uses random convolutions to learn the inherent geometry of the images.
We demonstrate that not only can MAD affinely register multi-modal images successfully, but it also has a larger capture range than traditional measures.
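A rough sketch of the random-convolution idea described above; the kernel count, kernel size, and simple mean-absolute aggregation are assumptions rather than MAD's exact formulation. Both images are filtered with the same bank of random kernels, and the distance compares the responses.

```python
import torch
import torch.nn.functional as F

def mad_distance(x, y, n_kernels=16, ksize=5, seed=0):
    """Modality-agnostic distance from a shared bank of random convolutions.

    x, y: (N, 1, H, W) images, possibly from different modalities.
    """
    g = torch.Generator().manual_seed(seed)
    kernels = torch.randn(n_kernels, 1, ksize, ksize, generator=g)
    rx = F.conv2d(x, kernels, padding=ksize // 2)
    ry = F.conv2d(y, kernels, padding=ksize // 2)
    return (rx - ry).abs().mean()

# The measure is differentiable, so it can drive e.g. an affine registration
# by gradient descent on the transform parameters.
print(mad_distance(torch.rand(1, 1, 64, 64), torch.rand(1, 1, 64, 64)))
```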
arXiv Detail & Related papers (2023-09-06T09:59:58Z)
- Unsupervised Multi-Modal Medical Image Registration via
Discriminator-Free Image-to-Image Translation [4.43142018105102]
We propose a novel translation-based unsupervised deformable image registration approach to convert the multi-modal registration problem to a mono-modal one.
Our approach incorporates a discriminator-free translation network to facilitate the training of the registration network and a patchwise contrastive loss to encourage the translation network to preserve object shapes.
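The summary does not spell out the patchwise contrastive loss; below is a condensed PatchNCE-style sketch of how such a shape-preserving objective is commonly realized, with the shared encoder features, patch count, and temperature as illustrative assumptions. Features at the same location in the source and the translated output form positive pairs; other locations serve as negatives.

```python
import torch
import torch.nn.functional as F

def patch_contrastive_loss(feat_src, feat_out, n_patches=64, tau=0.07):
    """Patchwise InfoNCE over spatial locations of encoder features.

    feat_src, feat_out: (N, C, H, W) features of the source image and its
    translation, extracted by a shared encoder.
    """
    n, c, h, w = feat_src.shape
    idx = torch.randperm(h * w)[:n_patches]
    q = feat_out.flatten(2)[:, :, idx].permute(0, 2, 1)  # (N, P, C)
    k = feat_src.flatten(2)[:, :, idx].permute(0, 2, 1)  # (N, P, C)
    q, k = F.normalize(q, dim=-1), F.normalize(k, dim=-1)
    logits = torch.bmm(q, k.transpose(1, 2)) / tau       # (N, P, P)
    labels = torch.arange(n_patches).expand(n, -1)       # positives: diagonal
    return F.cross_entropy(logits.reshape(-1, n_patches), labels.reshape(-1))

f_src = torch.rand(2, 128, 16, 16)  # encoder features of the input
f_out = torch.rand(2, 128, 16, 16)  # encoder features of the translation
print(patch_contrastive_loss(f_src, f_out))
```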
arXiv Detail & Related papers (2022-04-28T17:18:21Z)
- Multi-modal unsupervised brain image registration using edge maps [7.49320945341034]
We propose a simple yet effective unsupervised deep learning-based multi-modal image registration approach.
The intuition behind this is that image locations with a strong gradient are assumed to denote a transition of tissues.
We evaluate our approach in the context of registering multi-modal (T1w to T2w) magnetic resonance (MR) brain images of different subjects using three different loss functions.
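A small sketch of the edge-map idea, assuming Sobel filters as the gradient operator: comparing gradient-magnitude maps turns the multi-modal problem into a mono-modal one, since strong gradients mark tissue transitions in both T1w and T2w images.

```python
import torch
import torch.nn.functional as F

def edge_map(img):
    """Gradient-magnitude edge map via Sobel filtering. img: (N, 1, H, W)."""
    gx = torch.tensor([[-1., 0., 1.], [-2., 0., 2.], [-1., 0., 1.]])
    gy = gx.t()
    k = torch.stack([gx, gy]).unsqueeze(1)          # (2, 1, 3, 3)
    grad = F.conv2d(img, k, padding=1)              # (N, 2, H, W)
    return grad.pow(2).sum(1, keepdim=True).sqrt()  # (N, 1, H, W)

# A simple mono-modality loss between the edge maps can then drive the
# registration network, regardless of the input contrasts.
t1, t2 = torch.rand(1, 1, 96, 96), torch.rand(1, 1, 96, 96)
loss = F.mse_loss(edge_map(t1), edge_map(t2))
```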
arXiv Detail & Related papers (2022-02-09T15:50:14Z)
- Multi-domain Unsupervised Image-to-Image Translation with Appearance
Adaptive Convolution [62.4972011636884]
We propose a novel multi-domain unsupervised image-to-image translation (MDUIT) framework.
We exploit the decomposed content feature and appearance adaptive convolution to translate an image into a target appearance.
We show that the proposed method produces visually diverse and plausible results in multiple domains compared to the state-of-the-art methods.
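The summary does not define the appearance adaptive convolution precisely; one plausible reading, sketched below under that assumption, is a convolution whose kernel is generated per sample from an appearance code and applied to the decomposed content features, so the same content can be rendered in different target appearances.

```python
import torch
import torch.nn.functional as F
from torch import nn

class AppearanceAdaptiveConv(nn.Module):
    """Conv whose kernel is predicted from an appearance code (illustrative)."""
    def __init__(self, ch=64, style_dim=8, ksize=3):
        super().__init__()
        self.ch, self.ksize = ch, ksize
        self.to_weight = nn.Linear(style_dim, ch * ch * ksize * ksize)

    def forward(self, content, style):
        n, c, h, w = content.shape
        w_dyn = self.to_weight(style).view(n * self.ch, c, self.ksize, self.ksize)
        # Grouped conv applies each sample's own kernel to its own features.
        out = F.conv2d(content.reshape(1, n * c, h, w), w_dyn,
                       padding=self.ksize // 2, groups=n)
        return out.view(n, self.ch, h, w)

layer = AppearanceAdaptiveConv()
content = torch.rand(2, 64, 32, 32)  # decomposed content features
style = torch.rand(2, 8)             # appearance code of the target domain
print(layer(content, style).shape)   # torch.Size([2, 64, 32, 32])
```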
arXiv Detail & Related papers (2022-02-06T14:12:34Z)
- LocalTrans: A Multiscale Local Transformer Network for Cross-Resolution
Homography Estimation [52.63874513999119]
Cross-resolution image alignment is a key problem in multiscale gigapixel photography.
Existing deep homography methods neglect the explicit formulation of correspondences between the inputs, which leads to degraded accuracy in cross-resolution settings.
We propose a local transformer network embedded within a multiscale structure to explicitly learn correspondences between the cross-resolution inputs.
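A toy sketch of local cross-attention between two feature maps, in the spirit of the local transformer described; the window size, the scaling, and plain dot-product attention are assumptions. Each location in one image attends over a small neighborhood of the other, making the correspondences explicit.

```python
import torch
import torch.nn.functional as F

def local_cross_attention(fa, fb, window=5):
    """Each position in fa attends over a window x window patch of fb.

    fa, fb: (N, C, H, W) features of the two inputs. The attention weights
    act as explicit local correspondences between them.
    """
    n, c, h, w = fa.shape
    pad = window // 2
    # Neighborhood keys/values: (N, C, window*window, H*W)
    kb = F.unfold(fb, window, padding=pad).view(n, c, window * window, h * w)
    q = fa.view(n, c, 1, h * w)                       # queries
    attn = (q * kb).sum(1, keepdim=True) / c ** 0.5   # (N, 1, K, H*W)
    attn = attn.softmax(dim=2)
    out = (attn * kb).sum(2)                          # (N, C, H*W)
    return out.view(n, c, h, w)

fa, fb = torch.rand(1, 32, 24, 24), torch.rand(1, 32, 24, 24)
print(local_cross_attention(fa, fb).shape)  # torch.Size([1, 32, 24, 24])
```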
arXiv Detail & Related papers (2021-06-08T02:51:45Z)
- StEP: Style-based Encoder Pre-training for Multi-modal Image Synthesis [68.3787368024951]
We propose a novel approach for multi-modal Image-to-image (I2I) translation.
We learn a latent embedding, jointly with the generator, that models the variability of the output domain.
Specifically, we pre-train a generic style encoder using a novel proxy task to learn an embedding of images, from arbitrary domains, into a low-dimensional style latent space.
arXiv Detail & Related papers (2021-04-14T19:58:24Z)
- Is Image-to-Image Translation the Panacea for Multimodal Image
Registration? A Comparative Study [4.00906288611816]
We conduct an empirical study of the applicability of modern I2I translation methods for the task of multimodal biomedical image registration.
We compare the performance of four Generative Adversarial Network (GAN)-based methods and one contrastive representation learning method.
Our results suggest that, although I2I translation may be helpful when the modalities to register are clearly correlated, registration of modalities that express distinctly different properties of the sample is not well handled by the I2I translation approach.
arXiv Detail & Related papers (2021-03-30T11:28:21Z)
- TSIT: A Simple and Versatile Framework for Image-to-Image Translation [103.92203013154403]
We introduce a simple and versatile framework for image-to-image translation.
We provide a carefully designed two-stream generative model with newly proposed feature transformations.
This allows multi-scale semantic structure information and style representation to be effectively captured and fused by the network.
A systematic study compares the proposed method with several state-of-the-art task-specific baselines, verifying its effectiveness in both perceptual quality and quantitative evaluations.
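TSIT's exact feature transformations differ in detail, but an AdaIN-style fusion, sketched below as an assumption, illustrates how style statistics from one stream can re-modulate the content stream at matching scales.

```python
import torch

def adaptive_fuse(content, style, eps=1e-5):
    """AdaIN-style fusion: re-scale normalized content features with the
    per-channel mean/std of the style stream's features at the same scale.

    content, style: (N, C, H, W) feature maps from the two streams.
    """
    c_mean = content.mean(dim=(2, 3), keepdim=True)
    c_std = content.std(dim=(2, 3), keepdim=True) + eps
    s_mean = style.mean(dim=(2, 3), keepdim=True)
    s_std = style.std(dim=(2, 3), keepdim=True)
    return (content - c_mean) / c_std * s_std + s_mean

# Applied at several scales of a generator, this kind of fusion lets
# multi-scale structure and style statistics be captured together.
content, style = torch.rand(2, 64, 16, 16), torch.rand(2, 64, 16, 16)
print(adaptive_fuse(content, style).shape)  # torch.Size([2, 64, 16, 16])
```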
arXiv Detail & Related papers (2020-07-23T15:34:06Z)
- Adversarial Uni- and Multi-modal Stream Networks for Multimodal Image
Registration [20.637787406888478]
Deformable image registration between Computed Tomography (CT) and Magnetic Resonance (MR) images is essential for many image-guided therapies.
In this paper, we propose a novel translation-based unsupervised deformable image registration method.
Our method has been evaluated on two clinical datasets and demonstrates promising results compared to state-of-the-art traditional and learning-based methods.
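A minimal sketch of deformable warping with a dense displacement field; the direct optimization of the field and the smoothness weight are illustrative assumptions. Once translation has reduced the problem to one modality, such a field can be fitted with a plain mono-modality loss.

```python
import torch
import torch.nn.functional as F

def warp(moving, flow):
    """Warp an image with a dense displacement field (in pixels).

    moving: (N, C, H, W); flow: (N, 2, H, W) holding (dx, dy) per pixel.
    """
    n, _, h, w = moving.shape
    ys, xs = torch.meshgrid(torch.arange(h), torch.arange(w), indexing="ij")
    base = torch.stack([xs, ys]).float().unsqueeze(0)  # identity grid
    gx = 2 * (base[:, 0] + flow[:, 0]) / (w - 1) - 1   # normalize to [-1, 1]
    gy = 2 * (base[:, 1] + flow[:, 1]) / (h - 1) - 1
    grid = torch.stack([gx, gy], dim=-1)               # (N, H, W, 2)
    return F.grid_sample(moving, grid, align_corners=True)

mr = torch.rand(1, 1, 64, 64)  # moving image, already translated to CT-like
ct = torch.rand(1, 1, 64, 64)  # fixed CT image
flow = torch.zeros(1, 2, 64, 64, requires_grad=True)
opt = torch.optim.Adam([flow], lr=0.1)
loss = F.mse_loss(warp(mr, flow), ct) \
       + 1e-2 * (flow[:, :, 1:, :] - flow[:, :, :-1, :]).abs().mean()
opt.zero_grad(); loss.backward(); opt.step()
```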
arXiv Detail & Related papers (2020-07-06T14:44:06Z)