Is Image-to-Image Translation the Panacea for Multimodal Image
Registration? A Comparative Study
- URL: http://arxiv.org/abs/2103.16262v1
- Date: Tue, 30 Mar 2021 11:28:21 GMT
- Title: Is Image-to-Image Translation the Panacea for Multimodal Image
Registration? A Comparative Study
- Authors: Jiahao Lu, Johan Öfverstedt, Joakim Lindblad, Nataša Sladoje
- Abstract summary: We conduct an empirical study of the applicability of modern I2I translation methods for the task of multimodal biomedical image registration.
We compare the performance of four Generative Adversarial Network (GAN)-based methods and one contrastive representation learning method.
Our results suggest that, although I2I translation may be helpful when the modalities to register are clearly correlated, registration of modalities which express distinctly different properties of the sample is not well handled by the I2I translation approach.
- Score: 4.00906288611816
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Despite recent advances in the field of biomedical image processing, propelled by the deep learning revolution, multimodal image registration, due to its many challenges, is still often performed manually by specialists.
The recent success of image-to-image (I2I) translation in computer vision
applications and its growing use in biomedical areas provide a tempting
possibility of transforming the multimodal registration problem into a,
potentially easier, monomodal one. We conduct an empirical study of the
applicability of modern I2I translation methods for the task of multimodal
biomedical image registration. We compare the performance of four Generative
Adversarial Network (GAN)-based methods and one contrastive representation
learning method, subsequently combined with two representative monomodal
registration methods, to judge the effectiveness of modality translation for
multimodal image registration. We evaluate these method combinations on three
publicly available multimodal datasets of increasing difficulty, and compare
with the performance of registration by Mutual Information maximisation and one
modern data-specific multimodal registration method. Our results suggest that,
although I2I translation may be helpful when the modalities to register are
clearly correlated, registration of modalities which express distinctly
different properties of the sample is not well handled by the I2I translation
approach. When less information is shared between the modalities, the I2I
translation methods struggle to provide good predictions, which impairs the
registration performance. The evaluated representation learning method, which
aims to find an in-between representation, manages better, and so does the
Mutual Information maximisation approach. We share our complete experimental
setup as open-source (https://github.com/Noodles-321/Registration).
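To make the study's comparison concrete, the sketch below contrasts the two strategies in minimal form: translate-then-register-monomodally versus direct registration by Mutual Information maximisation. Everything here is illustrative (the toy rotation search, the `translate` stand-in for a pre-trained I2I model); the paper's actual experimental setup is in the linked repository.

```python
# Minimal sketch of the two strategies compared in the paper. All names
# are illustrative; the real experiments use full registration frameworks.
import numpy as np
from scipy import ndimage

def ncc(a, b):
    """Normalised cross-correlation: a typical mono-modal similarity."""
    a = (a - a.mean()) / (a.std() + 1e-8)
    b = (b - b.mean()) / (b.std() + 1e-8)
    return float((a * b).mean())

def mutual_information(a, b, bins=32):
    """Classic multimodal similarity via a joint intensity histogram."""
    joint, _, _ = np.histogram2d(a.ravel(), b.ravel(), bins=bins)
    pxy = joint / joint.sum()
    px = pxy.sum(axis=1, keepdims=True)   # marginal over rows
    py = pxy.sum(axis=0, keepdims=True)   # marginal over columns
    nz = pxy > 0                          # avoid log(0)
    return float((pxy[nz] * np.log(pxy[nz] / (px @ py)[nz])).sum())

def best_rotation(fixed, moving, similarity, angles=np.linspace(-10, 10, 41)):
    """Toy rigid registration: exhaustive search over rotation angles."""
    scores = [similarity(fixed, ndimage.rotate(moving, t, reshape=False))
              for t in angles]
    return float(angles[int(np.argmax(scores))])

# Strategy 1: translate the moving image into the fixed modality with a
# (hypothetical) pre-trained I2I model, then register mono-modally:
#   angle = best_rotation(fixed, translate(moving), ncc)
# Strategy 2: register the raw multimodal pair by MI maximisation:
#   angle = best_rotation(fixed, moving, mutual_information)
```

In the paper's terms, Strategy 1 succeeds only insofar as the I2I model's predictions are faithful, which is exactly what breaks down when the modalities share little information.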
Related papers
- Learning to Exploit Temporal Structure for Biomedical Vision-Language
Processing [53.89917396428747]
Self-supervised learning in vision-language processing exploits semantic alignment between imaging and text modalities.
We explicitly account for prior images and reports when available during both training and fine-tuning.
Our approach, named BioViL-T, uses a CNN-Transformer hybrid multi-image encoder trained jointly with a text model.
arXiv Detail & Related papers (2023-01-11T16:35:33Z)
- Beyond Triplet: Leveraging the Most Data for Multimodal Machine Translation [53.342921374639346]
Multimodal machine translation aims to improve translation quality by incorporating information from other modalities, such as vision.
Previous MMT systems mainly focus on better access and use of visual information and tend to validate their methods on image-related datasets.
This paper establishes new methods and new datasets for MMT.
arXiv Detail & Related papers (2022-12-20T15:02:38Z)
- Unsupervised Multi-Modal Medical Image Registration via Discriminator-Free Image-to-Image Translation [4.43142018105102]
We propose a novel translation-based unsupervised deformable image registration approach to convert the multi-modal registration problem to a mono-modal one.
Our approach incorporates a discriminator-free translation network to facilitate the training of the registration network and a patchwise contrastive loss to encourage the translation network to preserve object shapes.
arXiv Detail & Related papers (2022-04-28T17:18:21Z)
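The patchwise contrastive loss mentioned above is, in spirit, an InfoNCE objective over co-located patch features (as in CUT-style PatchNCE). The sketch below is a hedged approximation of that idea, not the paper's exact formulation.

```python
# Rough sketch of a patchwise contrastive (InfoNCE-style) loss: a patch
# feature from the translated image should match the feature at the same
# location in the input and repel features from other locations. This
# approximates the idea only; the paper's exact loss may differ.
import torch
import torch.nn.functional as F

def patch_nce_loss(feat_src, feat_tgt, temperature=0.07):
    """feat_src, feat_tgt: (num_patches, dim) features taken at the same
    spatial locations in the input and translated images."""
    feat_src = F.normalize(feat_src, dim=1)
    feat_tgt = F.normalize(feat_tgt, dim=1)
    logits = feat_tgt @ feat_src.t() / temperature    # (N, N) similarities
    labels = torch.arange(logits.size(0), device=logits.device)  # positives on the diagonal
    return F.cross_entropy(logits, labels)
```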
- Multi-modal unsupervised brain image registration using edge maps [7.49320945341034]
We propose a simple yet effective unsupervised deep learning-based multi-modal image registration approach.
The intuition behind this is that image locations with a strong gradient are assumed to denote a transition of tissues.
We evaluate our approach in the context of registering multi-modal (T1w to T2w) magnetic resonance (MR) brain images of different subjects using three different loss functions.
arXiv Detail & Related papers (2022-02-09T15:50:14Z)
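Following the gradient intuition above, a minimal edge-map bridge between modalities can be sketched as follows; the Sobel filter and the negative-MSE similarity are illustrative assumptions, not the paper's exact pipeline.

```python
# Sketch of the edge-map idea: strong gradients mark tissue transitions
# in both modalities, so registering edge maps turns the multimodal
# problem into a mono-modal one. Sobel filtering is an assumed choice.
import numpy as np
from scipy import ndimage

def edge_map(img):
    """Normalised gradient magnitude as a modality-bridging representation."""
    img = img.astype(float)
    gx = ndimage.sobel(img, axis=0)
    gy = ndimage.sobel(img, axis=1)
    mag = np.hypot(gx, gy)
    return mag / (mag.max() + 1e-8)

def edge_similarity(fixed, warped_moving):
    """Mono-modal similarity (negative MSE) between the two edge maps."""
    return -float(np.mean((edge_map(fixed) - edge_map(warped_moving)) ** 2))
```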
- Multi-domain Unsupervised Image-to-Image Translation with Appearance Adaptive Convolution [62.4972011636884]
We propose a novel multi-domain unsupervised image-to-image translation (MDUIT) framework.
We exploit the decomposed content feature and appearance adaptive convolution to translate an image into a target appearance.
We show that the proposed method produces visually diverse and plausible results in multiple domains compared to the state-of-the-art methods.
arXiv Detail & Related papers (2022-02-06T14:12:34Z)
- Mutual information neural estimation for unsupervised multi-modal registration of brain images [0.0]
We propose guiding the training of a deep learning-based registration method with MI estimation between an image-pair in an end-to-end trainable network.
Our results show that a small, 2-layer network produces competitive results in both mono- and multimodal registration, with sub-second run-times.
Real-time clinical application will benefit from better visual matching of anatomical structures and fewer registration failures/outliers.
arXiv Detail & Related papers (2022-01-25T13:22:34Z)
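The end-to-end MI guidance described above can be sketched with a MINE-style (Donsker-Varadhan) lower bound; the statistics network and batching below are assumptions for illustration, not the authors' exact 2-layer design.

```python
# Hedged sketch of a MINE-style lower bound on mutual information that a
# registration network could maximise end-to-end. The statistics network
# T and the batch construction are illustrative assumptions.
import torch
import torch.nn as nn

class StatisticsNet(nn.Module):
    """Tiny T(x, y) network scoring joint vs. shuffled (marginal) pairs."""
    def __init__(self, dim):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(2 * dim, 128), nn.ReLU(),
                                 nn.Linear(128, 1))

    def forward(self, x, y):
        return self.net(torch.cat([x, y], dim=1)).squeeze(1)   # (N,)

def mine_lower_bound(T, x, y):
    """Donsker-Varadhan bound: E[T(x,y)] - log E[exp(T(x, y_shuffled))]."""
    joint = T(x, y).mean()
    y_shuffled = y[torch.randperm(y.size(0))]    # break the pairing
    n = torch.tensor(float(y.size(0)))
    marginal = torch.logsumexp(T(x, y_shuffled), dim=0) - torch.log(n)
    return joint - marginal                      # maximise this w.r.t. T
```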
- StEP: Style-based Encoder Pre-training for Multi-modal Image Synthesis [68.3787368024951]
We propose a novel approach for multi-modal image-to-image (I2I) translation.
We learn a latent embedding, jointly with the generator, that models the variability of the output domain.
Specifically, we pre-train a generic style encoder using a novel proxy task to learn an embedding of images, from arbitrary domains, into a low-dimensional style latent space.
arXiv Detail & Related papers (2021-04-14T19:58:24Z)
- Unsupervised Image-to-Image Translation via Pre-trained StyleGAN2 Network [73.5062435623908]
We propose a new I2I translation method that generates a new model in the target domain via a series of model transformations.
By feeding the latent vector into the generated model, we can perform I2I translation between the source domain and target domain.
arXiv Detail & Related papers (2020-10-12T13:51:40Z)
- Adversarial Uni- and Multi-modal Stream Networks for Multimodal Image Registration [20.637787406888478]
Deformable image registration between Computed Tomography (CT) and Magnetic Resonance (MR) images is essential for many image-guided therapies.
In this paper, we propose a novel translation-based unsupervised deformable image registration method.
Our method has been evaluated on two clinical datasets and demonstrates promising results compared to state-of-the-art traditional and learning-based methods.
arXiv Detail & Related papers (2020-07-06T14:44:06Z)
- Learning Deformable Image Registration from Optimization: Perspective, Modules, Bilevel Training and Beyond [62.730497582218284]
We develop a new deep learning based framework to optimize a diffeomorphic model via multi-scale propagation.
We conduct two groups of image registration experiments on 3D volume datasets including image-to-atlas registration on brain MRI data and image-to-image registration on liver CT data.
arXiv Detail & Related papers (2020-04-30T03:23:45Z)
- Unsupervised Multi-Modal Image Registration via Geometry Preserving Image-to-Image Translation [43.060971647266236]
We train an image-to-image translation network on the two input modalities.
This learned translation allows training the registration network using simple and reliable mono-modality metrics.
Compared to state-of-the-art multi-modal methods our presented method is unsupervised, requiring no pairs of aligned modalities for training, and can be adapted to any pair of modalities.
arXiv Detail & Related papers (2020-03-18T07:21:09Z)
This list is automatically generated from the titles and abstracts of the papers on this site.