Reliable Multi-modal Medical Image-to-image Translation Independent of Pixel-wise Aligned Data
- URL: http://arxiv.org/abs/2408.14270v1
- Date: Mon, 26 Aug 2024 13:45:58 GMT
- Title: Reliable Multi-modal Medical Image-to-image Translation Independent of Pixel-wise Aligned Data
- Authors: Langrui Zhou, Guang Li
- Abstract summary: We develop a novel multi-modal medical image-to-image translation model that is independent of pixel-wise aligned data (MITIA).
MITIA achieves superior performance compared to six other state-of-the-art image-to-image translation methods.
- Score: 2.328200803738193
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Current mainstream multi-modal medical image-to-image translation methods face a dilemma. Supervised methods with outstanding performance rely on pixel-wise aligned training data to constrain the model optimization, but obtaining pixel-wise aligned multi-modal medical image datasets is challenging. Unsupervised methods can be trained without paired data, but their reliability cannot be guaranteed. At present, there is no ideal multi-modal medical image-to-image translation method that can generate reliable results without pixel-wise aligned data. This work aims to develop a novel medical image-to-image translation model that is independent of pixel-wise aligned data (MITIA), enabling reliable multi-modal medical image-to-image translation even when the training data are misaligned. The proposed MITIA model uses a prior extraction network, composed of a multi-modal medical image registration module and a multi-modal misalignment error detection module, to extract as much pixel-level prior information as possible from training data containing misalignment errors. The extracted prior information is then used to construct a regularization term that constrains the optimization of the unsupervised cycle-consistent GAN model, restricting its solution space and thereby improving the performance and reliability of the generator. We trained the MITIA model on six datasets containing different misalignment errors and on two well-aligned datasets, and compared the proposed method with six other state-of-the-art image-to-image translation methods. The results of both quantitative analysis and qualitative visual inspection indicate that MITIA outperforms the competing state-of-the-art methods on both misaligned and aligned data.
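To make the constraint described in the abstract concrete, the sketch below gives a minimal, hypothetical PyTorch formulation (not the authors' released code) of a prior-regularized cycle-consistent generator objective: an adversarial term and a cycle-consistency term are combined with a prior regularization term, a pixel-wise L1 penalty applied only where a misalignment-error confidence mask marks the registered reference as reliable. The function names, the mask convention, and the loss weights are illustrative assumptions.

```python
# Minimal sketch (assumed, not the authors' implementation) of a
# prior-regularized cycle-consistent generator objective as described above.
import torch
import torch.nn.functional as F


def prior_regularization(fake_b, registered_b, confidence_mask):
    """Pixel-wise L1 penalty restricted to regions that the (hypothetical)
    misalignment-error detector marks as reliable (mask values near 1)."""
    return (confidence_mask * (fake_b - registered_b).abs()).mean()


def generator_loss(real_a, rec_a, fake_b, disc_score_fake,
                   registered_b, confidence_mask,
                   lambda_cyc=10.0, lambda_prior=5.0):
    """Adversarial + cycle-consistency + prior regularization.
    The weights lambda_cyc and lambda_prior are illustrative, not from the paper."""
    adv = F.binary_cross_entropy_with_logits(
        disc_score_fake, torch.ones_like(disc_score_fake))
    cyc = F.l1_loss(rec_a, real_a)
    prior = prior_regularization(fake_b, registered_b, confidence_mask)
    return adv + lambda_cyc * cyc + lambda_prior * prior


if __name__ == "__main__":
    # Dummy tensors standing in for one A -> B translation step.
    n, c, h, w = 1, 1, 64, 64
    real_a = torch.rand(n, c, h, w)          # source-modality image
    fake_b = torch.rand(n, c, h, w)          # generator output G_AB(real_a)
    rec_a = torch.rand(n, c, h, w)           # cycle reconstruction G_BA(fake_b)
    registered_b = torch.rand(n, c, h, w)    # target image registered onto real_a's grid
    confidence_mask = (torch.rand(n, c, h, w) > 0.3).float()  # detector output (assumed)
    disc_score_fake = torch.randn(n, 1)      # discriminator logit for fake_b
    print(generator_loss(real_a, rec_a, fake_b, disc_score_fake,
                         registered_b, confidence_mask).item())
```

A full training setup would also include the symmetric B-to-A direction and the discriminator losses; this sketch only illustrates how the extracted prior could enter the objective as a masked regularizer.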
Related papers
- MMAR: Towards Lossless Multi-Modal Auto-Regressive Probabilistic Modeling [64.09238330331195]
We propose a novel Multi-Modal Auto-Regressive (MMAR) probabilistic modeling framework.
Unlike discretization-based methods, MMAR takes in continuous-valued image tokens to avoid information loss.
We show that MMAR achieves much better performance than other joint multi-modal models.
arXiv Detail & Related papers (2024-10-14T17:57:18Z) - Cascaded Multi-path Shortcut Diffusion Model for Medical Image Translation [26.67518950976257]
We propose a Cascaded Multi-path Shortcut Diffusion Model (CMDM) for high-quality medical image translation and uncertainty estimation.
Our experimental results show that CMDM can produce high-quality translations comparable to state-of-the-art methods.
arXiv Detail & Related papers (2024-04-06T03:02:47Z) - Vision-Language Modelling For Radiological Imaging and Reports In The Low Data Regime [70.04389979779195]
This paper explores training medical vision-language models (VLMs) where the visual and language inputs are embedded into a common space.
We explore several candidate methods to improve low-data performance, including adapting generic pre-trained models to novel image and text domains.
Using text-to-image retrieval as a benchmark, we evaluate the performance of these methods with training datasets of varying size consisting of paired chest X-rays and radiological reports.
arXiv Detail & Related papers (2023-03-30T18:20:00Z) - UVCGAN v2: An Improved Cycle-Consistent GAN for Unpaired Image-to-Image Translation [10.689788782893096]
An unpaired image-to-image (I2I) translation technique seeks to find a mapping between two domains of data in a fully unsupervised manner.
Diffusion models (DMs) hold state-of-the-art status on I2I translation benchmarks in terms of Fréchet Inception Distance (FID).
This work improves a recent UVCGAN model and equips it with modern advancements in model architectures and training procedures.
arXiv Detail & Related papers (2023-03-28T19:46:34Z) - Semi-Supervised Image Captioning by Adversarially Propagating Labeled Data [95.0476489266988]
We present a novel data-efficient semi-supervised framework to improve the generalization of image captioning models.
Our proposed method trains a captioner to learn from paired data and to progressively associate unpaired data.
We present extensive and comprehensive empirical results on both (1) image-based and (2) dense region-based captioning datasets, followed by a comprehensive analysis of the scarcely-paired setting.
arXiv Detail & Related papers (2023-01-26T15:25:43Z) - Learning to Exploit Temporal Structure for Biomedical Vision-Language Processing [53.89917396428747]
Self-supervised learning in vision-language processing exploits semantic alignment between imaging and text modalities.
We explicitly account for prior images and reports when available during both training and fine-tuning.
Our approach, named BioViL-T, uses a CNN-Transformer hybrid multi-image encoder trained jointly with a text model.
arXiv Detail & Related papers (2023-01-11T16:35:33Z) - Beyond Triplet: Leveraging the Most Data for Multimodal Machine Translation [53.342921374639346]
Multimodal machine translation aims to improve translation quality by incorporating information from other modalities, such as vision.
Previous MMT systems mainly focus on better access and use of visual information and tend to validate their methods on image-related datasets.
This paper establishes new methods and new datasets for MMT.
arXiv Detail & Related papers (2022-12-20T15:02:38Z) - Unsupervised Multi-Modal Medical Image Registration via Discriminator-Free Image-to-Image Translation [4.43142018105102]
We propose a novel translation-based unsupervised deformable image registration approach to convert the multi-modal registration problem to a mono-modal one.
Our approach incorporates a discriminator-free translation network to facilitate the training of the registration network and a patchwise contrastive loss to encourage the translation network to preserve object shapes.
arXiv Detail & Related papers (2022-04-28T17:18:21Z) - Dual Contrastive Learning for Unsupervised Image-to-Image Translation [16.759958400617947]
Unsupervised image-to-image translation tasks aim to find a mapping between a source domain X and a target domain Y from unpaired training data.
Contrastive learning for Unpaired image-to-image Translation yields state-of-the-art results.
We propose a novel method based on contrastive learning and a dual learning setting to infer an efficient mapping between unpaired data.
arXiv Detail & Related papers (2021-04-15T18:00:22Z) - Flow-based Deformation Guidance for Unpaired Multi-Contrast MRI Image-to-Image Translation [7.8333615755210175]
In this paper, we introduce a novel approach to unpaired image-to-image translation based on an invertible architecture.
We utilize the temporal information between consecutive slices to provide more constraints to the optimization for transforming one domain to another in unpaired medical images.
arXiv Detail & Related papers (2020-12-03T09:10:22Z) - Improved Slice-wise Tumour Detection in Brain MRIs by Computing Dissimilarities between Latent Representations [68.8204255655161]
Anomaly detection for Magnetic Resonance Images (MRIs) can be solved with unsupervised methods.
We propose a slice-wise semi-supervised method for tumour detection based on computing a dissimilarity function in the latent space of a Variational AutoEncoder.
We show that by training the models on higher resolution images and by improving the quality of the reconstructions, we obtain results which are comparable with different baselines.
arXiv Detail & Related papers (2020-07-24T14:02:09Z)