Disentangled Multimodal Brain MR Image Translation via Transformer-based Modality Infuser
- URL: http://arxiv.org/abs/2402.00375v1
- Date: Thu, 1 Feb 2024 06:34:35 GMT
- Title: Disentangled Multimodal Brain MR Image Translation via Transformer-based Modality Infuser
- Authors: Jihoon Cho, Xiaofeng Liu, Fangxu Xing, Jinsong Ouyang, Georges El Fakhri, Jinah Park, Jonghye Woo
- Abstract summary: We propose a transformer-based modality infuser designed to synthesize multimodal brain MR images.
In our method, we extract modality-agnostic features from the encoder and then transform them into modality-specific features.
We carried out experiments on the BraTS 2018 dataset, translating between four MR modalities.
- Score: 12.402947207350394
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Multimodal Magnetic Resonance (MR) imaging plays a crucial role in disease diagnosis due to its ability to provide complementary information by analyzing the relationships between multimodal images of the same subject. Acquiring all MR modalities, however, can be expensive, and, during a scanning session, certain MR images may be missed depending on the study protocol. The typical solution is to synthesize the missing modalities from the acquired images, for example, using generative adversarial networks (GANs). Yet, GANs constructed with convolutional neural networks (CNNs) are likely to suffer from a lack of global relationships and from the absence of a mechanism for conditioning on the desired modality. To address this, in this work, we propose a transformer-based modality infuser designed to synthesize multimodal brain MR images. In our method, we extract modality-agnostic features with an encoder and then transform them into modality-specific features using the modality infuser. Furthermore, the modality infuser captures long-range relationships among all brain structures, leading to the generation of more realistic images. We carried out experiments on the BraTS 2018 dataset, translating between four MR modalities, and our experimental results demonstrate the superiority of the proposed method in terms of synthesis quality. In addition, we conducted experiments on a brain tumor segmentation task and compared different conditioning methods.
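
To make the conditioning mechanism described in the abstract concrete, here is a minimal PyTorch sketch: a transformer operates on modality-agnostic feature tokens together with a learned embedding of the target modality. All names, shapes, and hyperparameters are assumptions for illustration, not the authors' implementation.

```python
# Hypothetical sketch of the modality-infuser idea: a learned target-modality
# embedding is prepended to modality-agnostic feature tokens, and self-attention
# turns them into modality-specific tokens for the decoder.
import torch
import torch.nn as nn

class ModalityInfuser(nn.Module):
    def __init__(self, dim=256, n_modalities=4, n_layers=4, n_heads=8):
        super().__init__()
        # One learned embedding per target modality (e.g., T1, T1ce, T2, FLAIR).
        self.modality_embed = nn.Embedding(n_modalities, dim)
        layer = nn.TransformerEncoderLayer(d_model=dim, nhead=n_heads, batch_first=True)
        self.transformer = nn.TransformerEncoder(layer, num_layers=n_layers)

    def forward(self, feats, target_modality):
        # feats: (B, N, dim) modality-agnostic tokens from the encoder.
        # target_modality: (B,) integer codes of the modality to synthesize.
        cond = self.modality_embed(target_modality).unsqueeze(1)  # (B, 1, dim)
        tokens = torch.cat([cond, feats], dim=1)  # prepend condition token
        out = self.transformer(tokens)            # global self-attention
        return out[:, 1:]  # modality-specific tokens for the decoder

# Usage: infuse encoder features toward modality code 3, then decode.
infuser = ModalityInfuser()
feats = torch.randn(2, 64, 256)  # e.g., an 8x8 feature map, flattened
out = infuser(feats, torch.tensor([3, 3]))
print(out.shape)  # torch.Size([2, 64, 256])
```

Prepending a condition token lets attention propagate the target-modality information to every spatial token while also modeling long-range relationships among brain structures, which is the stated advantage over purely convolutional GANs.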
Related papers
- Towards General Text-guided Image Synthesis for Customized Multimodal Brain MRI Generation [51.28453192441364] (2024-09-25)
Multimodal brain magnetic resonance (MR) imaging is indispensable in neuroscience and neurology.
Current MR image synthesis approaches are typically trained on independent datasets for specific tasks.
We present TUMSyn, a Text-guided Universal MR image Synthesis model, which can flexibly generate brain MR images.
- A Unified Framework for Synthesizing Multisequence Brain MRI via Hybrid Fusion [4.47838172826189] (2024-06-21)
We propose a novel unified framework for synthesizing multisequence MR images, called Hybrid Fusion GAN (HF-GAN).
We introduce a hybrid fusion encoder designed to ensure the disentangled extraction of complementary and modality-specific information.
Common feature representations are transformed into a target latent space via the modality infuser to synthesize missing MR sequences.
- MindFormer: Semantic Alignment of Multi-Subject fMRI for Brain Decoding [50.55024115943266] (2024-05-28)
We introduce MindFormer, a novel method for semantically aligning multi-subject fMRI signals.
The model is specifically designed to generate fMRI-conditioned feature vectors that can be used to condition a Stable Diffusion model for fMRI-to-image generation or a large language model (LLM) for fMRI-to-text generation.
Our experimental results demonstrate that MindFormer generates semantically consistent images and text across different subjects.
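
As an illustration of the interface this summary describes, the sketch below maps an fMRI signal to a sequence of conditioning vectors shaped like CLIP text embeddings (77 x 768), which a Stable Diffusion UNet accepts as cross-attention context. All sizes and the per-subject embedding detail are assumptions, not MindFormer's actual design.

```python
# Illustrative projection of flattened fMRI signals into a diffusion-model
# conditioning sequence; one learned token encodes the subject identity.
import torch
import torch.nn as nn

class FMRIToCondition(nn.Module):
    def __init__(self, n_voxels=15000, n_subjects=8, n_tokens=77, dim=768):
        super().__init__()
        self.subject_embed = nn.Embedding(n_subjects, dim)   # per-subject token
        self.proj = nn.Linear(n_voxels, (n_tokens - 1) * dim)
        self.n_tokens, self.dim = n_tokens, dim

    def forward(self, fmri, subject_id):
        # fmri: (B, n_voxels) flattened signals; subject_id: (B,)
        x = self.proj(fmri).view(-1, self.n_tokens - 1, self.dim)
        s = self.subject_embed(subject_id).unsqueeze(1)
        return torch.cat([s, x], dim=1)  # (B, 77, 768) conditioning sequence
```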
- NeuroPictor: Refining fMRI-to-Image Reconstruction via Multi-individual Pretraining and Multi-level Modulation [55.51412454263856] (2024-03-27)
This paper proposes to directly modulate the generation process of diffusion models using fMRI signals.
By training with about 67,000 fMRI-image pairs from various individuals, our model enjoys superior fMRI-to-image decoding capacity.
- Enhancing CT Image synthesis from multi-modal MRI data based on a multi-task neural network framework [16.864720020158906] (2023-12-13)
We propose a versatile multi-task neural network framework, based on an enhanced Transformer U-Net architecture.
We decompose the traditional problem of synthesizing CT images into distinct subtasks.
To enhance the framework's versatility in handling multi-modal data, we expand the model with multiple image channels.
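
A loose sketch of the multi-task, multi-channel idea above: a shared backbone takes several MRI contrasts stacked as input channels and feeds separate heads, one for CT synthesis and one auxiliary head. The backbone design and the choice of a segmentation head are assumptions, not the paper's exact subtask decomposition.

```python
# Multi-task synthesis sketch: shared features, task-specific output heads.
import torch
import torch.nn as nn

class MultiTaskSynth(nn.Module):
    def __init__(self, in_channels=3, ch=32):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(in_channels, ch, 3, padding=1), nn.ReLU(),
            nn.Conv2d(ch, ch, 3, padding=1), nn.ReLU(),
        )
        self.ct_head = nn.Conv2d(ch, 1, 1)   # synthesized CT image
        self.seg_head = nn.Conv2d(ch, 2, 1)  # hypothetical auxiliary segmentation

    def forward(self, mri_stack):
        # mri_stack: (B, in_channels, H, W), multi-modal MRI as channels.
        h = self.backbone(mri_stack)
        return self.ct_head(h), self.seg_head(h)
```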
- A Learnable Variational Model for Joint Multimodal MRI Reconstruction and Synthesis [4.056490719080639] (2022-04-08)
We propose a novel deep-learning model for joint reconstruction and synthesis of multi-modal MRI.
The output of our model includes reconstructed images of the source modalities and a high-quality image synthesized in the target modality.
- Multi-modal Aggregation Network for Fast MR Imaging [85.25000133194762] (2021-10-15)
We propose a novel Multi-modal Aggregation Network, named MANet, which is capable of discovering complementary representations from a fully sampled auxiliary modality.
In our MANet, the representations from the fully sampled auxiliary and undersampled target modalities are learned independently through a specific network.
Our MANet follows a hybrid domain learning framework, which allows it to simultaneously recover the frequency signal in the $k$-space domain and restore image details in the spatial domain.
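
The hybrid-domain idea can be made concrete with a small sketch: supervise the reconstruction in the image domain and, via the Fourier transform, in the $k$-space domain. The L1 penalties and the weighting are assumptions, not MANet's exact objective.

```python
# Minimal hybrid-domain supervision: image-domain loss plus a frequency-domain
# loss computed on the 2D Fourier transform of the images.
import torch
import torch.nn.functional as F

def hybrid_domain_loss(pred, target, k_weight=0.1):
    # pred, target: (B, 1, H, W) reconstructed and fully sampled images.
    image_loss = F.l1_loss(pred, target)
    k_pred = torch.fft.fft2(pred)       # complex k-space signal
    k_target = torch.fft.fft2(target)
    k_loss = F.l1_loss(torch.view_as_real(k_pred), torch.view_as_real(k_target))
    return image_loss + k_weight * k_loss
```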
- Modality Completion via Gaussian Process Prior Variational Autoencoders for Multi-Modal Glioma Segmentation [75.58395328700821] (2021-07-07)
We propose a novel model, Multi-modal Gaussian Process Prior Variational Autoencoder (MGP-VAE), to impute one or more missing sub-modalities for a patient scan.
MGP-VAE can leverage the Gaussian Process (GP) prior on the Variational Autoencoder (VAE) to exploit correlations across subjects/patients and sub-modalities.
We show the applicability of MGP-VAE on brain tumor segmentation where one, two, or three of the four sub-modalities may be missing.
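
At the interface level, the imputation setting looks like the skeleton below: encode whichever sub-modalities are available, pool their latents, and decode the missing one. The GP prior coupling subjects and sub-modalities, the sampling, and the KL term are all omitted, so this is only a plain-VAE skeleton with assumed shapes.

```python
# Plain multimodal-VAE skeleton for imputing a missing sub-modality; only the
# latent means are used here (no sampling or KL term), for brevity.
import torch
import torch.nn as nn

class ImputationVAE(nn.Module):
    def __init__(self, n_modalities=4, z_dim=128):
        super().__init__()
        self.encoders = nn.ModuleList(
            [nn.Sequential(nn.Flatten(), nn.Linear(64 * 64, 2 * z_dim)) for _ in range(n_modalities)]
        )
        self.decoders = nn.ModuleList(
            [nn.Linear(z_dim, 64 * 64) for _ in range(n_modalities)]
        )
        self.z_dim = z_dim

    def forward(self, images, available, missing_idx):
        # images: dict {modality_idx: (B, 1, 64, 64)}; average the latent means
        # over the available sub-modalities, then decode the missing one.
        stats = [self.encoders[i](images[i]) for i in available]
        mu = torch.stack([s[:, : self.z_dim] for s in stats]).mean(0)
        return self.decoders[missing_idx](mu).view(-1, 1, 64, 64)
```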
- ResViT: Residual vision transformers for multi-modal medical image synthesis [0.0] (2021-06-30)
We propose a novel generative adversarial approach for medical image synthesis, ResViT, to combine the local precision of convolution operators with the contextual sensitivity of vision transformers.
Our results indicate the superiority of ResViT against competing methods in terms of qualitative observations and quantitative metrics.
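
The conv-plus-transformer combination can be sketched as a residual bottleneck block: CNN features pass through a vision-transformer stage and are added back to the input. Channel counts and depths are assumptions, not ResViT's actual architecture.

```python
# Residual transformer block: local convolution, global self-attention over
# flattened spatial tokens, and a residual connection back to the input.
import torch
import torch.nn as nn

class ResidualTransformerBlock(nn.Module):
    def __init__(self, channels=256, n_heads=8):
        super().__init__()
        self.conv = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        layer = nn.TransformerEncoderLayer(d_model=channels, nhead=n_heads, batch_first=True)
        self.transformer = nn.TransformerEncoder(layer, num_layers=1)

    def forward(self, x):
        # x: (B, C, H, W) feature map from the convolutional encoder.
        h = torch.relu(self.conv(x))
        b, c, hh, ww = h.shape
        tokens = h.flatten(2).transpose(1, 2)   # (B, H*W, C)
        tokens = self.transformer(tokens)       # global context
        h = tokens.transpose(1, 2).view(b, c, hh, ww)
        return x + h                            # residual connection
```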
- Representation Disentanglement for Multi-modal MR Analysis [15.498244253687337] (2021-02-23)
Recent works have suggested that multi-modal deep learning analysis can benefit from explicitly disentangling anatomical (shape) and modality (appearance) representations from the images.
We propose a margin loss that regularizes the similarity relationships of the representations across subjects and modalities.
To enable robust training, we introduce a modified conditional convolution to design a single model for encoding images of all modalities.
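
A margin loss in the spirit described above can be sketched as a triplet-style objective: anatomy representations of the same subject in different modalities should be closer than those of different subjects, by at least a margin. The Euclidean distance and margin value are assumptions.

```python
# Margin loss on anatomy (shape) representations across subjects and modalities.
import torch
import torch.nn.functional as F

def representation_margin_loss(anchor, same_subject, other_subject, margin=1.0):
    # Each input: (B, D) anatomy representations from the shared encoder.
    d_pos = F.pairwise_distance(anchor, same_subject)   # same subject, other modality
    d_neg = F.pairwise_distance(anchor, other_subject)  # different subject
    return F.relu(d_pos - d_neg + margin).mean()
```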
- Hi-Net: Hybrid-fusion Network for Multi-modal MR Image Synthesis [143.55901940771568] (2020-02-11)
We propose a novel Hybrid-fusion Network (Hi-Net) for multi-modal MR image synthesis.
In our Hi-Net, a modality-specific network is utilized to learn representations for each individual modality.
A multi-modal synthesis network is designed to densely combine the latent representation with hierarchical features from each modality.
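
Structurally, the Hi-Net recipe as summarized above amounts to one small network per input modality followed by a fusion network over their concatenated latent features. The sketch below uses illustrative layer sizes and omits Hi-Net's hierarchical feature combination and adversarial training.

```python
# Modality-specific branches plus a fusion network that synthesizes the target.
import torch
import torch.nn as nn

class HiNetSketch(nn.Module):
    def __init__(self, in_modalities=2, ch=32):
        super().__init__()
        self.branches = nn.ModuleList(
            [nn.Sequential(nn.Conv2d(1, ch, 3, padding=1), nn.ReLU()) for _ in range(in_modalities)]
        )
        self.fusion = nn.Sequential(
            nn.Conv2d(ch * in_modalities, ch, 3, padding=1), nn.ReLU(),
            nn.Conv2d(ch, 1, 3, padding=1),  # synthesized target modality
        )

    def forward(self, inputs):
        # inputs: list of (B, 1, H, W) source-modality images, e.g., T1 and T2.
        feats = [branch(x) for branch, x in zip(self.branches, inputs)]
        return self.fusion(torch.cat(feats, dim=1))  # channel-wise fusion
```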