ResViT: Residual vision transformers for multi-modal medical image
synthesis
- URL: http://arxiv.org/abs/2106.16031v1
- Date: Wed, 30 Jun 2021 12:57:37 GMT
- Authors: Onat Dalmaz, Mahmut Yurt, Tolga Çukur
- Abstract summary: We propose a novel generative adversarial approach for medical image synthesis, ResViT, to combine local precision of convolution operators with contextual sensitivity of vision transformers.
Our results indicate the superiority of ResViT against competing methods in terms of qualitative observations and quantitative metrics.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Multi-modal imaging is a key healthcare technology in the diagnosis and
management of disease, but it is often underutilized due to costs associated
with multiple separate scans. This limitation yields the need for synthesis of
unacquired modalities from the subset of available modalities. In recent years,
generative adversarial network (GAN) models with superior depiction of
structural details have been established as state-of-the-art in numerous
medical image synthesis tasks. However, GANs are characteristically based on
convolutional neural network (CNN) backbones that perform local processing with
compact filters. This inductive bias, in turn, compromises learning of
long-range spatial dependencies. While attention maps incorporated in GANs can
multiplicatively modulate CNN features to emphasize critical image regions,
their capture of global context is mostly implicit. Here, we propose a novel
generative adversarial approach for medical image synthesis, ResViT, to combine
local precision of convolution operators with contextual sensitivity of vision
transformers. Based on an encoder-decoder architecture, ResViT employs a
central bottleneck comprising novel aggregated residual transformer (ART)
blocks that synergistically combine convolutional and transformer modules.
Comprehensive demonstrations are performed for synthesizing missing sequences
in multi-contrast MRI and CT images from MRI. Our results indicate the
superiority of ResViT against competing methods in terms of qualitative
observations and quantitative metrics.
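The abstract describes ART blocks as fusing a convolutional branch (local precision) with a transformer branch (global context) inside a residual bottleneck. The following is a minimal NumPy sketch of that general idea, not the authors' implementation: the branch structure, weight shapes, and residual-sum fusion are all illustrative assumptions.

```python
import numpy as np

def conv3x3(x, w):
    """Naive 'same'-padded 3x3 convolution; x: (H, W, C), w: (3, 3, C, C)."""
    H, W, C = x.shape
    xp = np.pad(x, ((1, 1), (1, 1), (0, 0)))
    out = np.zeros_like(x)
    for i in range(H):
        for j in range(W):
            # Contract the 3x3xC patch against the kernel to get C outputs.
            out[i, j] = np.tensordot(xp[i:i+3, j:j+3, :], w,
                                     axes=([0, 1, 2], [0, 1, 2]))
    return out

def self_attention(x, wq, wk, wv):
    """Single-head self-attention over spatial tokens; x: (H, W, C)."""
    H, W, C = x.shape
    t = x.reshape(H * W, C)                  # flatten pixels into tokens
    q, k, v = t @ wq, t @ wk, t @ wv
    a = q @ k.T / np.sqrt(C)                 # scaled dot-product scores
    a = np.exp(a - a.max(axis=1, keepdims=True))
    a = a / a.sum(axis=1, keepdims=True)     # row-wise softmax
    return (a @ v).reshape(H, W, C)

def art_block(x, conv_w, wq, wk, wv):
    """Residual fusion of local (conv) and contextual (attention) features."""
    return x + conv3x3(x, conv_w) + self_attention(x, wq, wk, wv)

rng = np.random.default_rng(0)
H, W, C = 8, 8, 4
x = rng.standard_normal((H, W, C))
y = art_block(x, 0.1 * rng.standard_normal((3, 3, C, C)),
              *(0.1 * rng.standard_normal((C, C)) for _ in range(3)))
print(y.shape)
```

The point of the residual sum is that either branch can be learned as a small correction to the identity map, so the convolutional and attention pathways complement rather than replace each other.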
Related papers
- A Unified Model for Compressed Sensing MRI Across Undersampling Patterns [69.19631302047569]
Deep neural networks have shown great potential for reconstructing high-fidelity images from undersampled measurements.
Our model is based on neural operators, a discretization-agnostic architecture.
Our inference speed is also 1,400x faster than diffusion methods.
arXiv Detail & Related papers (2024-10-05T20:03:57Z) - I2I-Mamba: Multi-modal medical image synthesis via selective state space modeling [8.48392350084504]
We propose a novel adversarial model for medical image synthesis, I2I-Mamba, to efficiently capture long-range context.
I2I-Mamba offers superior performance against state-of-the-art CNN- and transformer-based methods in synthesizing target-modality images.
arXiv Detail & Related papers (2024-05-22T21:55:58Z) - Disentangled Multimodal Brain MR Image Translation via Transformer-based
Modality Infuser [12.402947207350394]
We propose a transformer-based modality infuser designed to synthesize multimodal brain MR images.
In our method, we extract modality-agnostic features from the encoder and then transform them into modality-specific features.
We carried out experiments on the BraTS 2018 dataset, translating between four MR modalities.
arXiv Detail & Related papers (2024-02-01T06:34:35Z) - CoNeS: Conditional neural fields with shift modulation for multi-sequence MRI translation [5.662694302758443]
Multi-sequence magnetic resonance imaging (MRI) has found wide applications in both modern clinical studies and deep learning research.
It frequently occurs that one or more of the MRI sequences are missing due to different image acquisition protocols or contrast agent contraindications of patients.
One promising approach is to leverage generative models to synthesize the missing sequences, which can serve as a surrogate acquisition.
arXiv Detail & Related papers (2023-09-06T19:01:58Z) - On Sensitivity and Robustness of Normalization Schemes to Input
Distribution Shifts in Automatic MR Image Diagnosis [58.634791552376235]
Deep Learning (DL) models have achieved state-of-the-art performance in diagnosing multiple diseases using reconstructed images as input.
DL models are sensitive to varying artifacts, as these lead to changes in the input data distribution between the training and testing phases.
We propose to use other normalization techniques, such as Group Normalization and Layer Normalization, to inject robustness into model performance against varying image artifacts.
arXiv Detail & Related papers (2023-06-23T03:09:03Z) - Generative Adversarial Networks for Brain Images Synthesis: A Review [2.609784101826762]
In medical imaging, image synthesis is the process of estimating one image (sequence, modality) from another image (sequence, modality), with the generative adversarial network (GAN) being one of the most popular generative deep learning methods.
We summarized the recent developments of GANs for cross-modality brain image synthesis including CT to PET, CT to MRI, MRI to PET, and vice versa.
arXiv Detail & Related papers (2023-05-16T17:28:06Z) - Model-Guided Multi-Contrast Deep Unfolding Network for MRI
Super-resolution Reconstruction [68.80715727288514]
We show how to unfold an iterative MGDUN algorithm into a novel model-guided deep unfolding network by taking the MRI observation matrix.
In this paper, we propose a novel Model-Guided interpretable Deep Unfolding Network (MGDUN) for medical image SR reconstruction.
arXiv Detail & Related papers (2022-09-15T03:58:30Z) - One Model to Synthesize Them All: Multi-contrast Multi-scale Transformer
for Missing Data Imputation [3.9207133968068684]
We formulate missing data imputation as a sequence-to-sequence learning problem.
We propose a multi-contrast multi-scale Transformer (MMT) which can take any subset of input contrasts and synthesize those that are missing.
MMT is inherently interpretable as it allows us to understand the importance of each input contrast in different regions.
arXiv Detail & Related papers (2022-04-28T18:49:27Z) - Transformer-empowered Multi-scale Contextual Matching and Aggregation
for Multi-contrast MRI Super-resolution [55.52779466954026]
Multi-contrast super-resolution (SR) reconstruction is promising to yield SR images with higher quality.
Existing methods lack effective mechanisms to match and fuse these features for better reconstruction.
We propose a novel network to address these problems by developing a set of innovative Transformer-empowered multi-scale contextual matching and aggregation techniques.
arXiv Detail & Related papers (2022-03-26T01:42:59Z) - TransUNet: Transformers Make Strong Encoders for Medical Image
Segmentation [78.01570371790669]
Medical image segmentation is an essential prerequisite for developing healthcare systems.
On various medical image segmentation tasks, the u-shaped architecture, also known as U-Net, has become the de-facto standard.
We propose TransUNet, which merits both Transformers and U-Net, as a strong alternative for medical image segmentation.
arXiv Detail & Related papers (2021-02-08T16:10:50Z) - Hi-Net: Hybrid-fusion Network for Multi-modal MR Image Synthesis [143.55901940771568]
We propose a novel Hybrid-fusion Network (Hi-Net) for multi-modal MR image synthesis.
In our Hi-Net, a modality-specific network is utilized to learn representations for each individual modality.
A multi-modal synthesis network is designed to densely combine the latent representation with hierarchical features from each modality.
arXiv Detail & Related papers (2020-02-11T08:26:42Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences of its use.