ResViT: Residual vision transformers for multi-modal medical image
synthesis
- URL: http://arxiv.org/abs/2106.16031v1
- Date: Wed, 30 Jun 2021 12:57:37 GMT
- Authors: Onat Dalmaz, Mahmut Yurt, Tolga Çukur
- Abstract summary: We propose a novel generative adversarial approach for medical image synthesis, ResViT, to combine local precision of convolution operators with contextual sensitivity of vision transformers.
Our results indicate the superiority of ResViT against competing methods in terms of qualitative observations and quantitative metrics.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Multi-modal imaging is a key healthcare technology in the diagnosis and
management of disease, but it is often underutilized due to costs associated
with multiple separate scans. This limitation yields the need for synthesis of
unacquired modalities from the subset of available modalities. In recent years,
generative adversarial network (GAN) models with superior depiction of
structural details have been established as state-of-the-art in numerous
medical image synthesis tasks. However, GANs are characteristically based on
convolutional neural network (CNN) backbones that perform local processing with
compact filters. This inductive bias, in turn, compromises learning of
long-range spatial dependencies. While attention maps incorporated in GANs can
multiplicatively modulate CNN features to emphasize critical image regions,
their capture of global context is mostly implicit. Here, we propose a novel
generative adversarial approach for medical image synthesis, ResViT, to combine
local precision of convolution operators with contextual sensitivity of vision
transformers. Based on an encoder-decoder architecture, ResViT employs a
central bottleneck comprising novel aggregated residual transformer (ART)
blocks that synergistically combine convolutional and transformer modules.
Comprehensive demonstrations are performed for synthesizing missing sequences
in multi-contrast MRI and CT images from MRI. Our results indicate the
superiority of ResViT against competing methods in terms of qualitative
observations and quantitative metrics.
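The abstract describes ART blocks as fusing a convolutional branch (local precision) with a transformer branch (global context) inside a residual bottleneck. The following is a minimal NumPy sketch of that general idea, not the authors' implementation: the branch structure, weight shapes, and residual-sum fusion are all illustrative assumptions.

```python
import numpy as np

def conv3x3(x, w):
    """Naive 'same'-padded 3x3 convolution; x: (H, W, C), w: (3, 3, C, C)."""
    H, W, C = x.shape
    xp = np.pad(x, ((1, 1), (1, 1), (0, 0)))
    out = np.zeros_like(x)
    for i in range(H):
        for j in range(W):
            # Contract the 3x3xC patch against the kernel to get C outputs.
            out[i, j] = np.tensordot(xp[i:i+3, j:j+3, :], w,
                                     axes=([0, 1, 2], [0, 1, 2]))
    return out

def self_attention(x, wq, wk, wv):
    """Single-head self-attention over spatial tokens; x: (H, W, C)."""
    H, W, C = x.shape
    t = x.reshape(H * W, C)                  # flatten pixels into tokens
    q, k, v = t @ wq, t @ wk, t @ wv
    a = q @ k.T / np.sqrt(C)                 # scaled dot-product scores
    a = np.exp(a - a.max(axis=1, keepdims=True))
    a = a / a.sum(axis=1, keepdims=True)     # row-wise softmax
    return (a @ v).reshape(H, W, C)

def art_block(x, conv_w, wq, wk, wv):
    """Residual fusion of local (conv) and contextual (attention) features."""
    return x + conv3x3(x, conv_w) + self_attention(x, wq, wk, wv)

rng = np.random.default_rng(0)
H, W, C = 8, 8, 4
x = rng.standard_normal((H, W, C))
y = art_block(x, 0.1 * rng.standard_normal((3, 3, C, C)),
              *(0.1 * rng.standard_normal((C, C)) for _ in range(3)))
print(y.shape)
```

The point of the residual sum is that either branch can be learned as a small correction to the identity map, so the convolutional and attention pathways complement rather than replace each other.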
Related papers
- A Unified Model for Compressed Sensing MRI Across Undersampling Patterns [69.19631302047569]
Deep neural networks have shown great potential for reconstructing high-fidelity images from undersampled measurements.
Our model is based on neural operators, a discretization-agnostic architecture.
Our inference speed is also 1,400x faster than diffusion methods.
arXiv Detail & Related papers (2024-10-05T20:03:57Z) - I2I-Mamba: Multi-modal medical image synthesis via selective state space modeling [8.48392350084504]
We propose a novel adversarial model for medical image synthesis, I2I-Mamba, to efficiently capture long-range context.
I2I-Mamba offers superior performance against state-of-the-art CNN- and transformer-based methods in synthesizing target-modality images.
arXiv Detail & Related papers (2024-05-22T21:55:58Z) - Disentangled Multimodal Brain MR Image Translation via Transformer-based
Modality Infuser [12.402947207350394]
We propose a transformer-based modality infuser designed to synthesize multimodal brain MR images.
In our method, we extract modality-agnostic features from the encoder and then transform them into modality-specific features.
We carried out experiments on the BraTS 2018 dataset, translating between four MR modalities.
arXiv Detail & Related papers (2024-02-01T06:34:35Z) - CoNeS: Conditional neural fields with shift modulation for multi-sequence MRI translation [5.662694302758443]
Multi-sequence magnetic resonance imaging (MRI) has found wide applications in both modern clinical studies and deep learning research.
It frequently occurs that one or more of the MRI sequences are missing due to different image acquisition protocols or contrast agent contraindications of patients.
One promising approach is to leverage generative models to synthesize the missing sequences, which can serve as a surrogate acquisition.
arXiv Detail & Related papers (2023-09-06T19:01:58Z) - On Sensitivity and Robustness of Normalization Schemes to Input
Distribution Shifts in Automatic MR Image Diagnosis [58.634791552376235]
Deep Learning (DL) models have achieved state-of-the-art performance in diagnosing multiple diseases using reconstructed images as input.
DL models are sensitive to varying artifacts, as these lead to changes in the input data distribution between the training and testing phases.
We propose to use other normalization techniques, such as Group Normalization and Layer Normalization, to inject robustness into model performance against varying image artifacts.
arXiv Detail & Related papers (2023-06-23T03:09:03Z) - Generative Adversarial Networks for Brain Images Synthesis: A Review [2.609784101826762]
In medical imaging, image synthesis is the process of estimating one image (sequence, modality) from another image (sequence, modality), with the generative adversarial network (GAN) being one of the most popular generative deep learning methods.
We summarized the recent developments of GANs for cross-modality brain image synthesis including CT to PET, CT to MRI, MRI to PET, and vice versa.
arXiv Detail & Related papers (2023-05-16T17:28:06Z) - Model-Guided Multi-Contrast Deep Unfolding Network for MRI
Super-resolution Reconstruction [68.80715727288514]
We show how to unfold an iterative MGDUN algorithm into a novel model-guided deep unfolding network by taking the MRI observation matrix.
In this paper, we propose a novel Model-Guided interpretable Deep Unfolding Network (MGDUN) for medical image SR reconstruction.
arXiv Detail & Related papers (2022-09-15T03:58:30Z) - One Model to Synthesize Them All: Multi-contrast Multi-scale Transformer
for Missing Data Imputation [3.9207133968068684]
We formulate missing data imputation as a sequence-to-sequence learning problem.
We propose a multi-contrast multi-scale Transformer (MMT) which can take any subset of input contrasts and synthesize those that are missing.
MMT is inherently interpretable as it allows us to understand the importance of each input contrast in different regions.
arXiv Detail & Related papers (2022-04-28T18:49:27Z) - Transformer-empowered Multi-scale Contextual Matching and Aggregation
for Multi-contrast MRI Super-resolution [55.52779466954026]
Multi-contrast super-resolution (SR) reconstruction is promising to yield SR images with higher quality.
Existing methods lack effective mechanisms to match and fuse these features for better reconstruction.
We propose a novel network to address these problems by developing a set of innovative Transformer-empowered multi-scale contextual matching and aggregation techniques.
arXiv Detail & Related papers (2022-03-26T01:42:59Z) - TransUNet: Transformers Make Strong Encoders for Medical Image
Segmentation [78.01570371790669]
Medical image segmentation is an essential prerequisite for developing healthcare systems.
On various medical image segmentation tasks, the u-shaped architecture, also known as U-Net, has become the de-facto standard.
We propose TransUNet, which merits both Transformers and U-Net, as a strong alternative for medical image segmentation.
arXiv Detail & Related papers (2021-02-08T16:10:50Z) - Hi-Net: Hybrid-fusion Network for Multi-modal MR Image Synthesis [143.55901940771568]
We propose a novel Hybrid-fusion Network (Hi-Net) for multi-modal MR image synthesis.
In our Hi-Net, a modality-specific network is utilized to learn representations for each individual modality.
A multi-modal synthesis network is designed to densely combine the latent representation with hierarchical features from each modality.
arXiv Detail & Related papers (2020-02-11T08:26:42Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences of its use.