Accelerated Multi-Modal MR Imaging with Transformers
- URL: http://arxiv.org/abs/2106.14248v2
- Date: Tue, 29 Jun 2021 13:37:15 GMT
- Title: Accelerated Multi-Modal MR Imaging with Transformers
- Authors: Chun-Mei Feng, Yunlu Yan, Geng Chen, Huazhu Fu, Yong Xu and Ling Shao
- Abstract summary: We propose a multi-modal transformer (MTrans) for accelerated MR imaging.
By restructuring the transformer architecture, our MTrans gains a powerful ability to capture deep multi-modal information.
Our framework provides two appealing benefits: (i) MTrans is the first attempt at using improved transformers for multi-modal MR imaging, affording more global information compared with CNN-based methods.
- Score: 92.18406564785329
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Accelerating multi-modal magnetic resonance (MR) imaging is a new and
effective solution for fast MR imaging, providing superior performance in
restoring the target modality from its undersampled counterpart with guidance
from an auxiliary modality. However, existing works simply introduce the
auxiliary modality as prior information, lacking in-depth investigations on the
potential mechanisms for fusing two modalities. Further, they usually rely on
the convolutional neural networks (CNNs), which focus on local information and
prevent them from fully capturing the long-distance dependencies of global
knowledge. To this end, we propose a multi-modal transformer (MTrans), which is
capable of transferring multi-scale features from the target modality to the
auxiliary modality, for accelerated MR imaging. By restructuring the
transformer architecture, our MTrans gains a powerful ability to capture deep
multi-modal information. More specifically, the target modality and the
auxiliary modality are first split into two branches and then fused using a
multi-modal transformer module. This module is based on an improved multi-head
attention mechanism, named the cross attention module, which absorbs features
from the auxiliary modality that contribute to the target modality. Our
framework provides two appealing benefits: (i) MTrans is the first attempt at
using improved transformers for multi-modal MR imaging, affording more global
information compared with CNN-based methods. (ii) A new cross attention module
is proposed to exploit the useful information in each branch at different
scales. It affords both distinct structural information and subtle pixel-level
information, which supplement the target modality effectively.
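The fusion idea above (queries from the target-modality branch, keys/values from the auxiliary branch, so the target absorbs useful auxiliary features) can be illustrated with a generic single-head cross-attention sketch. This is not MTrans's actual cross attention module (which uses learned projections and operates at multiple scales); the NumPy version below uses identity projections, and the names `cross_attention` and `d_k` are ours.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(target, auxiliary):
    """Single-head cross attention: queries come from the target branch,
    keys and values from the auxiliary branch, so each target token
    aggregates auxiliary features weighted by similarity."""
    d_k = target.shape[-1]
    # In a real network Q, K, V are learned linear projections; here they
    # are identity maps for illustration only.
    Q, K, V = target, auxiliary, auxiliary
    scores = Q @ K.T / np.sqrt(d_k)      # (n_target, n_aux) affinities
    weights = softmax(scores, axis=-1)   # each row sums to 1 over aux tokens
    return weights @ V                   # auxiliary features per target token

# Toy example: 4 target tokens and 6 auxiliary tokens, 8-dim features.
rng = np.random.default_rng(0)
t = rng.standard_normal((4, 8))
a = rng.standard_normal((6, 8))
fused = cross_attention(t, a)
print(fused.shape)  # (4, 8): one fused feature vector per target token
```

The output keeps the target branch's token count while its content is a convex combination of auxiliary features, which is why this pattern suits guiding an undersampled target modality with a fully sampled auxiliary one.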
Related papers
- Accelerated Multi-Contrast MRI Reconstruction via Frequency and Spatial Mutual Learning [50.74383395813782]
We propose a novel Frequency and Spatial Mutual Learning Network (FSMNet) to explore global dependencies across different modalities.
The proposed FSMNet achieves state-of-the-art performance for the Multi-Contrast MR Reconstruction task with different acceleration factors.
arXiv Detail & Related papers (2024-09-21T12:02:47Z)
- MMR-Mamba: Multi-Modal MRI Reconstruction with Mamba and Spatial-Frequency Information Fusion [17.084083262801737]
We propose MMR-Mamba, a novel framework that thoroughly and efficiently integrates multi-modal features for MRI reconstruction.
Specifically, we first design a Target modality-guided Cross Mamba (TCM) module in the spatial domain.
Then, we introduce a Selective Frequency Fusion (SFF) module to efficiently integrate global information in the Fourier domain.
arXiv Detail & Related papers (2024-06-27T07:30:54Z)
- Multimodal Information Interaction for Medical Image Segmentation [24.024848382458767]
We introduce an innovative Multimodal Information Cross Transformer (MicFormer)
It queries features from one modality and retrieves corresponding responses from another, facilitating effective communication between bimodal features.
Compared to other multimodal segmentation techniques, our method achieves improvements of 2.83 and 4.23, respectively.
arXiv Detail & Related papers (2024-04-25T07:21:14Z)
- Unified Frequency-Assisted Transformer Framework for Detecting and Grounding Multi-Modal Manipulation [109.1912721224697]
We present the Unified Frequency-Assisted transFormer framework, named UFAFormer, to address the DGM4 problem.
By leveraging the discrete wavelet transform, we decompose images into several frequency sub-bands, capturing rich face forgery artifacts.
Our proposed frequency encoder, incorporating intra-band and inter-band self-attentions, explicitly aggregates forgery features within and across diverse sub-bands.
arXiv Detail & Related papers (2023-09-18T11:06:42Z)
- GA-HQS: MRI reconstruction via a generically accelerated unfolding approach [14.988694941405575]
We propose a Generically Accelerated Half-Quadratic Splitting (GA-HQS) algorithm that incorporates second-order gradient information and pyramid attention modules for the delicate fusion of inputs at the pixel level.
Our method surpasses previous ones on single-coil MRI acceleration tasks.
arXiv Detail & Related papers (2023-04-06T06:21:18Z)
- RGBT Tracking via Progressive Fusion Transformer with Dynamically Guided Learning [37.067605349559]
We propose a novel Progressive Fusion Transformer called ProFormer.
It integrates single-modality information into the multimodal representation for robust RGBT tracking.
ProFormer sets a new state-of-the-art performance on RGBT210, RGBT234, LasHeR, and VTUAV datasets.
arXiv Detail & Related papers (2023-03-26T16:55:58Z)
- SIM-Trans: Structure Information Modeling Transformer for Fine-grained Visual Categorization [59.732036564862796]
We propose the Structure Information Modeling Transformer (SIM-Trans) to incorporate object structure information into transformer for enhancing discriminative representation learning.
The proposed two modules are light-weighted and can be plugged into any transformer network and trained end-to-end easily.
Experiments and analyses demonstrate that the proposed SIM-Trans achieves state-of-the-art performance on fine-grained visual categorization benchmarks.
arXiv Detail & Related papers (2022-08-31T03:00:07Z)
- Cross-Modality High-Frequency Transformer for MR Image Super-Resolution [100.50972513285598]
We make an early effort to build a Transformer-based MR image super-resolution framework.
We consider two-fold domain priors including the high-frequency structure prior and the inter-modality context prior.
We establish a novel Transformer architecture, called Cross-modality high-frequency Transformer (Cohf-T), to introduce such priors into super-resolving the low-resolution images.
arXiv Detail & Related papers (2022-03-29T07:56:55Z)
- Multi-modal land cover mapping of remote sensing images using pyramid attention and gated fusion networks [20.66034058363032]
We propose a new multi-modality network for land cover mapping of multi-modal remote sensing data, based on a novel pyramid attention fusion (PAF) module and a gated fusion unit (GFU).
The PAF module is designed to efficiently obtain rich fine-grained contextual representations from each modality with a built-in cross-level and cross-view attention fusion mechanism.
The GFU module utilizes a novel gating mechanism for early merging of features, thereby diminishing hidden redundancies and noise.
arXiv Detail & Related papers (2021-11-06T10:01:01Z)
- Multi-modal Aggregation Network for Fast MR Imaging [85.25000133194762]
We propose a novel Multi-modal Aggregation Network, named MANet, which is capable of discovering complementary representations from a fully sampled auxiliary modality.
In our MANet, the representations from the fully sampled auxiliary and undersampled target modalities are learned independently through a specific network.
Our MANet follows a hybrid domain learning framework, which allows it to recover the frequency signal directly in the $k$-space domain.
arXiv Detail & Related papers (2021-10-15T13:16:59Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this list (including all information) and is not responsible for any consequences of its use.