NestedFormer: Nested Modality-Aware Transformer for Brain Tumor Segmentation
- URL: http://arxiv.org/abs/2208.14876v1
- Date: Wed, 31 Aug 2022 14:04:25 GMT
- Title: NestedFormer: Nested Modality-Aware Transformer for Brain Tumor Segmentation
- Authors: Zhaohu Xing and Lequan Yu and Liang Wan and Tong Han and Lei Zhu
- Abstract summary: We propose a novel Nested Modality-Aware Transformer (NestedFormer) to explore the intra-modality and inter-modality relationships of multi-modal MRIs for brain tumor segmentation.
Built on the transformer-based multi-encoder and single-decoder structure, we perform nested multi-modal fusion for high-level representations of different modalities.
- Score: 29.157465321864265
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Multi-modal MR imaging is routinely used in clinical practice to diagnose and
investigate brain tumors by providing rich complementary information. Previous
multi-modal MRI segmentation methods usually perform modality fusion by
concatenating the multi-modal MRIs at an early or middle stage of the network,
which can hardly model the non-linear dependencies between modalities. In this
work, we
propose a novel Nested Modality-Aware Transformer (NestedFormer) to explicitly
explore the intra-modality and inter-modality relationships of multi-modal MRIs
for brain tumor segmentation. Built on the transformer-based multi-encoder and
single-decoder structure, we perform nested multi-modal fusion for high-level
representations of different modalities and apply modality-sensitive gating
(MSG) at lower scales for more effective skip connections. Specifically, the
multi-modal fusion is conducted in our proposed Nested Modality-aware Feature
Aggregation (NMaFA) module, which enhances long-term dependencies within
individual modalities via a tri-orientated spatial-attention transformer, and
further complements key contextual information among modalities via a
cross-modality attention transformer. Extensive experiments on BraTS2020
benchmark and a private meningiomas segmentation (MeniSeg) dataset show that
NestedFormer clearly outperforms state-of-the-art methods. The code is
available at https://github.com/920232796/NestedFormer.
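To make the two fusion mechanisms in the abstract concrete, here is a minimal PyTorch sketch of (a) a cross-modality attention step in the spirit of the NMaFA module, where each modality's high-level tokens attend to the pooled tokens of all modalities, and (b) a modality-sensitive gate that reweights low-level skip features per modality. The module names, the single-head attention, and the tensor shapes are illustrative assumptions, not the authors' implementation; see the linked repository for the official code.

```python
import torch
import torch.nn as nn

class CrossModalityAttention(nn.Module):
    """Each modality's tokens attend to the pooled tokens of all
    modalities (a single-head simplification of NMaFA's cross-modality
    attention transformer)."""
    def __init__(self, dim):
        super().__init__()
        self.q = nn.Linear(dim, dim)
        self.kv = nn.Linear(dim, 2 * dim)
        self.scale = dim ** -0.5

    def forward(self, tokens):                  # tokens: (B, M, N, C)
        B, M, N, C = tokens.shape
        flat = tokens.reshape(B, M * N, C)      # pool tokens across modalities
        q = self.q(flat)
        k, v = self.kv(flat).chunk(2, dim=-1)
        attn = torch.softmax(q @ k.transpose(-2, -1) * self.scale, dim=-1)
        return (attn @ v).reshape(B, M, N, C)   # fused, still shaped per modality

class ModalitySensitiveGate(nn.Module):
    """Per-modality gate over low-level skip features, so the decoder
    can emphasise the more informative modalities at each scale."""
    def __init__(self, num_modalities, channels):
        super().__init__()
        self.gate = nn.Sequential(
            nn.Conv3d(num_modalities * channels, num_modalities, kernel_size=1),
            nn.Sigmoid(),
        )

    def forward(self, skips):                   # skips: list of M tensors (B, C, D, H, W)
        g = self.gate(torch.cat(skips, dim=1))  # (B, M, D, H, W), one gate map each
        return sum(g[:, i:i + 1] * s for i, s in enumerate(skips))
```

In the paper, the intra-modality path additionally applies a tri-orientated spatial-attention transformer to strengthen long-term dependencies within each modality before a cross-modality step of the kind sketched here.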
Related papers
- Part-Whole Relational Fusion Towards Multi-Modal Scene Understanding [51.96911650437978]
Multi-modal fusion has played a vital role in multi-modal scene understanding.
Most existing methods focus on cross-modal fusion involving two modalities, often overlooking more complex multi-modal fusion.
We propose a Part-Whole Relational Fusion (PWRF) framework for multi-modal scene understanding.
arXiv Detail & Related papers (2024-10-19T02:27:30Z)
- StitchFusion: Weaving Any Visual Modalities to Enhance Multimodal Semantic Segmentation [63.31007867379312]
We propose StitchFusion, a framework that integrates large-scale pre-trained models directly as encoders and feature fusers.
We introduce a multi-directional adapter module (MultiAdapter) to enable cross-modal information transfer during encoding.
Our model achieves state-of-the-art performance on four multi-modal segmentation datasets with minimal additional parameters.
arXiv Detail & Related papers (2024-08-02T15:41:16Z)
- Multimodal Information Interaction for Medical Image Segmentation [24.024848382458767]
We introduce an innovative Multimodal Information Cross Transformer (MicFormer), which queries features from one modality and retrieves corresponding responses from another, facilitating effective communication between bimodal features (a minimal sketch of this query/retrieve pattern appears after this list).
Compared to other multimodal segmentation techniques, our method outperforms them by margins of 2.83 and 4.23, respectively.
arXiv Detail & Related papers (2024-04-25T07:21:14Z)
- A Multimodal Feature Distillation with CNN-Transformer Network for Brain Tumor Segmentation with Incomplete Modalities [15.841483814265592]
We propose a multimodal feature distillation framework with a CNN-Transformer hybrid network (MCTSeg) for accurate brain tumor segmentation with missing modalities.
Our ablation study demonstrates the importance of the proposed CNN-Transformer modules, and of the convolutional blocks within the Transformer, for improving brain tumor segmentation with missing modalities.
arXiv Detail & Related papers (2024-04-22T09:33:44Z)
- Modality-Aware and Shift Mixer for Multi-modal Brain Tumor Segmentation [12.094890186803958]
We present a novel Modality-Aware and Shift Mixer that integrates intra-modality and inter-modality dependencies of multi-modal images for effective and robust brain tumor segmentation.
Specifically, a Modality-Aware module, informed by neuroimaging studies, models modality-pair relationships at low levels, and a Modality-Shift module with specific mosaic patterns explores complex cross-modality relationships at high levels via self-attention.
arXiv Detail & Related papers (2024-03-04T14:21:51Z)
- RGBT Tracking via Progressive Fusion Transformer with Dynamically Guided Learning [37.067605349559]
We propose a novel Progressive Fusion Transformer called ProFormer.
It integrates single-modality information into the multimodal representation for robust RGBT tracking.
ProFormer sets a new state-of-the-art performance on RGBT210, RGBT234, LasHeR, and VTUAV datasets.
arXiv Detail & Related papers (2023-03-26T16:55:58Z)
- A Novel Unified Conditional Score-based Generative Framework for Multi-modal Medical Image Completion [54.512440195060584]
We propose the Unified Multi-Modal Conditional Score-based Generative Model (UMM-CSGM) to take advantage of the score-based generative model (SGM).
UMM-CSGM employs a novel multi-in multi-out Conditional Score Network (mm-CSN) to learn a comprehensive set of cross-modal conditional distributions.
Experiments on the BraTS19 dataset show that UMM-CSGM can more reliably synthesize the heterogeneous enhancement and irregular areas of tumor-induced lesions.
arXiv Detail & Related papers (2022-07-07T16:57:21Z)
- Multi-scale Cooperative Multimodal Transformers for Multimodal Sentiment Analysis in Videos [58.93586436289648]
We propose a multi-scale cooperative multimodal transformer (MCMulT) architecture for multimodal sentiment analysis.
Our model outperforms existing approaches on unaligned multimodal sequences and has strong performance on aligned multimodal sequences.
arXiv Detail & Related papers (2022-06-16T07:47:57Z)
- mmFormer: Multimodal Medical Transformer for Incomplete Multimodal Learning of Brain Tumor Segmentation [38.22852533584288]
We propose a novel Medical Transformer (mmFormer) for incomplete multimodal learning with three main components.
The proposed mmFormer outperforms the state-of-the-art methods for incomplete multimodal brain tumor segmentation on almost all subsets of incomplete modalities.
arXiv Detail & Related papers (2022-06-06T08:41:56Z)
- Robust Multimodal Brain Tumor Segmentation via Feature Disentanglement and Gated Fusion [71.87627318863612]
We propose a novel multimodal segmentation framework which is robust to the absence of imaging modalities.
Our network uses feature disentanglement to decompose each input modality into a modality-specific appearance code and a modality-invariant content code.
We validate our method on the important yet challenging multimodal brain tumor segmentation task with the BRATS challenge dataset.
arXiv Detail & Related papers (2020-02-22T14:32:04Z)
- Hi-Net: Hybrid-fusion Network for Multi-modal MR Image Synthesis [143.55901940771568]
We propose a novel Hybrid-fusion Network (Hi-Net) for multi-modal MR image synthesis.
In our Hi-Net, a modality-specific network is utilized to learn representations for each individual modality.
A multi-modal synthesis network is designed to densely combine the latent representation with hierarchical features from each modality.
arXiv Detail & Related papers (2020-02-11T08:26:42Z)
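For the MicFormer entry above, the query/retrieve description maps naturally onto a pair of cross-attention passes: tokens from one modality act as queries against the keys and values of the other, and vice versa, after which the two exchanged streams are fused. Below is a minimal PyTorch sketch under that reading of the summary; the module name, the residual exchange, and the fusion by concatenation are illustrative assumptions rather than the authors' implementation.

```python
import torch
import torch.nn as nn

class BimodalCrossAttention(nn.Module):
    """Each modality queries the other and retrieves the corresponding
    responses, mirroring the query/retrieve pattern in the MicFormer
    summary above (an assumed simplification, not the paper's exact design)."""
    def __init__(self, dim, heads=4):
        super().__init__()
        self.a_from_b = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.b_from_a = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.fuse = nn.Linear(2 * dim, dim)

    def forward(self, feat_a, feat_b):   # two (B, N, C) token sequences
        a2, _ = self.a_from_b(query=feat_a, key=feat_b, value=feat_b)
        b2, _ = self.b_from_a(query=feat_b, key=feat_a, value=feat_a)
        a, b = feat_a + a2, feat_b + b2   # residual exchange per modality
        return self.fuse(torch.cat([a, b], dim=-1))

# Usage with two hypothetical token streams (e.g. two MR sequences):
x_a = torch.randn(2, 512, 96)
x_b = torch.randn(2, 512, 96)
fused = BimodalCrossAttention(dim=96)(x_a, x_b)  # -> (2, 512, 96)
```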
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.