A Tri-attention Fusion Guided Multi-modal Segmentation Network
- URL: http://arxiv.org/abs/2111.01623v1
- Date: Tue, 2 Nov 2021 14:36:53 GMT
- Title: A Tri-attention Fusion Guided Multi-modal Segmentation Network
- Authors: Tongxue Zhou, Su Ruan, Pierre Vera and Stéphane Canu
- Abstract summary: We propose a multi-modality segmentation network guided by a novel tri-attention fusion.
Our network includes N model-independent encoding paths with N image sources, a tri-attention fusion block, a dual-attention fusion block, and a decoding path.
Experimental results on the BraTS 2018 dataset for brain tumor segmentation demonstrate the effectiveness of our proposed method.
- Score: 2.867517731896504
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In the field of multimodal segmentation, the correlation between different
modalities can be considered for improving the segmentation results.
Considering the correlation between different MR modalities, in this paper, we
propose a multi-modality segmentation network guided by a novel tri-attention
fusion. Our network includes N model-independent encoding paths with N image
sources, a tri-attention fusion block, a dual-attention fusion block, and a
decoding path. The model-independent encoding paths can capture
modality-specific features from the N modalities. Considering that not all the
features extracted from the encoders are useful for segmentation, we propose to
use dual-attention-based fusion to re-weight the features along the modality
and spatial paths, which can suppress less informative features and emphasize the
useful ones for each modality at different positions. Since there exists a
strong correlation between different modalities, based on the dual attention
fusion block, we propose a correlation attention module to form the
tri-attention fusion block. In the correlation attention module, a correlation
description block is first used to learn the correlation between modalities and
then a constraint based on the correlation is used to guide the network to
learn the latent correlated features which are more relevant for segmentation.
Finally, the obtained fused feature representation is projected by the decoder
to obtain the segmentation results. Experimental results on the BraTS 2018
brain tumor segmentation dataset demonstrate the effectiveness of our proposed
method.
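The dual-attention re-weighting and the correlation constraint described in the abstract can be illustrated with a minimal NumPy sketch. This is an assumption-laden illustration, not the authors' implementation: the attention scores below are simple mean responses rather than learned layers, and the parameters `a` and `b` stand in for the learned linear correlation mapping between two modalities' latent features.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def dual_attention_fusion(feats):
    """Re-weight N per-modality feature maps along the modality and spatial
    paths, then sum them into one fused map.

    feats: (N, C, H, W) array -- one C-channel feature map per modality.
    """
    n, c, h, w = feats.shape
    # Modality path: one scalar weight per modality from its global response
    # (a learned attention layer in the real network; plain means here).
    modality_w = softmax(feats.mean(axis=(1, 2, 3)))               # (N,)
    # Spatial path: one weight per position from the channel-mean response.
    spatial_w = softmax(feats.mean(axis=1).reshape(n, -1)).reshape(n, h, w)
    weighted = feats * modality_w[:, None, None, None] * spatial_w[:, None, :, :]
    return weighted.sum(axis=0)                                    # (C, H, W)

def correlation_constraint(f_i, f_j, a, b):
    # One simple instantiation of the correlation constraint: penalize the
    # deviation of modality j's latent features from a linear transform of
    # modality i's features (a and b would be learned in the real model).
    return float(np.mean((a * f_i + b - f_j) ** 2))
```

With uniform inputs the two attention paths reduce to uniform averaging, and the constraint vanishes exactly when the two latent features are linearly related, which is the behavior the tri-attention fusion block is meant to encourage.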
Related papers
- A Semantic-Aware and Multi-Guided Network for Infrared-Visible Image Fusion [41.34335755315773]
Multi-modality image fusion aims at fusing modality-specific and modality-shared information from two source images.
We propose a three-branch encoder-decoder architecture along with corresponding fusion layers as the fusion strategy.
Our method has obtained competitive results compared with state-of-the-art methods in visible/infrared image fusion and medical image fusion tasks.
arXiv Detail & Related papers (2024-06-11T09:32:40Z)
- DiffVein: A Unified Diffusion Network for Finger Vein Segmentation and Authentication [50.017055360261665]
We introduce DiffVein, a unified diffusion model-based framework which simultaneously addresses vein segmentation and authentication tasks.
For better feature interaction between these two branches, we introduce two specialized modules.
In this way, our framework allows for a dynamic interplay between diffusion and segmentation embeddings.
arXiv Detail & Related papers (2024-02-03T06:49:42Z)
- Multi-Grained Multimodal Interaction Network for Entity Linking [65.30260033700338]
The multimodal entity linking (MEL) task aims at resolving ambiguous mentions to a multimodal knowledge graph.
We propose a novel Multi-GraIned Multimodal InteraCtion Network (MIMIC) framework for solving the MEL task.
arXiv Detail & Related papers (2023-07-19T02:11:19Z)
- Object Segmentation by Mining Cross-Modal Semantics [68.88086621181628]
We propose a novel approach by mining the Cross-Modal Semantics to guide the fusion and decoding of multimodal features.
Specifically, we propose a novel network, termed XMSNet, consisting of (1) all-round attentive fusion (AF), (2) coarse-to-fine decoder (CFD), and (3) cross-layer self-supervision.
arXiv Detail & Related papers (2023-05-17T14:30:11Z)
- Discriminative Co-Saliency and Background Mining Transformer for Co-Salient Object Detection [111.04994415248736]
We propose a Discriminative co-saliency and background Mining Transformer (DMT) framework.
We use two types of pre-defined tokens to mine co-saliency and background information via our proposed contrast-induced pixel-to-token correlation and co-saliency token-to-token correlation modules.
Experimental results on three benchmark datasets demonstrate the effectiveness of our proposed method.
arXiv Detail & Related papers (2023-04-30T15:56:47Z)
- FECANet: Boosting Few-Shot Semantic Segmentation with Feature-Enhanced Context-Aware Network [48.912196729711624]
Few-shot semantic segmentation is the task of learning to locate each pixel of a novel class in a query image with only a few annotated support images.
We propose a Feature-Enhanced Context-Aware Network (FECANet) to suppress the matching noise caused by inter-class local similarity.
In addition, we propose a novel correlation reconstruction module that encodes extra correspondence relations between foreground and background and multi-scale context semantic features.
arXiv Detail & Related papers (2023-01-19T16:31:13Z)
- Bi-Bimodal Modality Fusion for Correlation-Controlled Multimodal Sentiment Analysis [96.46952672172021]
Bi-Bimodal Fusion Network (BBFN) is a novel end-to-end network that performs fusion on pairwise modality representations.
The model takes two bimodal pairs as input due to the known information imbalance among modalities.
arXiv Detail & Related papers (2021-07-28T23:33:42Z)
- 3D Medical Multi-modal Segmentation Network Guided by Multi-source Correlation Constraint [2.867517731896504]
We propose a multi-modality segmentation network with a correlation constraint.
Experimental results on the BraTS-2018 dataset for brain tumor segmentation demonstrate the effectiveness of the proposed method.
arXiv Detail & Related papers (2021-02-05T11:23:12Z)
- Brain tumor segmentation with missing modalities via latent multi-source correlation representation [6.060020806741279]
A novel correlation representation block is proposed to specially discover the latent multi-source correlation.
Thanks to the obtained correlation representation, the segmentation becomes more robust in the case of missing modalities.
We evaluate our model on the BraTS 2018 dataset; it outperforms the current state-of-the-art method and produces robust results when one or more modalities are missing.
arXiv Detail & Related papers (2020-03-19T15:47:36Z)
- Bi-Directional Attention for Joint Instance and Semantic Segmentation in Point Clouds [9.434847591440485]
We build a Bi-Directional Attention module on backbone neural networks for 3D point cloud perception.
It uses a similarity matrix measured from the features for one task to help aggregate non-local information for the other task.
Comprehensive experiments and ablation studies on the S3DIS and PartNet datasets verify the superiority of our method.
arXiv Detail & Related papers (2020-03-11T17:16:07Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.