Multimodal Hyperspectral Image Classification via Interconnected Fusion
- URL: http://arxiv.org/abs/2304.00495v1
- Date: Sun, 2 Apr 2023 09:46:13 GMT
- Title: Multimodal Hyperspectral Image Classification via Interconnected Fusion
- Authors: Lu Huo, Jiahao Xia, Leijie Zhang, Haimin Zhang, Min Xu
- Abstract summary: An Interconnected Fusion (IF) framework is proposed to explore the relationships across HSI and LiDAR modalities comprehensively.
Experiments have been conducted on three widely used datasets: Trento, MUUFL, and Houston.
- Score: 12.41850641917384
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Existing multiple modality fusion methods, such as concatenation, summation,
and encoder-decoder-based fusion, have recently been employed to combine
modality characteristics of Hyperspectral Image (HSI) and Light Detection And
Ranging (LiDAR). However, these methods consider the relationship of HSI-LiDAR
signals from limited perspectives. More specifically, they overlook the
contextual information across modalities of HSI and LiDAR and the
intra-modality characteristics of LiDAR. In this paper, we provide a new
insight into feature fusion to explore the relationships across HSI and LiDAR
modalities comprehensively. An Interconnected Fusion (IF) framework is
proposed. Firstly, the center patch of the HSI input is extracted and
replicated to the size of the HSI input. Then, nine different perspectives in
the fusion matrix are generated by calculating self-attention and
cross-attention among the replicated center patch, HSI input, and corresponding
LiDAR input. In this way, the intra- and inter-modality characteristics can be
fully exploited, and contextual information is considered in both an
intra-modality and an inter-modality manner. These nine interrelated elements in
the fusion matrix complement each other and eliminate biases, generating an
accurate multi-modality representation for classification.
Extensive experiments have been conducted on three widely used datasets:
Trento, MUUFL, and Houston. The IF framework achieves state-of-the-art results
on these datasets compared to existing approaches.
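For concreteness, below is a minimal PyTorch sketch of the nine-perspective fusion described in the abstract: the HSI center pixel is replicated to the patch size, and self-/cross-attention is computed for every ordered pair drawn from {replicated center patch, HSI patch, LiDAR patch}. This is an illustration under assumptions rather than the authors' implementation; the shared embedding size, the single shared attention layer, the mean pooling, and all names (InterconnectedFusionSketch, d_model, etc.) are hypothetical.

```python
# Minimal sketch of the nine-way (3x3) attention fusion idea; not the authors' code.
# Band counts, patch size, pooling, and class count are placeholder assumptions.
import torch
import torch.nn as nn


class InterconnectedFusionSketch(nn.Module):
    def __init__(self, hsi_bands=144, lidar_bands=1, patch=11,
                 d_model=64, n_heads=4, n_classes=15):
        super().__init__()
        self.patch = patch
        # Per-modality 1x1 projections into a shared token dimension.
        self.hsi_proj = nn.Conv2d(hsi_bands, d_model, kernel_size=1)
        self.lidar_proj = nn.Conv2d(lidar_bands, d_model, kernel_size=1)
        # One attention layer reused for all nine (query, key/value) pairings.
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.classifier = nn.Linear(9 * d_model, n_classes)

    def _tokens(self, x, proj):
        # (B, C, P, P) -> (B, P*P, d_model)
        return proj(x).flatten(2).transpose(1, 2)

    def forward(self, hsi, lidar):
        # Replicate the center pixel of the HSI patch to the full patch size.
        c = hsi[:, :, self.patch // 2, self.patch // 2][:, :, None, None]
        center = c.expand(-1, -1, self.patch, self.patch)

        sources = [self._tokens(center, self.hsi_proj),   # replicated center patch
                   self._tokens(hsi, self.hsi_proj),      # HSI patch
                   self._tokens(lidar, self.lidar_proj)]  # LiDAR patch

        # Nine self-/cross-attention "perspectives" (the 3x3 fusion matrix).
        feats = []
        for q in sources:
            for kv in sources:
                out, _ = self.attn(q, kv, kv)
                feats.append(out.mean(dim=1))             # pool tokens -> (B, d_model)
        return self.classifier(torch.cat(feats, dim=1))


# Dummy usage: batch of 2 co-registered 11x11 patches (144-band HSI, 1-band LiDAR).
model = InterconnectedFusionSketch()
logits = model(torch.randn(2, 144, 11, 11), torch.randn(2, 1, 11, 11))
print(logits.shape)  # torch.Size([2, 15])
```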
Related papers
- How Intermodal Interaction Affects the Performance of Deep Multimodal Fusion for Mixed-Type Time Series [3.6958071416494414]
Mixed-type time series (MTTS) is a bimodal data type common in many domains, such as healthcare, finance, environmental monitoring, and social media.
The integration of both modalities through multimodal fusion is a promising approach for processing MTTS.
We present a comprehensive evaluation of several deep multimodal fusion approaches for MTTS forecasting.
arXiv Detail & Related papers (2024-06-21T12:26:48Z)
- Modality Prompts for Arbitrary Modality Salient Object Detection [57.610000247519196]
This paper delves into the task of arbitrary modality salient object detection (AM SOD).
It aims to detect salient objects from arbitrary modalities, e.g. RGB images, RGB-D images, and RGB-D-T images.
A novel modality-adaptive Transformer (MAT) is proposed to investigate two fundamental challenges of AM SOD.
arXiv Detail & Related papers (2024-05-06T11:02:02Z)
- Fusion-Mamba for Cross-modality Object Detection [63.56296480951342]
Fusing information from different modalities effectively improves object detection performance.
We design a Fusion-Mamba block (FMB) to map cross-modal features into a hidden state space for interaction.
Our proposed approach outperforms state-of-the-art methods in mAP by 5.9% on the M3FD dataset and 4.9% on the FLIR-Aligned dataset.
arXiv Detail & Related papers (2024-04-14T05:28:46Z)
- Joint Multimodal Transformer for Emotion Recognition in the Wild [49.735299182004404]
Multimodal emotion recognition (MMER) systems typically outperform unimodal systems.
This paper proposes an MMER method that relies on a joint multimodal transformer (JMT) for fusion with key-based cross-attention.
arXiv Detail & Related papers (2024-03-15T17:23:38Z)
- AlignMiF: Geometry-Aligned Multimodal Implicit Field for LiDAR-Camera Joint Synthesis [98.3959800235485]
Recently, several methods have explored multiple modalities within a single field, aiming to share implicit features from different modalities to enhance reconstruction performance.
In this work, we conduct comprehensive analyses on the multimodal implicit field of LiDAR-camera joint synthesis, revealing the underlying issue lies in the misalignment of different sensors.
We introduce AlignMiF, a geometrically aligned multimodal implicit field with two proposed modules: Geometry-Aware Alignment (GAA) and Shared Geometry Initialization (SGI).
arXiv Detail & Related papers (2024-02-27T13:08:47Z)
- Deep Equilibrium Multimodal Fusion [88.04713412107947]
Multimodal fusion integrates the complementary information present in multiple modalities and has gained much attention recently.
We propose a novel deep equilibrium (DEQ) method towards multimodal fusion via seeking a fixed point of the dynamic multimodal fusion process (a minimal fixed-point sketch of this idea appears after this list).
Experiments on BRCA, MM-IMDB, CMU-MOSI, SUN RGB-D, and VQA-v2 demonstrate the superiority of our DEQ fusion.
arXiv Detail & Related papers (2023-06-29T03:02:20Z)
- A Low-rank Matching Attention based Cross-modal Feature Fusion Method for Conversational Emotion Recognition [56.20144064187554]
This paper develops a novel low-rank matching attention method (LMAM) for cross-modal feature fusion in the conversational emotion recognition (CER) task.
By setting a matching weight and calculating attention scores between modal features row by row, LMAM contains fewer parameters than the self-attention method.
We show that LMAM can be embedded into any existing state-of-the-art DL-based CER methods and help boost their performance in a plug-and-play manner.
arXiv Detail & Related papers (2023-06-16T16:02:44Z)
- A Tri-attention Fusion Guided Multi-modal Segmentation Network [2.867517731896504]
We propose a multi-modality segmentation network guided by a novel tri-attention fusion.
Our network includes N model-independent encoding paths with N image sources, a tri-attention fusion block, a dual-attention fusion block, and a decoding path.
Our experimental results on the BraTS 2018 dataset for brain tumor segmentation demonstrate the effectiveness of our proposed method.
arXiv Detail & Related papers (2021-11-02T14:36:53Z)
- Two Headed Dragons: Multimodal Fusion and Cross Modal Transactions [14.700807572189412]
We propose a novel transformer-based fusion method for the HSI and LiDAR modalities.
The model is composed of stacked autoencoders that harness the cross key-value pairs for HSI and LiDAR.
We test our model on Houston (Data Fusion Contest - 2013) and MUUFL Gulfport datasets and achieve competitive results.
arXiv Detail & Related papers (2021-07-24T11:33:37Z)
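The Deep Equilibrium Multimodal Fusion entry above describes fusion as seeking a fixed point of a dynamic fusion process. The sketch below illustrates that idea only at a high level, under assumptions: it unrolls a plain forward iteration and omits the implicit differentiation used by actual DEQ models, and the fusion cell, dimensions, and stopping rule (DEQFusionSketch, tol, max_iter) are hypothetical.

```python
# Minimal fixed-point fusion sketch (forward iteration only); not the paper's code.
import torch
import torch.nn as nn


class DEQFusionSketch(nn.Module):
    def __init__(self, dim=64, max_iter=30, tol=1e-4):
        super().__init__()
        # f(z, x1, x2): one shared fusion cell applied repeatedly until z stops changing.
        self.cell = nn.Sequential(nn.Linear(3 * dim, dim), nn.Tanh())
        self.max_iter, self.tol = max_iter, tol

    def forward(self, x1, x2):
        z = torch.zeros_like(x1)
        for _ in range(self.max_iter):
            z_next = self.cell(torch.cat([z, x1, x2], dim=-1))
            converged = (z_next - z).norm() < self.tol * (z.norm() + 1e-8)
            z = z_next
            if converged:
                break
        return z  # approximate fixed point z* = f(z*, x1, x2), used as the fused feature


# Dummy usage: fuse two 64-dimensional modality embeddings for a batch of 8.
fused = DEQFusionSketch()(torch.randn(8, 64), torch.randn(8, 64))
print(fused.shape)  # torch.Size([8, 64])
```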