Deep Learning in Multimodal Remote Sensing Data Fusion: A Comprehensive
Review
- URL: http://arxiv.org/abs/2205.01380v1
- Date: Tue, 3 May 2022 09:08:16 GMT
- Title: Deep Learning in Multimodal Remote Sensing Data Fusion: A Comprehensive
Review
- Authors: Jiaxin Li, Danfeng Hong, Lianru Gao, Jing Yao, Ke Zheng, Bing Zhang,
Jocelyn Chanussot
- Abstract summary: This survey aims to present a systematic overview of DL-based multimodal RS data fusion.
Sub-fields of multimodal RS data fusion are reviewed in terms of the data modalities to be fused.
The remaining challenges and potential future directions are highlighted.
- Score: 33.40031994803646
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: With the extremely rapid advances in remote sensing (RS) technology, a great
quantity of Earth observation (EO) data featuring considerable and complicated
heterogeneity is readily available nowadays, which offers researchers an
opportunity to tackle current geoscience applications in a fresh way. With the
joint utilization of EO data, much research on multimodal RS data fusion has
made tremendous progress in recent years, yet traditional algorithms
inevitably hit a performance bottleneck because they lack the ability to
comprehensively analyse and interpret such strongly heterogeneous data. Hence,
this non-negligible limitation creates a strong demand for an alternative tool
with more powerful processing capability. Deep learning
(DL), as a cutting-edge technology, has witnessed remarkable breakthroughs in
numerous computer vision tasks owing to its impressive ability in data
representation and reconstruction. Naturally, it has been successfully applied
to the field of multimodal RS data fusion, yielding great improvement compared
with traditional methods. This survey aims to present a systematic overview of
DL-based multimodal RS data fusion. More specifically, some essential knowledge
about this topic is first given. Subsequently, a literature survey is conducted
to analyse the trends of this field. Some prevalent sub-fields of
multimodal RS data fusion are then reviewed in terms of the to-be-fused data
modalities, i.e., spatiospectral, spatiotemporal, light detection and
ranging-optical, synthetic aperture radar-optical, and RS-Geospatial Big Data
fusion. Furthermore, we collect and summarize some valuable resources to
support the development of multimodal RS data fusion. Finally, the remaining
challenges and potential future directions are highlighted.
Related papers
- Foundation Models for Remote Sensing and Earth Observation: A Survey [101.77425018347557]
This survey systematically reviews the emerging field of Remote Sensing Foundation Models (RSFMs).
It begins with an outline of their motivation and background, followed by an introduction of their foundational concepts.
We benchmark these models against publicly available datasets, discuss existing challenges, and propose future research directions.
arXiv Detail & Related papers (2024-10-22T01:08:21Z)
- From Linguistic Giants to Sensory Maestros: A Survey on Cross-Modal Reasoning with Large Language Models [56.9134620424985]
Cross-modal reasoning (CMR) is increasingly recognized as a crucial capability in the progression toward more sophisticated artificial intelligence systems.
The recent trend of deploying Large Language Models (LLMs) to tackle CMR tasks has marked a new mainstream of approaches for enhancing their effectiveness.
This survey offers a nuanced exposition of current methodologies applied in CMR using LLMs, classifying these into a detailed three-tiered taxonomy.
arXiv Detail & Related papers (2024-09-19T02:51:54Z)
- Earth System Data Cubes: Avenues for advancing Earth system research [4.408949931570938]
Earth System Data Cubes (ESDCs) have emerged as one suitable solution for transforming this flood of data into a simple yet robust format.
ESDCs achieve this by organising data into an analysis-ready format with a temporal grid.
There exist barriers to realising the full potential of data in light of novel cloud-based technologies.
arXiv Detail & Related papers (2024-08-05T09:50:16Z)
- Semantic-Aware Representation of Multi-Modal Data for Data Ingress: A Literature Review [1.8590097948961688]
Generative AI such as Large Language Models (LLMs) sees broad adoption to process multi-modal data such as text, images, audio, and video.
Managing this data efficiently has become a significant practical challenge in the industry: twice as much data is not twice as good.
This study focuses on the different semantic-aware techniques to extract embeddings from mono-modal, multi-modal, and cross-modal data.
arXiv Detail & Related papers (2024-07-17T09:49:11Z)
- MMA-DFER: MultiModal Adaptation of unimodal models for Dynamic Facial Expression Recognition in-the-wild [81.32127423981426]
Multimodal emotion recognition based on audio and video data is important for real-world applications.
Recent methods have focused on exploiting advances of self-supervised learning (SSL) for pre-training of strong multimodal encoders.
We propose a different perspective on the problem and investigate the advancement of multimodal DFER performance by adapting SSL-pre-trained disjoint unimodal encoders.
arXiv Detail & Related papers (2024-04-13T13:39:26Z)
- Enhancing Human-like Multi-Modal Reasoning: A New Challenging Dataset
and Comprehensive Framework [51.44863255495668]
Multimodal reasoning is a critical component in the pursuit of artificial intelligence systems that exhibit human-like intelligence.
We present the Multi-Modal Reasoning (COCO-MMR) dataset, a novel dataset that encompasses an extensive collection of open-ended questions.
We propose innovative techniques, including multi-hop cross-modal attention and sentence-level contrastive learning, to enhance the image and text encoders.
arXiv Detail & Related papers (2023-07-24T08:58:25Z)
- Tensor Decompositions for Hyperspectral Data Processing in Remote
Sensing: A Comprehensive Review [85.36368666877412]
Hyperspectral (HS) remote sensing (RS) imaging has provided a significant amount of spatial and spectral information for the observation and analysis of the Earth's surface.
The recent advancement and even revolution of the HS RS technique offer opportunities to realize the full potential of various applications.
Because it preserves the inherent 3-D structure of HS data, tensor decomposition has attracted widespread attention and research in HS data processing tasks.
arXiv Detail & Related papers (2022-05-13T00:39:23Z)
- A systematic review and meta-analysis of Digital Elevation Model (DEM)
fusion: pre-processing, methods and applications [0.0]
2.5D/3D Digital Elevation Model (DEM) fusion is a key application of data fusion in remote sensing.
DEM fusion takes advantage of the complementary characteristics of multi-source DEMs to deliver a more complete, accurate and reliable dataset.
This paper provides a systematic review of DEM fusion: the pre-processing workflow, methods and applications, enhanced with a meta-analysis.
arXiv Detail & Related papers (2022-03-28T18:39:14Z)
- Paradigm selection for Data Fusion of SAR and Multispectral Sentinel
data applied to Land-Cover Classification [63.072664304695465]
In this letter, four data fusion paradigms based on Convolutional Neural Networks (CNNs) are analyzed and implemented.
The goal is to provide a systematic procedure for choosing the data fusion framework that yields the best classification results.
The procedure has been validated for land-cover classification but it can be transferred to other cases.
arXiv Detail & Related papers (2021-06-18T11:36:54Z)
- Survey on Deep Multi-modal Data Analytics: Collaboration, Rivalry and
Fusion [6.225190099424806]
Multi-modal or multi-view data has surged as a major stream for big data, where each modality or view encodes an individual property of data objects.
Most existing state-of-the-art methods focus on how to fuse the energy or information from multi-modal spaces to deliver superior performance.
Deep neural networks have proven to be a powerful architecture for capturing the nonlinear distribution of high-dimensional multimedia data.
arXiv Detail & Related papers (2020-06-15T06:42:04Z)
- Deep Learning in Mining Biological Data [7.834172629639729]
Deep learning (DL) has been successfully applied to solve many complex pattern recognition problems.
This article reviews applications of DL to biological sequence, image, and signal data.
It also outlines some open research challenges in mining biological data.
arXiv Detail & Related papers (2020-02-28T23:14:27Z)
This list is automatically generated from the titles and abstracts of the papers on this site.