Common Practices and Taxonomy in Deep Multi-view Fusion for Remote
Sensing Applications
- URL: http://arxiv.org/abs/2301.01200v1
- Date: Tue, 20 Dec 2022 15:12:27 GMT
- Title: Common Practices and Taxonomy in Deep Multi-view Fusion for Remote
Sensing Applications
- Authors: Francisco Mena and Diego Arenas and Marlon Nuske and Andreas Dengel
- Abstract summary: Advances in remote sensing technologies have boosted applications for Earth observation.
Deep learning models have been applied to fuse the information from multiple views.
This article gathers works on multi-view fusion for Earth observation by focusing on the common practices and approaches used in the literature.
- Score: 3.883984493622102
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: The advances in remote sensing technologies have boosted applications for
Earth observation. These technologies provide multiple observations or views
with different levels of information. They might contain static or temporary
views with different levels of resolution, in addition to having different
types and amounts of noise due to sensor calibration or deterioration. A great
variety of deep learning models have been applied to fuse the information from
these multiple views, known as deep multi-view or multi-modal fusion learning.
However, the approaches in the literature vary greatly since different
terminology is used to refer to similar concepts or different illustrations are
given to similar techniques. This article gathers works on multi-view fusion
for Earth observation by focusing on the common practices and approaches used
in the literature. We summarize and structure insights from several different
publications concentrating on unifying points and ideas. In this manuscript, we
provide a harmonized terminology while at the same time mentioning the various
alternative terms that are used in literature. The topics covered by the works
reviewed focus on supervised learning with the use of neural network models. We
hope this review, with a long list of recent references, can support future
research and lead to a unified advance in the area.
Related papers
- Towards Visual Grounding: A Survey [87.37662490666098]
Since 2021, visual grounding has witnessed significant advancements, with emerging new concepts such as grounded pre-training.
This survey is designed to be suitable for both beginners and experienced researchers, serving as an invaluable resource for understanding key concepts and tracking the latest research developments.
arXiv Detail & Related papers (2024-12-28T16:34:35Z)
- Understanding Cross-Lingual Alignment -- A Survey [52.572071017877704]
Cross-lingual alignment is the meaningful similarity of representations across languages in multilingual language models.
We survey the literature of techniques to improve cross-lingual alignment, providing a taxonomy of methods and summarising insights from throughout the field.
arXiv Detail & Related papers (2024-04-09T11:39:53Z) - A Comprehensive Survey of 3D Dense Captioning: Localizing and Describing
Objects in 3D Scenes [80.20670062509723]
3D dense captioning is an emerging vision-language bridging task that aims to generate detailed descriptions for 3D scenes.
It presents significant potential and challenges due to its closer representation of the real world compared to 2D visual captioning.
Despite the popularity and success of existing methods, there is a lack of comprehensive surveys summarizing the advancements in this field.
arXiv Detail & Related papers (2024-03-12T10:04:08Z) - Towards Open Vocabulary Learning: A Survey [146.90188069113213]
Deep neural networks have made impressive advancements in various core tasks like segmentation, tracking, and detection.
Recently, open vocabulary settings were proposed due to the rapid progress of vision language pre-training.
This paper provides a thorough review of open vocabulary learning, summarizing and analyzing recent developments in the field.
arXiv Detail & Related papers (2023-06-28T02:33:06Z) - Multispectral Contrastive Learning with Viewmaker Networks [8.635434871127512]
We focus on applying contrastive learning approaches to a variety of remote sensing datasets.
We show that Viewmaker networks are promising for producing views in this setting without requiring extensive domain knowledge and trial and error.
arXiv Detail & Related papers (2023-02-11T18:44:12Z) - Visual SLAM: What are the Current Trends and What to Expect? [0.0]
Vision-based sensors have shown significant performance, accuracy, and efficiency gains in Simultaneous Localization and Mapping (SLAM) systems.
We have given an in-depth literature survey of forty-five impactful papers published in the domain of VSLAMs.
arXiv Detail & Related papers (2022-10-19T11:56:32Z) - Vision+X: A Survey on Multimodal Learning in the Light of Data [64.03266872103835]
Multimodal machine learning that incorporates data from various sources has become an increasingly popular research area.
We analyze the commonness and uniqueness of each data format mainly ranging from vision, audio, text, and motions.
We investigate the existing literature on multimodal learning from both the representation learning and downstream application levels.
arXiv Detail & Related papers (2022-10-05T13:14:57Z) - Deflectometry for specular surfaces: an overview [0.0]
Deflectometry as a technical approach to assessing reflective surfaces has now existed for almost 40 years.
Different aspects and variations of the method have been studied in multiple theses and research articles, and reviews are also becoming available for certain subtopics.
arXiv Detail & Related papers (2022-04-10T22:17:47Z) - Recent Advances and Trends in Multimodal Deep Learning: A Review [9.11022096530605]
Multimodal deep learning aims to create models that can process and link information using various modalities.
This paper focuses on multiple types of modalities, i.e., image, video, text, audio, body gestures, facial expressions, and physiological signals.
A fine-grained taxonomy of various multimodal deep learning applications is proposed, elaborating on different applications in more depth.
arXiv Detail & Related papers (2021-05-24T04:20:45Z) - Stance Detection in Web and Social Media: A Comparative Study [3.937145867005019]
Online forums and social media platforms are increasingly being used to discuss topics of varying polarities where different people take different stances.
Several methodologies for automatic stance detection from text have been proposed in literature.
To our knowledge, there has been no systematic investigation of these methods or of their comparative performance.
arXiv Detail & Related papers (2020-07-12T12:39:35Z) - Image Segmentation Using Deep Learning: A Survey [58.37211170954998]
Image segmentation is a key topic in image processing and computer vision.
A substantial number of works have aimed at developing image segmentation approaches using deep learning models.
arXiv Detail & Related papers (2020-01-15T21:37:47Z)
This list is automatically generated from the titles and abstracts of the papers in this site.