Advances and Challenges in Multimodal Remote Sensing Image Registration
- URL: http://arxiv.org/abs/2302.00912v2
- Date: Sun, 5 Feb 2023 07:59:59 GMT
- Title: Advances and Challenges in Multimodal Remote Sensing Image Registration
- Authors: Bai Zhu, Liang Zhou, Simiao Pu, Jianwei Fan, Yuanxin Ye
- Abstract summary: Multimodal remote sensing image registration is a crucial step for integrating the complementary information among multimodal data and enabling comprehensive observation and analysis of the Earth's surface.
We will present our own contributions to the field of multimodal image registration, summarize the advantages and limitations of existing multimodal image registration methods, and then discuss the remaining challenges.
- Score: 4.212031520823376
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Over the past few decades, with the rapid development of global aerospace and
aerial remote sensing technology, sensors have evolved from traditional
monomodal sensors (e.g., optical sensors) to a new generation of multimodal
sensors [e.g., multispectral, hyperspectral, light detection and ranging
(LiDAR), and synthetic aperture radar (SAR) sensors]. These advanced devices
can dynamically provide varied and abundant multimodal remote sensing images
with different spatial, temporal, and spectral resolutions according to
different application requirements. It is therefore of great scientific
significance to carry out research on multimodal remote sensing image
registration, which is a crucial step for integrating the complementary
information among multimodal data and making comprehensive observations and
analyses of the Earth's surface. In this work, we present our own
contributions to the field of multimodal image registration, summarize the
advantages and limitations of existing multimodal image registration methods,
and then discuss the remaining challenges and offer a forward-looking prospect
for the future development of the field.
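As a concrete, hedged illustration of the registration step discussed above (not taken from the paper), the following Python sketch aligns two single-band images that differ only by an unknown translation, using phase correlation from scikit-image. The function name and the synthetic data are illustrative assumptions; real optical-to-SAR registration generally requires modality-robust structural or learned features rather than raw intensities.

```python
# Minimal sketch (assumption, not the paper's method): translation-only
# registration of two single-band images via phase correlation.
import numpy as np
from scipy.ndimage import shift as nd_shift
from skimage.registration import phase_cross_correlation

def register_translation(reference: np.ndarray, moving: np.ndarray) -> np.ndarray:
    """Estimate the (row, col) offset of `moving` and warp it onto `reference`."""
    # Phase correlation tolerates linear intensity differences, but severe
    # nonlinear radiometric gaps (e.g., optical vs. SAR) usually call for
    # structural descriptors or learned features instead of raw pixels.
    offset, _, _ = phase_cross_correlation(reference, moving, upsample_factor=10)
    return nd_shift(moving, shift=offset, order=1)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    reference = rng.random((256, 256))
    moving = nd_shift(reference, shift=(3.5, -2.0), order=1)  # simulated misalignment
    aligned = register_translation(reference, moving)
    print(float(np.abs(aligned - reference).mean()))  # small residual after alignment
```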
Related papers
- SMARTIES: Spectrum-Aware Multi-Sensor Auto-Encoder for Remote Sensing Images [17.304106221330414]
Recent deep learning models are often specific to single sensors or to fixed combinations. We introduce SMARTIES, a generic and versatile foundation model lifting sensor-specific/dependent efforts. On both single- and multi-modal tasks across diverse sensors, SMARTIES outperforms previous models that rely on sensor-specific pretraining.
arXiv Detail & Related papers (2025-06-24T12:51:39Z) - Multi-modal Multi-platform Person Re-Identification: Benchmark and Method [58.59888754340054]
MP-ReID is a novel dataset designed specifically for multi-modality and multi-platform ReID.
This benchmark compiles data from 1,930 identities across diverse modalities, including RGB, infrared, and thermal imaging.
We introduce Uni-Prompt ReID, a framework with specifically designed prompts, tailored for cross-modality and cross-platform scenarios.
arXiv Detail & Related papers (2025-03-21T12:27:49Z) - MSSIDD: A Benchmark for Multi-Sensor Denoising [55.41612200877861]
We introduce a new benchmark, the Multi-Sensor SIDD dataset, which is the first raw-domain dataset designed to evaluate the sensor transferability of denoising models.
We propose a sensor consistency training framework that enables denoising models to learn the sensor-invariant features.
arXiv Detail & Related papers (2024-11-18T13:32:59Z) - MAROON: A Framework for the Joint Characterization of Near-Field High-Resolution Radar and Optical Depth Imaging Techniques [4.816237933371206]
We take on the unique challenge of characterizing depth imagers from both the optical and radio-frequency domains.
We provide a comprehensive evaluation of their depth measurements with respect to distinct object materials, geometries, and object-to-sensor distances.
All object measurements will be made public in the form of a multimodal dataset called MAROON.
arXiv Detail & Related papers (2024-11-01T11:53:10Z) - Foundation Models for Remote Sensing and Earth Observation: A Survey [101.77425018347557]
This survey systematically reviews the emerging field of Remote Sensing Foundation Models (RSFMs).
It begins with an outline of their motivation and background, followed by an introduction to their foundational concepts.
We benchmark these models against publicly available datasets, discuss existing challenges, and propose future research directions.
arXiv Detail & Related papers (2024-10-22T01:08:21Z) - SPARK: Multi-Vision Sensor Perception and Reasoning Benchmark for Large-scale Vision-Language Models [43.79587815909473]
This paper aims to establish a multi-vision Sensor Perception And Reasoning benchmarK called SPARK.
We generated 6,248 vision-language test samples to investigate multi-vision sensory perception and multi-vision sensory reasoning with respect to physical sensor knowledge proficiency.
Results showed that most models displayed deficiencies in multi-vision sensory reasoning to varying extents.
arXiv Detail & Related papers (2024-08-22T03:59:48Z) - Motion Capture from Inertial and Vision Sensors [60.5190090684795]
MINIONS is a large-scale Motion capture dataset collected from INertial and visION Sensors.
We conduct experiments on multi-modal motion capture using a monocular camera and very few IMUs.
arXiv Detail & Related papers (2024-07-23T09:41:10Z) - Bridging Remote Sensors with Multisensor Geospatial Foundation Models [15.289711240431107]
msGFM is a multisensor geospatial foundation model that unifies data from four key sensor modalities.
For data originating from identical geolocations, our model employs an innovative cross-sensor pretraining approach.
msGFM has demonstrated enhanced proficiency in a range of both single-sensor and multisensor downstream tasks.
arXiv Detail & Related papers (2024-04-01T17:30:56Z) - LCPR: A Multi-Scale Attention-Based LiDAR-Camera Fusion Network for Place Recognition [11.206532393178385]
We present a novel neural network named LCPR for robust multimodal place recognition.
Our method can effectively utilize multi-view camera and LiDAR data to improve the place recognition performance.
arXiv Detail & Related papers (2023-11-06T15:39:48Z) - Multimodal Transformer Using Cross-Channel attention for Object Detection in Remote Sensing Images [1.662438436885552]
Multi-modal fusion has been shown to improve detection accuracy by combining data from multiple modalities.
We propose a novel multi-modal fusion strategy for mapping relationships between different channels at the early stage.
By addressing fusion at the early stage, as opposed to mid- or late-stage methods, our method achieves competitive and even superior performance compared to existing techniques (see the sketch after this list).
arXiv Detail & Related papers (2023-10-21T00:56:11Z) - Learning Online Multi-Sensor Depth Fusion [100.84519175539378]
SenFuNet is a depth fusion approach that learns sensor-specific noise and outlier statistics.
We conduct experiments with various sensor combinations on the real-world CoRBS and Scene3D datasets.
arXiv Detail & Related papers (2022-04-07T10:45:32Z) - Multimodal Image Synthesis and Editing: The Generative AI Era [131.9569600472503]
Multimodal image synthesis and editing has become a hot research topic in recent years.
We comprehensively contextualize the advance of the recent multimodal image synthesis and editing.
We describe benchmark datasets and evaluation metrics as well as corresponding experimental results.
arXiv Detail & Related papers (2021-12-27T10:00:16Z) - Semantics-aware Adaptive Knowledge Distillation for Sensor-to-Vision Action Recognition [131.6328804788164]
We propose a framework, named Semantics-aware Adaptive Knowledge Distillation Networks (SAKDN), to enhance action recognition in the vision-sensor modality (videos).
The SAKDN uses multiple wearable-sensors as teacher modalities and uses RGB videos as student modality.
arXiv Detail & Related papers (2020-09-01T03:38:31Z)
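Below is a minimal, hedged PyTorch sketch of the early channel-level fusion idea referenced in the cross-channel attention entry above. The module, its parameter names, and the squeeze-and-excitation style gating are illustrative assumptions, not the cited paper's architecture.

```python
# Illustrative sketch (assumption): early fusion of RGB and an auxiliary
# modality by stacking channels and gating them with channel attention.
import torch
import torch.nn as nn

class CrossChannelFusion(nn.Module):
    def __init__(self, rgb_channels: int = 3, aux_channels: int = 1, reduction: int = 2):
        super().__init__()
        total = rgb_channels + aux_channels
        # Squeeze-and-excitation style gate: weight each input channel by its
        # global context so the network can emphasize the more informative modality.
        self.gate = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(total, max(total // reduction, 1), kernel_size=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(max(total // reduction, 1), total, kernel_size=1),
            nn.Sigmoid(),
        )
        self.proj = nn.Conv2d(total, 64, kernel_size=3, padding=1)  # fused early feature map

    def forward(self, rgb: torch.Tensor, aux: torch.Tensor) -> torch.Tensor:
        x = torch.cat([rgb, aux], dim=1)  # early fusion: stack modalities channel-wise
        x = x * self.gate(x)              # cross-channel attention weights
        return self.proj(x)

if __name__ == "__main__":
    fusion = CrossChannelFusion()
    out = fusion(torch.randn(2, 3, 128, 128), torch.randn(2, 1, 128, 128))
    print(out.shape)  # torch.Size([2, 64, 128, 128])
```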