A Comparative Study on Deep-Learning Methods for Dense Image Matching of
Multi-angle and Multi-date Remote Sensing Stereo Images
- URL: http://arxiv.org/abs/2210.14031v1
- Date: Tue, 25 Oct 2022 14:10:04 GMT
- Authors: Hessah Albanwan, Rongjun Qin
- Abstract summary: This paper provides an evaluation of four deep learning (DL) stereo matching methods through hundreds of multi-date multi-site satellite stereo pairs.
Our experiments show that E2E algorithms can achieve the upper limits of geometric accuracy but may not generalize well to unseen data.
All DL algorithms are robust to the geometric configurations of stereo pairs and are less sensitive to them than Census-SGM.
- Score: 1.0152838128195467
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deep learning (DL) stereo matching methods have gained great attention in
remote sensing applications on satellite datasets. However, most existing studies
base their assessments on only one or a few stereo pairs and lack a systematic
evaluation of how robust DL methods are on satellite stereo images with varying
radiometric and geometric configurations. This paper provides an evaluation of
four DL stereo matching methods through hundreds of multi-date multi-site
satellite stereo pairs with varying geometric configurations, against the
traditional well-practiced Census-SGM (Semi-global matching), to
comprehensively understand their accuracy, robustness, generalization
capabilities, and their practical potential. The DL methods include a
learning-based cost metric through convolutional neural networks (MC-CNN)
followed by SGM, and three end-to-end (E2E) learning models using Geometry and
Context Network (GCNet), Pyramid Stereo Matching Network (PSMNet), and
LEAStereo. Our experiments show that E2E algorithms can achieve the upper limits
of geometric accuracy but may not generalize well to unseen data. The
learning-based cost metric and Census-SGM are rather robust and can
consistently achieve acceptable results. All DL algorithms are robust to the
geometric configurations of stereo pairs and are less sensitive to them than
Census-SGM, while learning-based cost metrics can generalize to satellite
images when trained on different datasets (airborne or ground-view).
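The Census-SGM baseline named in the abstract builds on a census matching cost. The following is a minimal sketch of that cost only, not the authors' implementation: it assumes a 3x3 census window, rectified grayscale images, and a left-to-right disparity convention, and it omits the semi-global aggregation step that SGM applies to the cost volume. The function names (`census_transform`, `popcount`, `census_cost_volume`) are hypothetical and chosen for illustration.

```python
import numpy as np


def census_transform(img, radius=1):
    """Encode each pixel as a bit string: one bit per neighbour,
    set when the neighbour is darker than the centre pixel."""
    img = np.asarray(img)
    h, w = img.shape
    padded = np.pad(img, radius, mode="edge")
    code = np.zeros((h, w), dtype=np.uint64)
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            if dy == 0 and dx == 0:
                continue  # skip the centre pixel itself
            neighbour = padded[radius + dy:radius + dy + h,
                               radius + dx:radius + dx + w]
            bit = (neighbour < img).astype(np.uint64)
            code = (code << np.uint64(1)) | bit
    return code


def popcount(x):
    """Count the set bits of every element of a uint64 array."""
    x = x.copy()
    count = np.zeros(x.shape, dtype=np.uint8)
    while x.any():
        count += (x & np.uint64(1)).astype(np.uint8)
        x >>= np.uint64(1)
    return count


def census_cost_volume(left, right, max_disp, radius=1):
    """Hamming distance between census codes for each candidate disparity.

    Assumes a pixel at column x in the left image appears at column
    x - d in the right image. Columns with no valid match keep cost 255.
    """
    left_code = census_transform(left, radius)
    right_code = census_transform(right, radius)
    h, w = left_code.shape
    cost = np.full((h, w, max_disp), 255, dtype=np.uint8)
    for d in range(max_disp):
        xor = left_code[:, d:] ^ right_code[:, :w - d]
        cost[:, d:, d] = popcount(xor)
    return cost
```

Taking `cost.argmin(axis=2)` gives a winner-take-all disparity map. A full Census-SGM pipeline would first aggregate the cost volume along several scanline directions with smoothness penalties, which is not shown here.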
Related papers
- Analysis of different disparity estimation techniques on aerial stereo image datasets [0.0]
This work analyses dense stereo correspondence analysis on aerial images using different techniques.
For traditional methods, we implemented the architecture of Stereo SGBM while using different cost functions.
Most of the methods have shown good performance on standard datasets; however, little benchmarking is available for aerial datasets.
arXiv Detail & Related papers (2024-10-09T09:33:48Z)
- Single-Stage 3D Geometry-Preserving Depth Estimation Model Training on
Dataset Mixtures with Uncalibrated Stereo Data [4.199844472131922]
We propose GP$^{2}$, a General-Purpose and Geometry-Preserving training scheme for single-view depth estimation.
We show that GP$^{2}$-trained models outperform methods relying on PCM in both accuracy and speed.
We also show that SVDE models can learn to predict geometrically correct depth even when geometrically complete data comprises the minor part of the training set.
arXiv Detail & Related papers (2023-06-05T13:49:24Z)
- A Model-data-driven Network Embedding Multidimensional Features for
Tomographic SAR Imaging [5.489791364472879]
We propose a new model-data-driven network to achieve tomoSAR imaging based on multi-dimensional features.
We add two 2D processing modules, both convolutional encoder-decoder structures, to enhance multi-dimensional features of the imaging scene effectively.
Compared with the conventional CS-based FISTA method and DL-based gamma-Net method, the result of our proposed method has better performance on completeness while having decent imaging accuracy.
arXiv Detail & Related papers (2022-11-28T02:01:43Z)
- Fine-tuning deep learning models for stereo matching using results from
semi-global matching [1.0152838128195467]
Deep learning (DL) methods are widely investigated for stereo image matching tasks due to their reported high accuracies.
With satellite images covering large-scale areas with variances in locations, content, land covers, and spatial patterns, we expect their performances to be impacted.
We propose a finetuning method that takes advantage of disparity maps derived from Census-based semi-global-matching (SGM) on target stereo data.
arXiv Detail & Related papers (2022-05-27T15:38:10Z)
- Neural Radiance Fields Approach to Deep Multi-View Photometric Stereo [103.08512487830669]
We present a modern solution to the multi-view photometric stereo (MVPS) problem.
We procure the surface orientation using a photometric stereo (PS) image formation model and blend it with a multi-view neural radiance field representation to recover the object's surface geometry.
Our method performs neural rendering of multi-view images while utilizing surface normals estimated by a deep photometric stereo network.
arXiv Detail & Related papers (2021-10-11T20:20:03Z)
- Towards Interpretable Deep Metric Learning with Structural Matching [86.16700459215383]
We present a deep interpretable metric learning (DIML) method for more transparent embedding learning.
Our method is model-agnostic, which can be applied to off-the-shelf backbone networks and metric learning methods.
We evaluate our method on three major benchmarks of deep metric learning including CUB200-2011, Cars196, and Stanford Online Products.
arXiv Detail & Related papers (2021-08-12T17:59:09Z)
- ResDepth: A Deep Prior For 3D Reconstruction From High-resolution
Satellite Images [28.975837416508142]
We introduce ResDepth, a convolutional neural network that learns such an expressive geometric prior from example data.
In a series of experiments, we find that the proposed method consistently improves stereo DSMs both quantitatively and qualitatively.
We show that the prior encoded in the network weights captures meaningful geometric characteristics of urban design.
arXiv Detail & Related papers (2021-06-15T12:51:28Z)
- Multi-view Depth Estimation using Epipolar Spatio-Temporal Networks [87.50632573601283]
We present a novel method for multi-view depth estimation from a single video.
Our method achieves temporally coherent depth estimation results by using a novel Epipolar Spatio-Temporal (EST) transformer.
To reduce the computational cost, inspired by recent Mixture-of-Experts models, we design a compact hybrid network.
arXiv Detail & Related papers (2020-11-26T04:04:21Z)
- Improving Deep Stereo Network Generalization with Geometric Priors [93.09496073476275]
Large datasets of diverse real-world scenes with dense ground truth are difficult to obtain.
Many algorithms rely on small real-world datasets of similar scenes or synthetic datasets.
We propose to incorporate prior knowledge of scene geometry into an end-to-end stereo network to help networks generalize better.
arXiv Detail & Related papers (2020-08-25T15:24:02Z)
- X-ModalNet: A Semi-Supervised Deep Cross-Modal Network for
Classification of Remote Sensing Data [69.37597254841052]
We propose a novel cross-modal deep-learning framework called X-ModalNet.
X-ModalNet generalizes well, owing to propagating labels on an updatable graph constructed by high-level features on the top of the network.
We evaluate X-ModalNet on two multi-modal remote sensing datasets (HSI-MSI and HSI-SAR) and achieve a significant improvement in comparison with several state-of-the-art methods.
arXiv Detail & Related papers (2020-06-24T15:29:41Z)
- MetricUNet: Synergistic Image- and Voxel-Level Learning for Precise CT
Prostate Segmentation via Online Sampling [66.01558025094333]
We propose a two-stage framework, with the first stage to quickly localize the prostate region and the second stage to precisely segment the prostate.
We introduce a novel online metric learning module through voxel-wise sampling in the multi-task network.
Our method can effectively learn more representative voxel-level features compared with the conventional learning methods with cross-entropy or Dice loss.
arXiv Detail & Related papers (2020-05-15T10:37:02Z)
This list is automatically generated from the titles and abstracts of the papers in this site.