PanoDepth: A Two-Stage Approach for Monocular Omnidirectional Depth
Estimation
- URL: http://arxiv.org/abs/2202.01323v1
- Date: Wed, 2 Feb 2022 23:08:06 GMT
- Title: PanoDepth: A Two-Stage Approach for Monocular Omnidirectional Depth
Estimation
- Authors: Yuyan Li, Zhixin Yan, Ye Duan, Liu Ren
- Abstract summary: We propose a novel, model-agnostic, two-stage pipeline for omnidirectional monocular depth estimation.
Our framework PanoDepth takes one 360 image as input, produces one or more synthesized views in the first stage, and feeds the original image and the synthesized images into the subsequent stereo matching stage.
Our results show that PanoDepth outperforms the state-of-the-art approaches by a large margin for 360 monocular depth estimation.
- Score: 11.66493799838823
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Omnidirectional 3D information is essential for a wide range of applications
such as Virtual Reality, Autonomous Driving, Robotics, etc. In this paper, we
propose a novel, model-agnostic, two-stage pipeline for omnidirectional
monocular depth estimation. Our proposed framework PanoDepth takes one 360
image as input, produces one or more synthesized views in the first stage, and
feeds the original image and the synthesized images into the subsequent stereo
matching stage. In the second stage, we propose a differentiable Spherical
Warping Layer to handle omnidirectional stereo geometry efficiently and
effectively. By utilizing the explicit stereo-based geometric constraints in
the stereo matching stage, PanoDepth can generate dense high-quality depth. We
conducted extensive experiments and ablation studies to evaluate PanoDepth with
both the full pipeline as well as the individual modules in each stage. Our
results show that PanoDepth outperforms the state-of-the-art approaches by a
large margin for 360 monocular depth estimation.
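A minimal sketch of what a differentiable spherical warping step can look like for equirectangular stereo is given below. It is an illustration only, not the paper's actual Spherical Warping Layer: the pure vertical baseline, the y-up equirectangular conventions, and the function name `spherical_warp` are assumptions made for this example.

```python
import math

import torch
import torch.nn.functional as F


def spherical_warp(src, depth_hypothesis, baseline):
    """Warp a vertically displaced panorama `src` (B, C, H, W) toward the
    reference view for a single hypothesized depth (assumed conventions)."""
    B, C, H, W = src.shape
    device = src.device

    # Reference pixel grid -> spherical angles (equirectangular projection).
    v, u = torch.meshgrid(
        torch.arange(H, device=device, dtype=torch.float32),
        torch.arange(W, device=device, dtype=torch.float32),
        indexing="ij",
    )
    lon = (u + 0.5) / W * 2.0 * math.pi - math.pi   # longitude in [-pi, pi)
    lat = math.pi / 2.0 - (v + 0.5) / H * math.pi   # latitude in (pi/2, -pi/2)

    # Back-project each reference pixel to 3D at the hypothesized depth (y up).
    x = depth_hypothesis * torch.cos(lat) * torch.sin(lon)
    y = depth_hypothesis * torch.sin(lat)
    z = depth_hypothesis * torch.cos(lat) * torch.cos(lon)

    # Express the point in the displaced camera's frame (vertical shift only).
    y_src = y - baseline

    # Re-project into the displaced view's equirectangular image.
    r = torch.sqrt(x ** 2 + y_src ** 2 + z ** 2).clamp(min=1e-6)
    lon_src = torch.atan2(x, z)
    lat_src = torch.asin((y_src / r).clamp(-1.0, 1.0))
    u_src = (lon_src + math.pi) / (2.0 * math.pi) * W - 0.5
    v_src = (math.pi / 2.0 - lat_src) / math.pi * H - 0.5

    # Differentiable bilinear sampling; longitude wrap-around at the seam is
    # omitted here for brevity.
    grid = torch.stack(
        [2.0 * u_src / (W - 1) - 1.0, 2.0 * v_src / (H - 1) - 1.0], dim=-1
    ).unsqueeze(0).expand(B, -1, -1, -1)
    return F.grid_sample(src, grid, mode="bilinear", align_corners=True)
```

Sweeping such a warp over a set of depth hypotheses yields a cost volume on the sphere, which is broadly the kind of explicit stereo-based geometric constraint the stereo matching stage exploits.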
Related papers
- Boosting Omnidirectional Stereo Matching with a Pre-trained Depth Foundation Model [62.37493746544967]
Camera-based setups offer a cost-effective option by using stereo depth estimation to generate dense, high-resolution depth maps.
Existing omnidirectional stereo matching approaches achieve only limited depth accuracy across diverse environments.
We present DFI-OmniStereo, a novel omnidirectional stereo matching method that leverages a large-scale pre-trained foundation model for relative monocular depth estimation.
arXiv Detail & Related papers (2025-03-30T16:24:22Z)
- Helvipad: A Real-World Dataset for Omnidirectional Stereo Depth Estimation [83.841877607646]
We introduce Helvipad, a real-world dataset for omnidirectional stereo depth estimation.
The dataset includes accurate depth and disparity labels obtained by projecting 3D point clouds onto equirectangular images; a generic projection sketch is given after this related-papers list.
We benchmark leading stereo depth estimation models for both standard and omnidirectional images.
arXiv Detail & Related papers (2024-11-27T13:34:41Z)
- Robust and Flexible Omnidirectional Depth Estimation with Multiple 360° Cameras [8.850391039025077]
We use geometric constraints and redundant information of multiple 360-degree cameras to achieve robust and flexible omnidirectional depth estimation.
Our two algorithms achieve state-of-the-art performance, accurately predicting depth maps even when provided with soiled panorama inputs.
arXiv Detail & Related papers (2024-09-23T07:31:48Z)
- MCPDepth: Omnidirectional Depth Estimation via Stereo Matching from Multi-Cylindrical Panoramas [26.686883000660437]
MCPDepth (Multi-Cylindrical Panoramic Depth Estimation) performs omnidirectional depth estimation via stereo matching between multi-cylindrical panoramas.
It achieves state-of-the-art performance with an 18.8% reduction in mean absolute error (MAE) for depth on the outdoor synthetic dataset Deep360, and a 19.9% reduction on the indoor real-scene dataset 3D60.
arXiv Detail & Related papers (2024-08-03T03:35:37Z)
- MVD-Fusion: Single-view 3D via Depth-consistent Multi-view Generation [54.27399121779011]
We present MVD-Fusion: a method for single-view 3D inference via generative modeling of multi-view-consistent RGB-D images.
We show that our approach can yield more accurate synthesis compared to recent state-of-the-art, including distillation-based 3D inference and prior multi-view generation methods.
arXiv Detail & Related papers (2024-04-04T17:59:57Z)
- MSI-NeRF: Linking Omni-Depth with View Synthesis through Multi-Sphere Image aided Generalizable Neural Radiance Field [1.3162012586770577]
We introduce MSI-NeRF, which combines deep learning omnidirectional depth estimation and novel view synthesis.
We construct a multi-sphere image as a cost volume through feature extraction and warping of the input images.
Our network has the generalization ability to reconstruct unknown scenes efficiently using only four images.
arXiv Detail & Related papers (2024-03-16T07:26:50Z)
- Unsupervised OmniMVS: Efficient Omnidirectional Depth Inference via Establishing Pseudo-Stereo Supervision [40.58193195996798]
We propose the first unsupervised omnidirectional MVS framework based on multiple fisheye images.
The two 360° images form a stereo pair with a special pose, and photometric consistency is leveraged to establish the unsupervised constraint.
Experiments show that our unsupervised solution is competitive with state-of-the-art supervised methods.
arXiv Detail & Related papers (2023-02-20T11:35:55Z)
- SurroundDepth: Entangling Surrounding Views for Self-Supervised Multi-Camera Depth Estimation [101.55622133406446]
We propose SurroundDepth, a method that incorporates the information from multiple surrounding views to predict depth maps across cameras.
Specifically, we employ a joint network to process all the surrounding views and propose a cross-view transformer to effectively fuse the information from multiple views.
In experiments, our method achieves state-of-the-art performance on challenging multi-camera depth estimation datasets.
arXiv Detail & Related papers (2022-04-07T17:58:47Z)
- DSGN++: Exploiting Visual-Spatial Relation for Stereo-based 3D Detectors [60.88824519770208]
Camera-based 3D object detectors are attractive due to their wider deployment and lower price compared with LiDAR sensors.
We revisit the prior stereo model DSGN and its stereo volume construction for representing both 3D geometry and semantics.
We propose DSGN++, aiming to improve information flow throughout the 2D-to-3D pipeline.
arXiv Detail & Related papers (2022-04-06T18:43:54Z)
- Stereo Unstructured Magnification: Multiple Homography Image for View Synthesis [72.09193030350396]
We study the problem of view synthesis with a certain amount of rotation from a pair of images, which we call stereo unstructured magnification.
We propose a novel multiple homography image representation, comprising a set of scene planes with fixed normals and distances.
We derive an angle-based cost to guide the blending of multi-normal images by exploiting per-normal geometry.
arXiv Detail & Related papers (2022-04-01T01:39:28Z)
- Neural Radiance Fields Approach to Deep Multi-View Photometric Stereo [103.08512487830669]
We present a modern solution to the multi-view photometric stereo problem (MVPS).
We procure the surface orientation using a photometric stereo (PS) image formation model and blend it with a multi-view neural radiance field representation to recover the object's surface geometry.
Our method performs neural rendering of multi-view images while utilizing surface normals estimated by a deep photometric stereo network.
arXiv Detail & Related papers (2021-10-11T20:20:03Z)
- OmniSLAM: Omnidirectional Localization and Dense Mapping for Wide-baseline Multi-camera Systems [88.41004332322788]
We present an omnidirectional localization and dense mapping system for a wide-baseline multiview stereo setup with ultra-wide field-of-view (FOV) fisheye cameras.
For more practical and accurate reconstruction, we first introduce improved and lightweight deep neural networks for omnidirectional depth estimation.
We integrate our omnidirectional depth estimates into the visual odometry (VO) and add a loop closing module for global consistency.
arXiv Detail & Related papers (2020-03-18T05:52:10Z)
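Relating to the Helvipad entry above, the sketch below shows one generic way to project 3D points onto an equirectangular image to obtain sparse depth labels. It is not the dataset's release code; the y-up camera frame, the function name `project_points_equirect`, and the nearest-point conflict handling are assumptions made for this example.

```python
import numpy as np


def project_points_equirect(points, height, width):
    """Project (N, 3) points in the panoramic camera frame (y up) onto an
    equirectangular image, returning a sparse (H, W) depth map of ranges."""
    x, y, z = points[:, 0], points[:, 1], points[:, 2]
    r = np.linalg.norm(points, axis=1)        # range used as the depth label
    lon = np.arctan2(x, z)                    # longitude in [-pi, pi]
    lat = np.arcsin(np.clip(y / np.maximum(r, 1e-9), -1.0, 1.0))

    # Equirectangular pixel coordinates (longitude wraps around the seam).
    u = ((lon + np.pi) / (2.0 * np.pi) * width).astype(np.int64) % width
    v = np.clip(((np.pi / 2.0 - lat) / np.pi * height).astype(np.int64),
                0, height - 1)

    depth = np.zeros((height, width), dtype=np.float32)
    order = np.argsort(-r)                    # far first, so near points overwrite
    depth[v[order], u[order]] = r[order]
    return depth
```

Disparity labels for the dataset's stereo setup would additionally require the rig's baseline and projection model, which are omitted here.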