PanoDepth: A Two-Stage Approach for Monocular Omnidirectional Depth
Estimation
- URL: http://arxiv.org/abs/2202.01323v1
- Date: Wed, 2 Feb 2022 23:08:06 GMT
- Title: PanoDepth: A Two-Stage Approach for Monocular Omnidirectional Depth
Estimation
- Authors: Yuyan Li, Zhixin Yan, Ye Duan, Liu Ren
- Abstract summary: We propose a novel, model-agnostic, two-stage pipeline for omnidirectional monocular depth estimation.
Our framework PanoDepth takes one 360 image as input, produces one or more synthesized views in the first stage, and feeds the original image and the synthesized images into the subsequent stereo matching stage.
Our results show that PanoDepth outperforms the state-of-the-art approaches by a large margin for 360 monocular depth estimation.
- Score: 11.66493799838823
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Omnidirectional 3D information is essential for a wide range of applications
such as Virtual Reality, Autonomous Driving, Robotics, etc. In this paper, we
propose a novel, model-agnostic, two-stage pipeline for omnidirectional
monocular depth estimation. Our proposed framework PanoDepth takes one 360
image as input, produces one or more synthesized views in the first stage, and
feeds the original image and the synthesized images into the subsequent stereo
matching stage. In the second stage, we propose a differentiable Spherical
Warping Layer to handle omnidirectional stereo geometry efficiently and
effectively. By utilizing the explicit stereo-based geometric constraints in
the stereo matching stage, PanoDepth can generate dense high-quality depth. We
conducted extensive experiments and ablation studies to evaluate PanoDepth with
both the full pipeline as well as the individual modules in each stage. Our
results show that PanoDepth outperforms the state-of-the-art approaches by a
large margin for 360 monocular depth estimation.
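A minimal sketch of what a differentiable spherical warping step can look like for equirectangular stereo is given below. It is an illustration only, not the paper's actual Spherical Warping Layer: the pure vertical baseline, the y-up equirectangular conventions, and the function name `spherical_warp` are assumptions made for this example.

```python
import math

import torch
import torch.nn.functional as F


def spherical_warp(src, depth_hypothesis, baseline):
    """Warp a vertically displaced panorama `src` (B, C, H, W) toward the
    reference view for a single hypothesized depth (assumed conventions)."""
    B, C, H, W = src.shape
    device = src.device

    # Reference pixel grid -> spherical angles (equirectangular projection).
    v, u = torch.meshgrid(
        torch.arange(H, device=device, dtype=torch.float32),
        torch.arange(W, device=device, dtype=torch.float32),
        indexing="ij",
    )
    lon = (u + 0.5) / W * 2.0 * math.pi - math.pi   # longitude in [-pi, pi)
    lat = math.pi / 2.0 - (v + 0.5) / H * math.pi   # latitude in (pi/2, -pi/2)

    # Back-project each reference pixel to 3D at the hypothesized depth (y up).
    x = depth_hypothesis * torch.cos(lat) * torch.sin(lon)
    y = depth_hypothesis * torch.sin(lat)
    z = depth_hypothesis * torch.cos(lat) * torch.cos(lon)

    # Express the point in the displaced camera's frame (vertical shift only).
    y_src = y - baseline

    # Re-project into the displaced view's equirectangular image.
    r = torch.sqrt(x ** 2 + y_src ** 2 + z ** 2).clamp(min=1e-6)
    lon_src = torch.atan2(x, z)
    lat_src = torch.asin((y_src / r).clamp(-1.0, 1.0))
    u_src = (lon_src + math.pi) / (2.0 * math.pi) * W - 0.5
    v_src = (math.pi / 2.0 - lat_src) / math.pi * H - 0.5

    # Differentiable bilinear sampling; longitude wrap-around at the seam is
    # omitted here for brevity.
    grid = torch.stack(
        [2.0 * u_src / (W - 1) - 1.0, 2.0 * v_src / (H - 1) - 1.0], dim=-1
    ).unsqueeze(0).expand(B, -1, -1, -1)
    return F.grid_sample(src, grid, mode="bilinear", align_corners=True)
```

Sweeping such a warp over a set of depth hypotheses yields a cost volume on the sphere, which is broadly the kind of explicit stereo-based geometric constraint the stereo matching stage exploits.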
Related papers
- Boosting Omnidirectional Stereo Matching with a Pre-trained Depth Foundation Model [62.37493746544967]
Camera-based setups offer a cost-effective option by using stereo depth estimation to generate dense, high-resolution depth maps.
Existing omnidirectional stereo matching approaches achieve only limited depth accuracy across diverse environments.
We present DFI-OmniStereo, a novel omnidirectional stereo matching method that leverages a large-scale pre-trained foundation model for relative monocular depth estimation.
arXiv Detail & Related papers (2025-03-30T16:24:22Z)
- Helvipad: A Real-World Dataset for Omnidirectional Stereo Depth Estimation [83.841877607646]
We introduce Helvipad, a real-world dataset for omnidirectional stereo depth estimation.
The dataset includes accurate depth and disparity labels obtained by projecting 3D point clouds onto equirectangular images; a generic projection sketch is given after this related-papers list.
We benchmark leading stereo depth estimation models for both standard and omnidirectional images.
arXiv Detail & Related papers (2024-11-27T13:34:41Z)
- Robust and Flexible Omnidirectional Depth Estimation with Multiple 360° Cameras [8.850391039025077]
We use geometric constraints and redundant information of multiple 360-degree cameras to achieve robust and flexible omnidirectional depth estimation.
Our two algorithms achieve state-of-the-art performance, accurately predicting depth maps even when provided with soiled panorama inputs.
arXiv Detail & Related papers (2024-09-23T07:31:48Z)
- MCPDepth: Omnidirectional Depth Estimation via Stereo Matching from Multi-Cylindrical Panoramas [26.686883000660437]
MCPDepth (Multi-Cylindrical Panoramic Depth Estimation) performs omnidirectional depth estimation via stereo matching between multi-cylindrical panoramas.
It achieves state-of-the-art performance with an 18.8% reduction in mean absolute error (MAE) for depth on the outdoor synthetic dataset Deep360, and a 19.9% reduction on the indoor real-scene dataset 3D60.
arXiv Detail & Related papers (2024-08-03T03:35:37Z)
- MVD-Fusion: Single-view 3D via Depth-consistent Multi-view Generation [54.27399121779011]
We present MVD-Fusion: a method for single-view 3D inference via generative modeling of multi-view-consistent RGB-D images.
We show that our approach can yield more accurate synthesis compared to recent state-of-the-art, including distillation-based 3D inference and prior multi-view generation methods.
arXiv Detail & Related papers (2024-04-04T17:59:57Z)
- MSI-NeRF: Linking Omni-Depth with View Synthesis through Multi-Sphere Image aided Generalizable Neural Radiance Field [1.3162012586770577]
We introduce MSI-NeRF, which combines deep learning omnidirectional depth estimation and novel view synthesis.
We construct a multi-sphere image as a cost volume through feature extraction and warping of the input images.
Our network has the generalization ability to reconstruct unknown scenes efficiently using only four images.
arXiv Detail & Related papers (2024-03-16T07:26:50Z)
- Unsupervised OmniMVS: Efficient Omnidirectional Depth Inference via Establishing Pseudo-Stereo Supervision [40.58193195996798]
We propose the first unsupervised omnidirectional MVS framework based on multiple fisheye images.
The two 360° images form a stereo pair with a special pose, and photometric consistency is leveraged to establish the unsupervised constraint.
Experiments show that our unsupervised solution is competitive with state-of-the-art supervised methods.
arXiv Detail & Related papers (2023-02-20T11:35:55Z)
- SurroundDepth: Entangling Surrounding Views for Self-Supervised Multi-Camera Depth Estimation [101.55622133406446]
We propose SurroundDepth, a method that incorporates the information from multiple surrounding views to predict depth maps across cameras.
Specifically, we employ a joint network to process all the surrounding views and propose a cross-view transformer to effectively fuse the information from multiple views.
In experiments, our method achieves state-of-the-art performance on challenging multi-camera depth estimation datasets.
arXiv Detail & Related papers (2022-04-07T17:58:47Z)
- DSGN++: Exploiting Visual-Spatial Relation for Stereo-based 3D Detectors [60.88824519770208]
Camera-based 3D object detectors are attractive due to their wider deployment and lower price compared with LiDAR sensors.
We revisit the prior stereo model DSGN and its stereo volume construction for representing both 3D geometry and semantics.
We propose DSGN++, aiming to improve information flow throughout the 2D-to-3D pipeline.
arXiv Detail & Related papers (2022-04-06T18:43:54Z)
- Stereo Unstructured Magnification: Multiple Homography Image for View Synthesis [72.09193030350396]
We study the problem of view synthesis with a certain amount of rotation from a pair of images, which we call stereo unstructured magnification.
We propose a novel multiple homography image representation, comprising a set of scene planes with fixed normals and distances.
We derive an angle-based cost to guide the blending of multi-normal images by exploiting per-normal geometry.
arXiv Detail & Related papers (2022-04-01T01:39:28Z)
- Neural Radiance Fields Approach to Deep Multi-View Photometric Stereo [103.08512487830669]
We present a modern solution to the multi-view photometric stereo problem (MVPS).
We procure the surface orientation using a photometric stereo (PS) image formation model and blend it with a multi-view neural radiance field representation to recover the object's surface geometry.
Our method performs neural rendering of multi-view images while utilizing surface normals estimated by a deep photometric stereo network.
arXiv Detail & Related papers (2021-10-11T20:20:03Z)
- OmniSLAM: Omnidirectional Localization and Dense Mapping for Wide-baseline Multi-camera Systems [88.41004332322788]
We present an omnidirectional localization and dense mapping system for a wide-baseline multiview stereo setup with ultra-wide field-of-view (FOV) fisheye cameras.
For more practical and accurate reconstruction, we first introduce improved and lightweight deep neural networks for omnidirectional depth estimation.
We integrate our omnidirectional depth estimates into the visual odometry (VO) and add a loop closing module for global consistency.
arXiv Detail & Related papers (2020-03-18T05:52:10Z)
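Relating to the Helvipad entry above, the sketch below shows one generic way to project 3D points onto an equirectangular image to obtain sparse depth labels. It is not the dataset's release code; the y-up camera frame, the function name `project_points_equirect`, and the nearest-point conflict handling are assumptions made for this example.

```python
import numpy as np


def project_points_equirect(points, height, width):
    """Project (N, 3) points in the panoramic camera frame (y up) onto an
    equirectangular image, returning a sparse (H, W) depth map of ranges."""
    x, y, z = points[:, 0], points[:, 1], points[:, 2]
    r = np.linalg.norm(points, axis=1)        # range used as the depth label
    lon = np.arctan2(x, z)                    # longitude in [-pi, pi]
    lat = np.arcsin(np.clip(y / np.maximum(r, 1e-9), -1.0, 1.0))

    # Equirectangular pixel coordinates (longitude wraps around the seam).
    u = ((lon + np.pi) / (2.0 * np.pi) * width).astype(np.int64) % width
    v = np.clip(((np.pi / 2.0 - lat) / np.pi * height).astype(np.int64),
                0, height - 1)

    depth = np.zeros((height, width), dtype=np.float32)
    order = np.argsort(-r)                    # far first, so near points overwrite
    depth[v[order], u[order]] = r[order]
    return depth
```

Disparity labels for the dataset's stereo setup would additionally require the rig's baseline and projection model, which are omitted here.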