Panoramic Depth Estimation via Supervised and Unsupervised Learning in
Indoor Scenes
- URL: http://arxiv.org/abs/2108.08076v1
- Date: Wed, 18 Aug 2021 09:58:44 GMT
- Title: Panoramic Depth Estimation via Supervised and Unsupervised Learning in
Indoor Scenes
- Authors: Keyang Zhou, Kailun Yang, Kaiwei Wang
- Abstract summary: We introduce panoramic images to obtain larger field of view.
We improve the training process of the neural network adapted to the characteristics of panoramic images.
With a comprehensive variety of experiments, this research demonstrates the effectiveness of our schemes aiming for indoor scene perception.
- Score: 8.48364407942494
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Depth estimation, as a necessary clue to convert 2D images into the 3D space,
has been applied in many machine vision areas. However, to achieve an entire
surrounding 360-degree geometric sensing, traditional stereo matching
algorithms for depth estimation are limited due to large noise, low accuracy,
and strict requirements for multi-camera calibration. In this work, for a
unified surrounding perception, we introduce panoramic images to obtain larger
field of view. We extend PADENet first appeared in our previous conference work
for outdoor scene understanding, to perform panoramic monocular depth
estimation with a focus for indoor scenes. At the same time, we improve the
training process of the neural network adapted to the characteristics of
panoramic images. In addition, we fuse traditional stereo matching algorithm
with deep learning methods and further improve the accuracy of depth
predictions. With a comprehensive variety of experiments, this research
demonstrates the effectiveness of our schemes aiming for indoor scene
perception.
Related papers
- Robust and Flexible Omnidirectional Depth Estimation with Multiple 360° Cameras [8.850391039025077]
We use geometric constraints and redundant information of multiple 360-degree cameras to achieve robust and flexible omnidirectional depth estimation.
Our two algorithms achieve state-of-the-art performance, accurately predicting depth maps even when provided with soiled panorama inputs.
arXiv Detail & Related papers (2024-09-23T07:31:48Z) - MSI-NeRF: Linking Omni-Depth with View Synthesis through Multi-Sphere Image aided Generalizable Neural Radiance Field [1.3162012586770577]
We introduce MSI-NeRF, which combines deep learning omnidirectional depth estimation and novel view synthesis.
We construct a multi-sphere image as a cost volume through feature extraction and warping of the input images.
Our network has the generalization ability to reconstruct unknown scenes efficiently using only four images.
arXiv Detail & Related papers (2024-03-16T07:26:50Z) - SDGE: Stereo Guided Depth Estimation for 360$^\circ$ Camera Sets [65.64958606221069]
Multi-camera systems are often used in autonomous driving to achieve a 360$circ$ perception.
These 360$circ$ camera sets often have limited or low-quality overlap regions, making multi-view stereo methods infeasible for the entire image.
We propose the Stereo Guided Depth Estimation (SGDE) method, which enhances depth estimation of the full image by explicitly utilizing multi-view stereo results on the overlap.
arXiv Detail & Related papers (2024-02-19T02:41:37Z) - Calibrating Panoramic Depth Estimation for Practical Localization and
Mapping [20.621442016969976]
The absolute depth values of surrounding environments provide crucial cues for various assistive technologies, such as localization, navigation, and 3D structure estimation.
We propose that accurate depth estimated from panoramic images can serve as a powerful and light-weight input for a wide range of downstream tasks requiring 3D information.
arXiv Detail & Related papers (2023-08-27T04:50:05Z) - Blur aware metric depth estimation with multi-focus plenoptic cameras [8.508198765617196]
We present a new metric depth estimation algorithm using only raw images from a multi-focus plenoptic camera.
The proposed approach is especially suited for the multi-focus configuration where several micro-lenses with different focal lengths are used.
arXiv Detail & Related papers (2023-08-08T13:38:50Z) - SurroundDepth: Entangling Surrounding Views for Self-Supervised
Multi-Camera Depth Estimation [101.55622133406446]
We propose a SurroundDepth method to incorporate the information from multiple surrounding views to predict depth maps across cameras.
Specifically, we employ a joint network to process all the surrounding views and propose a cross-view transformer to effectively fuse the information from multiple views.
In experiments, our method achieves the state-of-the-art performance on the challenging multi-camera depth estimation datasets.
arXiv Detail & Related papers (2022-04-07T17:58:47Z) - VolumeFusion: Deep Depth Fusion for 3D Scene Reconstruction [71.83308989022635]
In this paper, we advocate that replicating the traditional two stages framework with deep neural networks improves both the interpretability and the accuracy of the results.
Our network operates in two steps: 1) the local computation of the local depth maps with a deep MVS technique, and, 2) the depth maps and images' features fusion to build a single TSDF volume.
In order to improve the matching performance between images acquired from very different viewpoints, we introduce a rotation-invariant 3D convolution kernel called PosedConv.
arXiv Detail & Related papers (2021-08-19T11:33:58Z) - DnD: Dense Depth Estimation in Crowded Dynamic Indoor Scenes [68.38952377590499]
We present a novel approach for estimating depth from a monocular camera as it moves through complex indoor environments.
Our approach predicts absolute scale depth maps over the entire scene consisting of a static background and multiple moving people.
arXiv Detail & Related papers (2021-08-12T09:12:39Z) - Robust Consistent Video Depth Estimation [65.53308117778361]
We present an algorithm for estimating consistent dense depth maps and camera poses from a monocular video.
Our algorithm combines two complementary techniques: (1) flexible deformation-splines for low-frequency large-scale alignment and (2) geometry-aware depth filtering for high-frequency alignment of fine depth details.
In contrast to prior approaches, our method does not require camera poses as input and achieves robust reconstruction for challenging hand-held cell phone captures containing a significant amount of noise, shake, motion blur, and rolling shutter deformations.
arXiv Detail & Related papers (2020-12-10T18:59:48Z) - DFVS: Deep Flow Guided Scene Agnostic Image Based Visual Servoing [11.000164408890635]
Existing deep learning based visual servoing approaches regress the relative camera pose between a pair of images.
We consider optical flow as our visual features, which are predicted using a deep neural network.
We show convergence for over 3m and 40 degrees while maintaining precise positioning of under 2cm and 1 degree.
arXiv Detail & Related papers (2020-03-08T11:42:36Z) - Learning Depth With Very Sparse Supervision [57.911425589947314]
This paper explores the idea that perception gets coupled to 3D properties of the world via interaction with the environment.
We train a specialized global-local network architecture with what would be available to a robot interacting with the environment.
Experiments on several datasets show that, when ground truth is available even for just one of the image pixels, the proposed network can learn monocular dense depth estimation up to 22.5% more accurately than state-of-the-art approaches.
arXiv Detail & Related papers (2020-03-02T10:44:13Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.