Robust Consistent Video Depth Estimation
- URL: http://arxiv.org/abs/2012.05901v1
- Date: Thu, 10 Dec 2020 18:59:48 GMT
- Title: Robust Consistent Video Depth Estimation
- Authors: Johannes Kopf, Xuejian Rong, Jia-Bin Huang
- Abstract summary: We present an algorithm for estimating consistent dense depth maps and camera poses from a monocular video.
Our algorithm combines two complementary techniques: (1) flexible deformation-splines for low-frequency large-scale alignment and (2) geometry-aware depth filtering for high-frequency alignment of fine depth details.
In contrast to prior approaches, our method does not require camera poses as input and achieves robust reconstruction for challenging hand-held cell phone captures containing a significant amount of noise, shake, motion blur, and rolling shutter deformations.
- Score: 65.53308117778361
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We present an algorithm for estimating consistent dense depth maps and camera poses from a monocular video. We integrate a learning-based depth prior, in the form of a convolutional neural network trained for single-image depth estimation, with geometric optimization, to estimate a smooth camera trajectory as well as detailed and stable depth reconstruction. Our algorithm combines two complementary techniques: (1) flexible deformation-splines for low-frequency large-scale alignment and (2) geometry-aware depth filtering for high-frequency alignment of fine depth details. In contrast to prior approaches, our method does not require camera poses as input and achieves robust reconstruction for challenging hand-held cell phone captures containing a significant amount of noise, shake, motion blur, and rolling shutter deformations. Our method quantitatively outperforms state-of-the-art methods on the Sintel benchmark for both depth and pose estimation and attains favorable qualitative results across diverse in-the-wild datasets.
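As a loose illustration of technique (1), the NumPy sketch below applies a low-frequency multiplicative correction to a depth map by bilinearly interpolating a coarse grid of log-scale control points. The grid size, the multiplicative parameterization, and the function name are illustrative assumptions; the paper's actual deformation splines and their optimization against geometric constraints are more involved.

```python
import numpy as np

def apply_depth_spline(depth, log_scale_grid):
    """Apply a coarse, bilinearly interpolated multiplicative correction to a
    depth map (a stand-in for a low-frequency deformation spline).
    `log_scale_grid` is a small (gh, gw) grid of log scale factors."""
    h, w = depth.shape
    gh, gw = log_scale_grid.shape
    # Fractional control-grid coordinates of every pixel.
    ys = np.linspace(0, gh - 1, h)
    xs = np.linspace(0, gw - 1, w)
    y0 = np.floor(ys).astype(int).clip(0, gh - 2)
    x0 = np.floor(xs).astype(int).clip(0, gw - 2)
    fy = (ys - y0)[:, None]
    fx = (xs - x0)[None, :]
    g = log_scale_grid
    # Bilinear interpolation of the log-scale field over the image.
    field = ((1 - fy) * (1 - fx) * g[y0][:, x0]
             + (1 - fy) * fx * g[y0][:, x0 + 1]
             + fy * (1 - fx) * g[y0 + 1][:, x0]
             + fy * fx * g[y0 + 1][:, x0 + 1])
    return depth * np.exp(field)

# Example: gently rescale a 480x640 depth map with a 4x5 control grid.
depth = np.ones((480, 640), dtype=np.float32)
corrected = apply_depth_spline(depth, 0.1 * np.random.randn(4, 5))
```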
Related papers
- Robust and Flexible Omnidirectional Depth Estimation with Multiple 360° Cameras [8.850391039025077]
We use geometric constraints and redundant information of multiple 360-degree cameras to achieve robust and flexible omnidirectional depth estimation.
Our two algorithms achieve state-of-the-art performance, accurately predicting depth maps even when provided with soiled panorama inputs.
arXiv Detail & Related papers (2024-09-23T07:31:48Z)
- Shakes on a Plane: Unsupervised Depth Estimation from Unstabilized Photography [54.36608424943729]
We show that in a "long-burst", forty-two 12-megapixel RAW frames captured in a two-second sequence, there is enough parallax information from natural hand tremor alone to recover high-quality scene depth.
We devise a test-time optimization approach that fits a neural RGB-D representation to long-burst data and simultaneously estimates scene depth and camera motion.
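As a rough sketch of what such test-time optimization can look like (not the authors' implementation), the PyTorch snippet below jointly optimizes a per-pixel inverse-depth map and tiny per-frame translations by minimizing photometric error after warping the burst frames into the reference view; rotations are frozen to the identity and all shapes and names are hypothetical.

```python
import torch
import torch.nn.functional as F

def warp(ref_inv_depth, K, R, t, src):
    """Warp a source frame into the reference view given reference inverse
    depth and a relative pose (R, t). Toy pinhole model."""
    H, W = ref_inv_depth.shape
    ys, xs = torch.meshgrid(torch.arange(H, dtype=torch.float32),
                            torch.arange(W, dtype=torch.float32),
                            indexing="ij")
    pix = torch.stack([xs, ys, torch.ones_like(xs)], dim=-1)      # (H, W, 3)
    rays = (torch.linalg.inv(K) @ pix.reshape(-1, 3).T).T          # (H*W, 3)
    pts = rays / ref_inv_depth.reshape(-1, 1).clamp(min=1e-4)      # back-project
    proj = (K @ (R @ pts.T + t.reshape(3, 1))).T                   # into source cam
    uv = proj[:, :2] / proj[:, 2:3].clamp(min=1e-4)
    # Normalize pixel coordinates to [-1, 1] for grid_sample.
    grid = torch.stack([2 * uv[:, 0] / (W - 1) - 1,
                        2 * uv[:, 1] / (H - 1) - 1], dim=-1).reshape(1, H, W, 2)
    return F.grid_sample(src[None], grid, align_corners=True)[0]

# Test-time optimization over a short burst (placeholder data and shapes).
H, W = 96, 128
K = torch.tensor([[100., 0, W / 2], [0, 100., H / 2], [0, 0, 1]])
burst = torch.rand(5, 3, H, W)                      # stand-in RGB frames
inv_depth = torch.full((H, W), 0.5, requires_grad=True)
trans = torch.zeros(5, 3, requires_grad=True)       # tiny per-frame translations
opt = torch.optim.Adam([inv_depth, trans], lr=1e-2)

for step in range(200):
    loss = 0.0
    for i in range(1, 5):
        warped = warp(inv_depth, K, torch.eye(3), trans[i], burst[i])
        loss = loss + (warped - burst[0]).abs().mean()
    opt.zero_grad()
    loss.backward()
    opt.step()
```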
arXiv Detail & Related papers (2022-12-22T18:54:34Z)
- Multi-Camera Collaborative Depth Prediction via Consistent Structure Estimation [75.99435808648784]
We propose a novel multi-camera collaborative depth prediction method.
It does not require large overlapping areas while maintaining structure consistency between cameras.
Experimental results on DDAD and NuScenes datasets demonstrate the superior performance of our method.
arXiv Detail & Related papers (2022-10-05T03:44:34Z)
- Uncertainty Guided Depth Fusion for Spike Camera [49.41822923588663]
We propose a novel Uncertainty-Guided Depth Fusion (UGDF) framework to fuse predictions of monocular and stereo depth estimation networks for spike camera.
Our framework is motivated by the fact that stereo spike depth estimation achieves better results at close range, while monocular estimation is more reliable at long range.
In order to demonstrate the advantage of spike depth estimation over traditional camera depth estimation, we contribute a spike-depth dataset named CitySpike20K.
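A common way to realize such uncertainty-guided fusion is inverse-variance weighting of the two predictions. The sketch below shows only that basic idea with hypothetical inputs; it is not the UGDF architecture, whose uncertainties are learned.

```python
import numpy as np

def fuse_depth(d_mono, d_stereo, sigma_mono, sigma_stereo):
    """Inverse-variance (uncertainty-weighted) fusion of two depth maps:
    wherever one branch is more certain, its prediction dominates."""
    w_mono = 1.0 / (sigma_mono ** 2 + 1e-6)
    w_stereo = 1.0 / (sigma_stereo ** 2 + 1e-6)
    return (w_mono * d_mono + w_stereo * d_stereo) / (w_mono + w_stereo)

# Toy example: the stereo branch is more certain, so it dominates the result.
d_mono = np.full((4, 4), 10.0)
d_stereo = np.full((4, 4), 8.0)
fused = fuse_depth(d_mono, d_stereo,
                   sigma_mono=np.full((4, 4), 2.0),    # less certain
                   sigma_stereo=np.full((4, 4), 0.5))  # more certain
```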
arXiv Detail & Related papers (2022-08-26T13:04:01Z)
- Boosting Monocular Depth Estimation Models to High-Resolution via Content-Adaptive Multi-Resolution Merging [14.279471205248534]
We show how a consistent scene structure and high-frequency details affect depth estimation performance.
We present a double estimation method that improves the whole-image depth estimation and a patch selection method that adds local details.
We demonstrate that by merging estimations at different resolutions with changing context, we can generate multi-megapixel depth maps with a high level of detail.
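The sketch below illustrates the merging idea under simple assumptions: a high-resolution patch estimate is aligned to the low-resolution, structurally consistent base estimate with a least-squares scale and shift, and only its high-frequency band is blended in. The actual method performs the merge with a learned network and content-adaptive patch selection; all names here are illustrative.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def merge_patch(base_depth, patch_depth, y, x):
    """Blend the high-frequency detail of a high-resolution patch estimate
    into a low-frequency, structurally consistent whole-image estimate."""
    h, w = patch_depth.shape
    base_crop = base_depth[y:y + h, x:x + w]
    # Align the patch to the base with a least-squares scale and shift,
    # since single-image depth estimates are only defined up to an affine map.
    A = np.stack([patch_depth.ravel(), np.ones(patch_depth.size)], axis=1)
    scale, shift = np.linalg.lstsq(A, base_crop.ravel(), rcond=None)[0]
    aligned = scale * patch_depth + shift
    # Keep the base's low frequencies; take the patch's high frequencies.
    detail = aligned - gaussian_filter(aligned, sigma=8)
    base_depth[y:y + h, x:x + w] = gaussian_filter(base_crop, sigma=8) + detail
    return base_depth

# Toy usage with stand-in "estimates" for a 512x512 image and a 128x128 patch.
base = np.random.rand(512, 512)
patch = np.random.rand(128, 128)
merged = merge_patch(base.copy(), patch, y=100, x=200)
```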
arXiv Detail & Related papers (2021-05-28T17:55:15Z)
- Deep Two-View Structure-from-Motion Revisited [83.93809929963969]
Two-view structure-from-motion (SfM) is the cornerstone of 3D reconstruction and visual SLAM.
We propose to revisit the problem of deep two-view SfM by leveraging the well-posedness of the classic pipeline.
Our method consists of 1) an optical flow estimation network that predicts dense correspondences between two frames; 2) a normalized pose estimation module that computes relative camera poses from the 2D optical flow correspondences; and 3) a scale-invariant depth estimation network that leverages epipolar geometry to reduce the search space, refine the dense correspondences, and estimate relative depth maps.
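For orientation, the classic sparse analogue of this pipeline can be written with standard OpenCV calls: matched features stand in for the dense flow network, the essential matrix and recovered pose stand in for the normalized pose module, and triangulation yields relative depths. This is only a reference-point sketch, not the paper's learned components.

```python
import cv2
import numpy as np

def two_view_pose_and_depth(img1, img2, K):
    """Classic sparse two-view pipeline: correspondences -> essential matrix
    -> relative pose (up to scale) -> triangulated relative depths.
    Inputs are two grayscale images and the 3x3 intrinsics K."""
    # 1) Correspondences (the paper predicts dense optical flow with a
    #    network; sparse ORB matches stand in here).
    orb = cv2.ORB_create(2000)
    k1, d1 = orb.detectAndCompute(img1, None)
    k2, d2 = orb.detectAndCompute(img2, None)
    matches = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True).match(d1, d2)
    p1 = np.float32([k1[m.queryIdx].pt for m in matches])
    p2 = np.float32([k2[m.trainIdx].pt for m in matches])
    # 2) Relative pose from the epipolar constraint (translation scale unknown).
    E, _ = cv2.findEssentialMat(p1, p2, K, method=cv2.RANSAC)
    _, R, t, _ = cv2.recoverPose(E, p1, p2, K)
    # 3) Triangulate the matches to obtain relative depths in camera 1.
    P1 = K @ np.hstack([np.eye(3), np.zeros((3, 1))])
    P2 = K @ np.hstack([R, t])
    X = cv2.triangulatePoints(P1, P2, p1.T, p2.T)
    depth = X[2] / X[3]
    return R, t, depth
```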
arXiv Detail & Related papers (2021-04-01T15:31:20Z)
- DeepRelativeFusion: Dense Monocular SLAM using Single-Image Relative Depth Prediction [4.9188958016378495]
We propose a dense monocular SLAM system, named DeepRelativeFusion, that is capable of recovering a globally consistent 3D structure.
We use visual SLAM to reliably recover the camera poses and semi-dense depth maps of the keyframes, and then use relative depth prediction to densify the semi-dense depth maps and refine the pose graph.
Our system outperforms the state-of-the-art dense SLAM systems quantitatively in dense reconstruction accuracy by a large margin.
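A standard building block for this kind of densification is aligning the relative (affine-ambiguous) depth prediction to the semi-dense SLAM depths with a least-squares scale and shift before filling in the missing pixels. The snippet below sketches only that step, with hypothetical inputs and names.

```python
import numpy as np

def densify(rel_depth, slam_depth, valid):
    """Fit a scale and shift mapping a single-image relative depth map onto
    the metric semi-dense SLAM depths, then fill the pixels SLAM left empty."""
    A = np.stack([rel_depth[valid], np.ones(int(valid.sum()))], axis=1)
    scale, shift = np.linalg.lstsq(A, slam_depth[valid], rcond=None)[0]
    dense = scale * rel_depth + shift
    dense[valid] = slam_depth[valid]   # keep the depths SLAM is confident about
    return dense
```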
arXiv Detail & Related papers (2020-06-07T05:22:29Z)
- Video Depth Estimation by Fusing Flow-to-Depth Proposals [65.24533384679657]
We present an approach with a differentiable flow-to-depth layer for video depth estimation.
The model consists of a flow-to-depth layer, a camera pose refinement module, and a depth fusion network.
Our approach outperforms state-of-the-art depth estimation methods, and has reasonable cross dataset generalization capability.
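A flow-to-depth layer of this kind can be approximated by per-pixel two-view triangulation: given the flow and a relative pose, each pixel's correspondence is triangulated into a depth proposal. The sketch below shows that geometry in plain NumPy as an unvectorized illustration, not the paper's differentiable layer.

```python
import numpy as np

def flow_to_depth(flow, K, R, t):
    """Convert dense optical flow plus a known relative pose (R, t) into a
    per-pixel depth proposal by two-view triangulation (unvectorized toy)."""
    H, W = flow.shape[:2]
    Kinv = np.linalg.inv(K)
    ys, xs = np.mgrid[0:H, 0:W]
    p1 = np.stack([xs, ys, np.ones_like(xs)], axis=-1).reshape(-1, 3).astype(float)
    p2 = p1.copy()
    p2[:, :2] += flow.reshape(-1, 2)
    r1 = (Kinv @ p1.T).T        # viewing rays in frame 1
    r2 = (Kinv @ p2.T).T        # viewing rays in frame 2
    depth = np.empty(H * W)
    for i in range(H * W):
        # A point X = d1 * r1 in frame 1 must satisfy d1 * (R @ r1) + t = d2 * r2.
        A = np.stack([R @ r1[i], -r2[i]], axis=1)
        d = np.linalg.lstsq(A, -t, rcond=None)[0]
        depth[i] = d[0]
    return depth.reshape(H, W)

# Toy check: constant 5-pixel flow under a 0.1-unit sideways translation with
# focal length 50 recovers depth ~= 1 everywhere (f * baseline / disparity).
K = np.array([[50.0, 0, 16], [0, 50.0, 12], [0, 0, 1]])
flow = np.full((24, 32, 2), [5.0, 0.0])
depth = flow_to_depth(flow, K, np.eye(3), np.array([0.1, 0.0, 0.0]))
```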
arXiv Detail & Related papers (2019-12-30T10:45:57Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the content (including all information) and is not responsible for any consequences of its use.