Multi-view Monocular Depth and Uncertainty Prediction with Deep SfM in
Dynamic Environments
- URL: http://arxiv.org/abs/2201.08633v1
- Date: Fri, 21 Jan 2022 10:42:57 GMT
- Title: Multi-view Monocular Depth and Uncertainty Prediction with Deep SfM in
Dynamic Environments
- Authors: Christian Homeyer, Oliver Lange, Christoph Schnörr
- Abstract summary: 3D reconstruction of depth and motion from monocular video in dynamic environments is a highly ill-posed problem.
We investigate the performance of the current State-of-the-Art (SotA) deep multi-view systems in such environments.
- Score: 0.2426580753117204
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: 3D reconstruction of depth and motion from monocular video in dynamic
environments is a highly ill-posed problem due to scale ambiguities when
projecting to the 2D image domain. In this work, we investigate the performance
of the current State-of-the-Art (SotA) deep multi-view systems in such
environments. We find that current supervised methods work surprisingly well
despite not modelling individual object motions, but make systematic errors due
to a lack of dense ground truth data. To detect such errors during usage, we
extend the cost-volume-based Deep Video to Depth (DeepV2D) framework
(Teed and Deng, 2018) with a learned uncertainty. Our Deep Video to certain
Depth (DeepV2cD) model i) performs on par with or better than the current SotA
and ii) achieves a better uncertainty measure than the naive Shannon entropy.
Our experiments show that a simple filter strategy based on the uncertainty can
significantly reduce systematic errors. This results in cleaner reconstructions
both on static and dynamic parts of the scene.
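As a rough illustration of the filtering idea, the sketch below thresholds a predicted per-pixel uncertainty map and shows the naive Shannon-entropy baseline computed from a depth probability volume. Tensor shapes and function names are assumptions for illustration, not the paper's implementation.

```python
import torch

def shannon_entropy(prob_volume, eps=1e-8):
    # Naive baseline: per-pixel entropy of a depth probability volume
    # (softmax over D depth hypotheses); [B, D, H, W] -> [B, H, W].
    return -(prob_volume * (prob_volume + eps).log()).sum(dim=1)

def filter_by_uncertainty(depth, uncertainty, threshold):
    # Simple filter strategy: keep only pixels whose predicted uncertainty
    # is below a threshold; filtered pixels are marked invalid (NaN).
    valid = uncertainty < threshold
    return depth.masked_fill(~valid, float('nan')), valid
```

A learned uncertainty head trained alongside the depth network can replace the entropy here; the abstract reports that thresholding such a measure removes many of the systematic errors in both static and dynamic regions.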
Related papers
- DepthLab: From Partial to Complete [80.58276388743306]
Missing values remain a common challenge for depth data across its wide range of applications.
This work bridges this gap with DepthLab, a foundation depth inpainting model powered by image diffusion priors.
Our approach proves its worth in various downstream tasks, including 3D scene inpainting, text-to-3D scene generation, sparse-view reconstruction with DUST3R, and LiDAR depth completion.
arXiv Detail & Related papers (2024-12-24T04:16:38Z)
- D$^3$epth: Self-Supervised Depth Estimation with Dynamic Mask in Dynamic Scenes [23.731667977542454]
D$^3$epth is a novel method for self-supervised depth estimation in dynamic scenes.
It tackles the challenge of dynamic objects from two key perspectives.
It consistently outperforms existing self-supervised monocular depth estimation baselines.
arXiv Detail & Related papers (2024-11-07T16:07:00Z)
- DO3D: Self-supervised Learning of Decomposed Object-aware 3D Motion and Depth from Monocular Videos [76.01906393673897]
We propose a self-supervised method to jointly learn 3D motion and depth from monocular videos.
Our system contains a depth estimation module to predict depth, and a new decomposed object-wise 3D motion (DO3D) estimation module to predict ego-motion and 3D object motion.
Our model delivers superior performance in all evaluated settings.
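The decomposition can be pictured as rigidly transforming back-projected points by the ego-motion and adding a per-point 3D object motion on top. The sketch below is a hypothetical illustration of that composition under assumed conventions, not the DO3D code.

```python
import torch

def compose_scene_motion(points, R_ego, t_ego, obj_motion):
    # points: [N, 3] 3D points back-projected from predicted depth.
    # R_ego: [3, 3] rotation, t_ego: [3] translation (camera ego-motion).
    # obj_motion: [N, 3] per-point 3D object motion (zero on static parts).
    return points @ R_ego.T + t_ego + obj_motion
```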
arXiv Detail & Related papers (2024-03-09T12:22:46Z)
- Manydepth2: Motion-Aware Self-Supervised Multi-Frame Monocular Depth Estimation in Dynamic Scenes [45.070725750859786]
We present Manydepth2, to achieve precise depth estimation for both dynamic objects and static backgrounds.
To tackle the challenges posed by dynamic content, we incorporate optical flow and coarse monocular depth to create a pseudo-static reference frame.
This frame is then utilized to build a motion-aware cost volume in collaboration with the vanilla target frame.
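One way to picture the pseudo-static construction is as a backward warp of the reference image with a dense flow field, so that content is re-aligned with the target view before matching. The sketch below shows that warping building block under assumed tensor conventions; it is not the Manydepth2 implementation.

```python
import torch
import torch.nn.functional as F

def warp_with_flow(ref_img, flow):
    # ref_img: [B, C, H, W]; flow: [B, 2, H, W] in pixels, mapping each
    # target pixel to its location in the reference frame.
    b, _, h, w = ref_img.shape
    ys, xs = torch.meshgrid(torch.arange(h), torch.arange(w), indexing='ij')
    base = torch.stack((xs, ys), dim=0).float().to(ref_img.device)  # [2, H, W]
    coords = base.unsqueeze(0) + flow                               # [B, 2, H, W]
    # Normalise pixel coordinates to [-1, 1] for grid_sample.
    gx = 2.0 * coords[:, 0] / (w - 1) - 1.0
    gy = 2.0 * coords[:, 1] / (h - 1) - 1.0
    grid = torch.stack((gx, gy), dim=-1)                            # [B, H, W, 2]
    return F.grid_sample(ref_img, grid, align_corners=True)
```

Per the abstract, the flow for this warp is combined with coarse monocular depth so that dynamic objects no longer break the static-scene assumption behind the cost volume.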
arXiv Detail & Related papers (2023-12-23T14:36:27Z)
- Dynamo-Depth: Fixing Unsupervised Depth Estimation for Dynamical Scenes [40.46121828229776]
Dynamo-Depth is an approach that disambiguates dynamical motion by jointly learning monocular depth, 3D independent flow field, and motion segmentation from unlabeled monocular videos.
Our proposed method achieves state-of-the-art performance on monocular depth estimation on the Waymo Open and nuScenes datasets, with significant improvement in the depth of moving objects.
arXiv Detail & Related papers (2023-10-29T03:24:16Z)
- MonoTDP: Twin Depth Perception for Monocular 3D Object Detection in Adverse Scenes [49.21187418886508]
This paper proposes a monocular 3D detection model designed to perceive twin depth in adverse scenes, termed MonoTDP.
We first introduce an adaptive learning strategy that helps the model handle uncontrollable weather conditions, making it significantly more resistant to various degrading factors.
Then, to address the depth/content loss in adverse regions, we propose a novel twin depth perception module that simultaneously estimates scene and object depth.
arXiv Detail & Related papers (2023-05-18T13:42:02Z)
- SC-DepthV3: Robust Self-supervised Monocular Depth Estimation for Dynamic Scenes [58.89295356901823]
Self-supervised monocular depth estimation has shown impressive results in static scenes.
However, it relies on the multi-view consistency assumption for training, which is violated in dynamic object regions.
We introduce an external pretrained monocular depth estimation model for generating single-image depth prior.
Our model can predict sharp and accurate depth maps, even when training from monocular videos of highly-dynamic scenes.
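A hypothetical sketch of how such a prior can be used: fall back on the single-image depth in regions flagged as dynamic, where photometric self-supervision is unreliable, and keep the usual residual elsewhere. Names and the masking scheme are assumptions, not the SC-DepthV3 code.

```python
import torch

def prior_guided_loss(pred_depth, prior_depth, dynamic_mask, photo_residual):
    # dynamic_mask: boolean [B, 1, H, W], True where multi-view consistency
    # is violated. There, regress toward the single-image depth prior;
    # elsewhere keep the photometric residual from view synthesis.
    prior_term = (pred_depth - prior_depth).abs()
    return torch.where(dynamic_mask, prior_term, photo_residual).mean()
```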
arXiv Detail & Related papers (2022-11-07T16:17:47Z)
- BEVStereo: Enhancing Depth Estimation in Multi-view 3D Object Detection with Dynamic Temporal Stereo [15.479670314689418]
We introduce an effective temporal stereo method to dynamically select the scale of matching candidates.
We further design an iterative algorithm to update the more valuable matching candidates, making the method adaptive to moving objects.
BEVStereo achieves the new state-of-the-art performance on the camera-only track of nuScenes dataset.
arXiv Detail & Related papers (2022-09-21T10:21:25Z)
- Monocular 3D Object Detection with Depth from Motion [74.29588921594853]
We take advantage of camera ego-motion for accurate object depth estimation and detection.
Our framework, named Depth from Motion (DfM), then uses the established geometry to lift 2D image features to the 3D space and detects 3D objects thereon.
Our framework outperforms state-of-the-art methods by a large margin on the KITTI benchmark.
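Lifting 2D features into 3D with known geometry reduces to back-projecting pixels through the camera intrinsics at the estimated depths, X = d * K^{-1} (u, v, 1)^T. The sketch below shows this standard operation; the function name and shapes are assumptions, not the DfM API.

```python
import torch

def backproject_pixels(depth, K):
    # depth: [H, W] per-pixel depth; K: [3, 3] camera intrinsics.
    h, w = depth.shape
    v, u = torch.meshgrid(torch.arange(h), torch.arange(w), indexing='ij')
    pix = torch.stack((u, v, torch.ones_like(u)), dim=-1).float()  # [H, W, 3]
    rays = pix.reshape(-1, 3) @ torch.linalg.inv(K).T              # K^{-1} [u, v, 1]^T
    return rays * depth.reshape(-1, 1)                             # [H*W, 3] camera-space points
```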
arXiv Detail & Related papers (2022-07-26T15:48:46Z)
- Unsupervised Monocular Depth Reconstruction of Non-Rigid Scenes [87.91841050957714]
We present an unsupervised monocular framework for dense depth estimation of dynamic scenes.
We derive a training objective that aims to opportunistically preserve pairwise distances between reconstructed 3D points.
Our method provides promising results, demonstrating its capability of reconstructing 3D from challenging videos of non-rigid scenes.
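The pairwise-distance objective can be written as penalising changes in Euclidean distances between corresponding reconstructed 3D points across frames, reflecting the intuition that locally rigid structure keeps its pairwise distances. A minimal sketch, with the exact loss form assumed for illustration:

```python
import torch

def pairwise_distance_loss(points_a, points_b):
    # points_a, points_b: [N, 3] corresponding 3D points reconstructed
    # in two frames. Penalise changes in all pairwise Euclidean distances.
    dist_a = torch.cdist(points_a, points_a)  # [N, N] distances in frame A
    dist_b = torch.cdist(points_b, points_b)  # [N, N] distances in frame B
    return (dist_a - dist_b).abs().mean()
```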
arXiv Detail & Related papers (2020-12-31T16:02:03Z)