NVS-MonoDepth: Improving Monocular Depth Prediction with Novel View Synthesis
- URL: http://arxiv.org/abs/2112.12577v1
- Date: Wed, 22 Dec 2021 12:21:08 GMT
- Title: NVS-MonoDepth: Improving Monocular Depth Prediction with Novel View Synthesis
- Authors: Zuria Bauer and Zuoyue Li and Sergio Orts-Escolano and Miguel Cazorla and Marc Pollefeys and Martin R. Oswald
- Abstract summary: We propose a novel training method, split into three main steps, to improve monocular depth estimation.
Experimental results show that our method achieves state-of-the-art or comparable performance on the KITTI and NYU-Depth-v2 datasets.
- Score: 74.4983052902396
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Building upon the recent progress in novel view synthesis, we propose its application to improving monocular depth estimation. In particular, we propose a novel training method split into three main steps. First, the prediction results of a monocular depth network are warped to an additional viewpoint. Second, we apply an additional image synthesis network, which corrects and improves the quality of the warped RGB image. The output of this network is required to look as similar as possible to the ground-truth view by minimizing the pixel-wise RGB reconstruction error. Third, we reapply the same monocular depth estimation network to the synthesized second viewpoint and ensure that its depth predictions are consistent with the associated ground-truth depth. Experimental results show that our method achieves state-of-the-art or comparable performance on the KITTI and NYU-Depth-v2 datasets with a lightweight and simple vanilla U-Net architecture.
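To make the three steps concrete, here is a minimal PyTorch-style sketch of the training objective. The module and function names (depth_net, synth_net, warp_to_view) are hypothetical placeholders standing in for the paper's components, and the equal loss weighting is an assumption.
```python
import torch.nn.functional as F

def nvs_monodepth_step(depth_net, synth_net, warp_to_view,
                       img_src, img_tgt, depth_gt_tgt, pose, K):
    # Step 1: predict depth for the source view and warp the source RGB
    # image to the additional (target) viewpoint using that depth.
    depth_src = depth_net(img_src)                     # (B, 1, H, W)
    img_warped = warp_to_view(img_src, depth_src, pose, K)

    # Step 2: the synthesis network corrects the warped image; its output
    # must match the ground-truth target view pixel-wise.
    img_synth = synth_net(img_warped)
    loss_rgb = F.l1_loss(img_synth, img_tgt)

    # Step 3: rerun the *same* depth network on the synthesized view and
    # keep its prediction consistent with the target ground-truth depth.
    depth_synth = depth_net(img_synth)
    loss_depth = F.l1_loss(depth_synth, depth_gt_tgt)

    # Equal weighting is an assumption, not taken from the paper.
    return loss_rgb + loss_depth
```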
Related papers
- Uncertainty-guided Optimal Transport in Depth Supervised Sparse-View 3D Gaussian [49.21866794516328]
3D Gaussian splatting has demonstrated impressive performance in real-time novel view synthesis.
Previous approaches have incorporated depth supervision into the training of 3D Gaussians to mitigate overfitting.
We introduce a novel method to supervise the depth distribution of 3D Gaussians, utilizing depth priors with integrated uncertainty estimates.
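As a rough illustration of uncertainty-aware depth supervision, the sketch below down-weights the depth term where the prior is unreliable (a generic heteroscedastic weighting in the style of Kendall & Gal); the paper's actual optimal-transport formulation is more involved and is not reproduced here.
```python
import torch

def uncertainty_weighted_depth_loss(depth_pred, depth_prior, sigma):
    # sigma: per-pixel uncertainty of the depth prior (assumed positive).
    residual = torch.abs(depth_pred - depth_prior)
    # High-sigma pixels contribute less; the log term keeps the model
    # from inflating sigma everywhere to escape supervision.
    return (residual / sigma + torch.log(sigma)).mean()
```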
arXiv Detail & Related papers (2024-05-30T03:18:30Z)
- Single-View View Synthesis in the Wild with Learned Adaptive Multiplane Images [15.614631883233898]
Existing methods have shown promising results by leveraging monocular depth estimation and color inpainting with layered depth representations.
We propose a new method based on the multiplane image (MPI) representation.
The experiments on both synthetic and real datasets demonstrate that our trained model works remarkably well and achieves state-of-the-art results.
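For reference, rendering from an MPI reduces to alpha-compositing a stack of fronto-parallel planes. The sketch below shows only this generic compositing step; the paper's adaptive plane prediction is not modeled here.
```python
import torch

def composite_mpi(rgb_planes, alpha_planes):
    # rgb_planes:   (D, 3, H, W) plane colors, ordered near to far.
    # alpha_planes: (D, 1, H, W) plane opacities in [0, 1].
    out = torch.zeros_like(rgb_planes[0])
    # Standard back-to-front "over" compositing: start at the farthest
    # plane and blend each nearer plane on top.
    for i in range(rgb_planes.shape[0] - 1, -1, -1):
        out = rgb_planes[i] * alpha_planes[i] + out * (1.0 - alpha_planes[i])
    return out
```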
arXiv Detail & Related papers (2022-05-24T02:57:16Z)
- Monocular Depth Estimation Primed by Salient Point Detection and Normalized Hessian Loss [43.950140695759764]
We propose an accurate and lightweight framework for monocular depth estimation based on a self-attention mechanism stemming from salient point detection.
We introduce a normalized Hessian loss term invariant to scaling and shear along the depth direction, which is shown to substantially improve the accuracy.
The proposed method achieves state-of-the-art results on NYU-Depth-v2 and KITTI while using models with 3.1 to 38.4 times fewer parameters than baseline approaches.
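One plausible reading of such a loss is sketched below: finite-difference second derivatives ignore planar (shear) components of depth, and normalizing the Hessian removes global scale, which matches the claimed invariances. The details are illustrative assumptions and may differ from the paper's exact definition.
```python
import torch

def normalized_hessian_loss(d_pred, d_gt, eps=1e-6):
    # d_pred, d_gt: (B, 1, H, W) depth maps.
    def hessian(d):
        # Finite-difference second derivatives, cropped to a shared
        # (H-2, W-2) interior so the components can be stacked.
        dxx = (d[..., :, 2:] - 2 * d[..., :, 1:-1] + d[..., :, :-2])[..., 1:-1, :]
        dyy = (d[..., 2:, :] - 2 * d[..., 1:-1, :] + d[..., :-2, :])[..., :, 1:-1]
        dxy = (d[..., 1:, 1:] - d[..., 1:, :-1]
               - d[..., :-1, 1:] + d[..., :-1, :-1])[..., :-1, :-1]
        return torch.stack([dxx, dyy, dxy], dim=-1)

    h_pred, h_gt = hessian(d_pred), hessian(d_gt)
    # Per-pixel normalization removes depth scale; the second-order
    # differences have already removed shear (linear ramp) components.
    h_pred = h_pred / (h_pred.norm(dim=-1, keepdim=True) + eps)
    h_gt = h_gt / (h_gt.norm(dim=-1, keepdim=True) + eps)
    return (h_pred - h_gt).abs().mean()
```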
arXiv Detail & Related papers (2021-08-25T07:51:09Z)
- VolumeFusion: Deep Depth Fusion for 3D Scene Reconstruction [71.83308989022635]
In this paper, we advocate that replicating the traditional two-stage framework with deep neural networks improves both the interpretability and the accuracy of the results.
Our network operates in two steps: 1) the local computation of depth maps with a deep MVS technique, and 2) the fusion of the depth maps and image features to build a single TSDF volume.
In order to improve the matching performance between images acquired from very different viewpoints, we introduce a rotation-invariant 3D convolution kernel called PosedConv.
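The second stage this paper revisits, fusing posed depth maps into a TSDF volume, is a classical technique; below is a plain NumPy sketch of standard Curless & Levoy style TSDF integration. It does not model the paper's learned fusion of image features or the PosedConv kernel.
```python
import numpy as np

def fuse_depth_maps_tsdf(depth_maps, poses, K, vol_origin, voxel_size,
                         dims, trunc):
    # depth_maps: list of (H, W) depth images; poses: (4, 4) cam-to-world
    # matrices; K: (3, 3) intrinsics; dims: (X, Y, Z) voxel grid size.
    tsdf = np.ones(np.prod(dims), dtype=np.float32)
    weight = np.zeros(np.prod(dims), dtype=np.float32)

    # World-space coordinates of every voxel center.
    grid = np.stack(np.meshgrid(*[np.arange(d) for d in dims],
                                indexing="ij"), axis=-1).reshape(-1, 3)
    pts = vol_origin + voxel_size * grid

    for depth, pose in zip(depth_maps, poses):
        h, w = depth.shape
        # Bring voxel centers into the camera frame and project them.
        w2c = np.linalg.inv(pose)
        cam = pts @ w2c[:3, :3].T + w2c[:3, 3]
        z = cam[:, 2]
        proj = cam @ K.T
        u = np.round(proj[:, 0] / np.clip(z, 1e-6, None)).astype(int)
        v = np.round(proj[:, 1] / np.clip(z, 1e-6, None)).astype(int)
        ok = (z > 0) & (u >= 0) & (u < w) & (v >= 0) & (v < h)
        d = np.zeros_like(z)
        d[ok] = depth[v[ok], u[ok]]
        # Truncated signed distance along the viewing ray.
        sdf = np.clip((d - z) / trunc, -1.0, 1.0)
        upd = ok & (d > 0) & (sdf > -1.0)
        # Running weighted average per voxel.
        tsdf[upd] = (tsdf[upd] * weight[upd] + sdf[upd]) / (weight[upd] + 1)
        weight[upd] += 1.0
    return tsdf.reshape(dims), weight.reshape(dims)
```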
arXiv Detail & Related papers (2021-08-19T11:33:58Z)
- Unpaired Single-Image Depth Synthesis with cycle-consistent Wasserstein GANs [1.0499611180329802]
Real-time estimation of actual environment depth is an essential module for various autonomous system tasks.
In this study, the latest advances in generative neural networks are leveraged for fully unsupervised single-image depth synthesis.
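For context, cycle-consistent Wasserstein GANs are typically trained with a gradient-penalty critic objective (WGAN-GP). The sketch below shows only that standard critic loss; the paper's full cycle setup and architectures are not reproduced, and `critic` is a placeholder module.
```python
import torch

def wgan_gp_critic_loss(critic, real, fake, lambda_gp=10.0):
    # Gradient penalty on random interpolates between real and fake.
    eps = torch.rand(real.size(0), 1, 1, 1, device=real.device)
    mix = (eps * real + (1 - eps) * fake).requires_grad_(True)
    grad = torch.autograd.grad(critic(mix).sum(), mix, create_graph=True)[0]
    penalty = ((grad.flatten(1).norm(2, dim=1) - 1) ** 2).mean()
    # The critic maximizes real - fake scores; this is the minimization form.
    return critic(fake).mean() - critic(real).mean() + lambda_gp * penalty
```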
arXiv Detail & Related papers (2021-03-31T09:43:38Z)
- Sparse Auxiliary Networks for Unified Monocular Depth Prediction and Completion [56.85837052421469]
Estimating scene geometry from data obtained with cost-effective sensors is key for robots and self-driving cars.
In this paper, we study the problem of predicting dense depth from a single RGB image with optional sparse measurements from low-cost active depth sensors.
We introduce Sparse Auxiliary Networks (SANs), a new module enabling monodepth networks to perform both depth prediction and completion.
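To illustrate the prediction/completion duality, here is a minimal sketch of an auxiliary branch that optionally injects sparse depth into an image feature map. The layer choices and channel count are illustrative assumptions, not the SAN architecture from the paper.
```python
import torch
import torch.nn as nn

class SparseDepthBranch(nn.Module):
    def __init__(self, feat_ch=64):
        super().__init__()
        # Encodes sparse depth plus its validity mask (2 input channels).
        self.enc = nn.Sequential(
            nn.Conv2d(2, feat_ch, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(feat_ch, feat_ch, 3, padding=1))

    def forward(self, img_feat, sparse_depth=None):
        if sparse_depth is None:
            return img_feat  # pure monocular depth prediction path
        # The mask tells the branch which pixels carry measurements.
        mask = (sparse_depth > 0).float()
        aux = self.enc(torch.cat([sparse_depth, mask], dim=1))
        return img_feat + aux  # completion path: fuse sparse evidence
```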
arXiv Detail & Related papers (2021-03-30T21:22:26Z)
- RGBD-Net: Predicting color and depth images for novel views synthesis [46.233701784858184]
RGBD-Net is proposed to predict depth maps and color images at the target pose in a multi-scale manner.
The results indicate that RGBD-Net generalizes well to previously unseen data.
arXiv Detail & Related papers (2020-11-29T16:42:53Z)
- Pseudo RGB-D for Self-Improving Monocular SLAM and Depth Prediction [72.30870535815258]
Classical monocular SLAM and CNNs for monocular depth prediction represent two largely disjoint approaches towards building a 3D map of the surrounding environment.
We propose a joint narrow- and wide-baseline self-improving framework, in which, on the one hand, the CNN-predicted depth is leveraged to perform pseudo RGB-D feature-based SLAM.
On the other hand, the bundle-adjusted 3D scene structures and camera poses from the more principled geometric SLAM are injected back into the depth network through novel wide baseline losses.
arXiv Detail & Related papers (2020-04-22T16:31:59Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the accuracy of the information provided and is not responsible for any consequences arising from its use.