VA-DepthNet: A Variational Approach to Single Image Depth Prediction
- URL: http://arxiv.org/abs/2302.06556v2
- Date: Wed, 15 Feb 2023 21:17:03 GMT
- Title: VA-DepthNet: A Variational Approach to Single Image Depth Prediction
- Authors: Ce Liu, Suryansh Kumar, Shuhang Gu, Radu Timofte, Luc Van Gool
- Abstract summary: VA-DepthNet is a simple, effective, and accurate deep neural network approach for the single-image depth prediction problem.
The paper demonstrates the usefulness of the proposed approach via extensive evaluation and ablation analysis over several benchmark datasets.
- Score: 163.14849753700682
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We introduce VA-DepthNet, a simple, effective, and accurate deep neural
network approach for the single-image depth prediction (SIDP) problem. The
proposed approach advocates using classical first-order variational constraints
for this problem. While state-of-the-art deep neural network methods for SIDP
learn the scene depth from images in a supervised setting, they often overlook
the invaluable invariances and priors in the rigid scene space, such as the
regularity of the scene. The paper's main contribution is to reveal the benefit
of classical and well-founded variational constraints in the neural network
design for the SIDP task. It is shown that imposing first-order variational
constraints in the scene space together with popular encoder-decoder-based
network architecture design provides excellent results for the supervised SIDP
task. The imposed first-order variational constraint makes the network aware of
the depth gradient in the scene space, i.e., regularity. The paper demonstrates
the usefulness of the proposed approach via extensive evaluation and ablation
analysis over several benchmark datasets, such as KITTI, NYU Depth V2, and SUN
RGB-D. At test time, VA-DepthNet shows considerable improvements in
depth-prediction accuracy compared to prior art and remains accurate in
high-frequency regions of the scene space. At the time of writing, our method,
VA-DepthNet, achieves state-of-the-art results on the KITTI depth-prediction
evaluation benchmark and is the top-performing published approach.
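The abstract's central idea is a first-order variational constraint: penalizing disagreement between the spatial depth gradients of the prediction and the ground truth so the network respects scene regularity. A minimal NumPy sketch of such a gradient-matching loss (a generic illustration of the idea, not the paper's exact formulation):

```python
import numpy as np

def gradient_loss(pred, gt):
    """First-order (gradient-matching) loss between two HxW depth maps.

    Penalizes the difference between spatial depth gradients of the
    prediction and the ground truth, making the loss sensitive to scene
    regularity rather than absolute depth alone. Illustrative sketch only;
    VA-DepthNet's actual constraint is defined in the paper.
    """
    # Forward differences along rows (y) and columns (x).
    dy_p, dx_p = np.diff(pred, axis=0), np.diff(pred, axis=1)
    dy_g, dx_g = np.diff(gt, axis=0), np.diff(gt, axis=1)
    # L1 penalty on the gradient residuals.
    return np.abs(dy_p - dy_g).mean() + np.abs(dx_p - dx_g).mean()
```

Note that a pure first-order term is blind to constant depth offsets (`pred = gt + c` gives zero loss), which is why such a constraint supplements, rather than replaces, a supervised depth loss.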
Related papers
- A Confidence-based Iterative Solver of Depths and Surface Normals for
Deep Multi-view Stereo [41.527018997251744]
We introduce a deep multi-view stereo (MVS) system that jointly predicts depths, surface normals and per-view confidence maps.
The key to our approach is a novel solver that iteratively solves for per-view depth map and normal map.
Our proposed solver consistently improves the depth quality over both conventional and deep learning based MVS pipelines.
arXiv Detail & Related papers (2022-01-19T14:08:45Z)
- 3DVNet: Multi-View Depth Prediction and Volumetric Refinement [68.68537312256144]
3DVNet is a novel multi-view stereo (MVS) depth-prediction method.
Our key idea is the use of a 3D scene-modeling network that iteratively updates a set of coarse depth predictions.
We show that our method exceeds state-of-the-art accuracy in both depth prediction and 3D reconstruction metrics.
arXiv Detail & Related papers (2021-12-01T00:52:42Z)
- NerfingMVS: Guided Optimization of Neural Radiance Fields for Indoor
Multi-view Stereo [97.07453889070574]
We present a new multi-view depth estimation method that utilizes both conventional SfM reconstruction and learning-based priors.
We show that our proposed framework significantly outperforms state-of-the-art methods on indoor scenes.
arXiv Detail & Related papers (2021-09-02T17:54:31Z)
- Monocular Depth Estimation Primed by Salient Point Detection and
Normalized Hessian Loss [43.950140695759764]
We propose an accurate and lightweight framework for monocular depth estimation based on a self-attention mechanism stemming from salient point detection.
We introduce a normalized Hessian loss term invariant to scaling and shear along the depth direction, which is shown to substantially improve the accuracy.
The proposed method achieves state-of-the-art results on NYU-Depth-v2 and KITTI while using 3.1-38.4 times smaller model in terms of the number of parameters than baseline approaches.
arXiv Detail & Related papers (2021-08-25T07:51:09Z)
- PLADE-Net: Towards Pixel-Level Accuracy for Self-Supervised Single-View
Depth Estimation with Neural Positional Encoding and Distilled Matting Loss [49.66736599668501]
We propose a self-supervised single-view pixel-level accurate depth estimation network, called PLADE-Net.
Our method shows unprecedented accuracy levels, exceeding 95% in terms of the $\delta_1$ metric on the KITTI dataset.
arXiv Detail & Related papers (2021-03-12T15:54:46Z)
- CodeVIO: Visual-Inertial Odometry with Learned Optimizable Dense Depth [83.77839773394106]
We present a lightweight, tightly-coupled deep depth network and visual-inertial odometry system.
We provide the network with previously marginalized sparse features from VIO to increase the accuracy of initial depth prediction.
We show that it can run in real-time with single-thread execution while utilizing GPU acceleration only for the network and code Jacobian.
arXiv Detail & Related papers (2020-12-18T09:42:54Z)
- Deep Semantic Matching with Foreground Detection and Cycle-Consistency [103.22976097225457]
We address weakly supervised semantic matching based on a deep network.
We explicitly estimate the foreground regions to suppress the effect of background clutter.
We develop cycle-consistent losses to enforce the predicted transformations across multiple images to be geometrically plausible and consistent.
arXiv Detail & Related papers (2020-03-31T22:38:09Z)
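Several of the entries above report accuracy via the δ1 metric on KITTI and NYU Depth V2. A minimal sketch of its standard definition (the fraction of pixels whose depth ratio falls below 1.25; not tied to any single paper's evaluation code):

```python
import numpy as np

def delta1_accuracy(pred, gt, threshold=1.25):
    """Fraction of pixels with max(pred/gt, gt/pred) below `threshold`.

    delta1 uses threshold 1.25; delta2 and delta3 use 1.25**2 and 1.25**3.
    Assumes strictly positive depths; invalid pixels should be masked out
    before calling.
    """
    ratio = np.maximum(pred / gt, gt / pred)
    return float((ratio < threshold).mean())
```

A score of 1.0 means every predicted depth is within 25% of the ground truth in ratio terms.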
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.