Joint Prediction of Monocular Depth and Structure using Planar and
Parallax Geometry
- URL: http://arxiv.org/abs/2207.06351v1
- Date: Wed, 13 Jul 2022 17:04:05 GMT
- Title: Joint Prediction of Monocular Depth and Structure using Planar and
Parallax Geometry
- Authors: Hao Xing, Yifan Cao, Maximilian Biber, Mingchuan Zhou, Darius Burschka
- Abstract summary: Supervised learning depth estimation methods can achieve good performance when trained on high-quality ground-truth, like LiDAR data.
We propose a novel approach combining structure information from a promising Plane and Parallax geometry pipeline with depth information into a U-Net supervised learning network.
Our model performs impressively on depth prediction for thin objects and edges, and it is more robust than the structure prediction baseline.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Supervised depth estimation methods can achieve good performance
when trained on high-quality ground truth, such as LiDAR data. However, LiDAR
generates only sparse 3D maps, which leads to information loss, and
high-quality per-pixel ground-truth depth is difficult to acquire. To overcome
this limitation, we propose a novel approach combining
structure information from a promising Plane and Parallax geometry pipeline
with depth information into a U-Net supervised learning network, which results
in quantitative and qualitative improvement compared to existing popular
learning-based methods. In particular, the model is evaluated on two
large-scale and challenging datasets, the KITTI Vision Benchmark and the
Cityscapes dataset, and achieves the best performance in terms of relative
error. Compared with pure depth-supervision models, our model performs
impressively on depth prediction for thin objects and edges, and compared to
the structure prediction baseline, it is more robust.
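The Plane and Parallax decomposition that the pipeline builds on can be sketched concretely: after warping one frame onto another with the homography of a reference plane, the residual flow points along lines through the epipole, and its magnitude encodes the projective structure gamma = H/Z (height above the plane over depth) up to a global scale. A minimal NumPy sketch of that recovery, with hypothetical names and array shapes (not the paper's code):

```python
import numpy as np

def parallax_structure(residual_flow, pixels, epipole):
    """Per-pixel projective structure gamma = H/Z from residual parallax.

    residual_flow: (N, 2) displacement left after warping by the
                   reference-plane homography
    pixels:        (N, 2) pixel coordinates in the aligned frame
    epipole:       (2,)   epipole of the camera translation

    gamma is recovered only up to a global scale that depends on the
    camera translation and the distance to the reference plane.
    """
    to_epipole = pixels - epipole                       # parallax direction
    dist = np.linalg.norm(to_epipole, axis=1) + 1e-9
    # Project the residual onto the epipolar direction; its signed
    # magnitude divided by the distance to the epipole gives gamma
    # up to scale (the classic plane-plus-parallax decomposition).
    along = np.sum(residual_flow * to_epipole, axis=1) / dist
    return along / dist
```

The recovered gamma (rather than metric depth) is the "structure information" that a network like the proposed U-Net can consume alongside depth supervision.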
Related papers
- DepthSplat: Connecting Gaussian Splatting and Depth [90.06180236292866]
We present DepthSplat to connect Gaussian splatting and depth estimation.
We first contribute a robust multi-view depth model by leveraging pre-trained monocular depth features.
We also show that Gaussian splatting can serve as an unsupervised pre-training objective.
arXiv Detail & Related papers (2024-10-17T17:59:58Z)
- Plane2Depth: Hierarchical Adaptive Plane Guidance for Monocular Depth Estimation [38.81275292687583]
We propose Plane2Depth, which adaptively utilizes plane information to improve depth prediction within a hierarchical framework.
In the proposed plane guided depth generator (PGDG), we design a set of plane queries as prototypes to softly model planes in the scene and predict per-pixel plane coefficients.
In the proposed adaptive plane query aggregation (APGA) module, we introduce a novel feature interaction approach to improve the aggregation of multi-scale plane features.
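For context, per-pixel plane coefficients determine depth in closed form under a pinhole model: if a pixel lies on the plane n . P = d, its depth is d divided by the plane normal dotted with the back-projected ray. A hedged sketch of this generic geometry (not the paper's PGDG implementation; names are hypothetical):

```python
import numpy as np

def depth_from_plane(n, d, uv, K_inv):
    """Depth of pixel (u, v) assuming it lies on the plane n . P = d.

    n:     (3,) plane normal in camera coordinates
    d:     plane offset, so points P on the plane satisfy n . P = d
    uv:    (2,) pixel coordinates
    K_inv: (3, 3) inverse camera intrinsics
    """
    ray = K_inv @ np.array([uv[0], uv[1], 1.0])  # back-projected ray
    return d / (n @ ray)                         # since P = depth * ray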
arXiv Detail & Related papers (2024-09-04T07:45:06Z)
- Robust Geometry-Preserving Depth Estimation Using Differentiable Rendering [93.94371335579321]
We propose a learning framework that trains models to predict geometry-preserving depth without requiring extra data or annotations.
Comprehensive experiments underscore our framework's superior generalization capabilities.
Our innovative loss functions empower the model to autonomously recover domain-specific scale-and-shift coefficients.
arXiv Detail & Related papers (2023-09-18T12:36:39Z)
- PillarNeXt: Rethinking Network Designs for 3D Object Detection in LiDAR Point Clouds [29.15589024703907]
In this paper, we revisit the local point aggregators from the perspective of allocating computational resources.
We find that the simplest pillar based models perform surprisingly well considering both accuracy and latency.
Our results challenge the common intuition that detailed geometry modeling is essential for high-performance 3D object detection.
arXiv Detail & Related papers (2023-05-08T17:59:14Z)
- Deep Planar Parallax for Monocular Depth Estimation [24.801102342402828]
In-depth analysis reveals that flow pre-training can improve the network's use of consecutive-frame modeling.
We also propose Planar Position Embedding to handle dynamic objects that defy static scene assumptions.
arXiv Detail & Related papers (2023-01-09T06:02:36Z)
- SC-DepthV3: Robust Self-supervised Monocular Depth Estimation for Dynamic Scenes [58.89295356901823]
Self-supervised monocular depth estimation has shown impressive results in static scenes.
However, it relies on the multi-view consistency assumption for training, which is violated in dynamic object regions.
We introduce an external pretrained monocular depth estimation model for generating single-image depth prior.
Our model can predict sharp and accurate depth maps, even when training from monocular videos of highly-dynamic scenes.
arXiv Detail & Related papers (2022-11-07T16:17:47Z)
- DenseLiDAR: A Real-Time Pseudo Dense Depth Guided Depth Completion Network [3.1447111126464997]
We propose DenseLiDAR, a novel real-time pseudo-depth guided depth completion neural network.
We exploit a dense pseudo-depth map obtained from simple morphological operations to guide the network.
Our model achieves state-of-the-art performance at the highest frame rate, 50 Hz.
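The kind of "simple morphological operations" the summary mentions can be illustrated with a crude hole-filling pass: empty pixels take the nearest-surface (minimum non-zero) depth within a small window, a stand-in for the dilation/closing a real pipeline would use. A hypothetical NumPy sketch, not the DenseLiDAR code:

```python
import numpy as np

def pseudo_dense_depth(sparse, k=3):
    """Fill empty pixels (value 0) of a sparse LiDAR depth map with the
    minimum valid depth in a k x k window (a crude morphological fill).
    Choosing the minimum keeps the nearest surface, which is usually the
    safer guess for occlusion boundaries."""
    H, W = sparse.shape
    pad = k // 2
    padded = np.pad(sparse, pad, mode="constant")
    out = sparse.copy()
    for i in range(H):
        for j in range(W):
            if out[i, j] == 0:
                win = padded[i:i + k, j:j + k]
                vals = win[win > 0]
                if vals.size:
                    out[i, j] = vals.min()
    return out
```

A real-time system would replace the Python loops with a vectorized or GPU max/min filter, but the idea is the same: densify first, then let the network refine.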
arXiv Detail & Related papers (2021-08-28T14:18:29Z)
- Learning Geometry-Guided Depth via Projective Modeling for Monocular 3D Object Detection [70.71934539556916]
We learn geometry-guided depth estimation with projective modeling to advance monocular 3D object detection.
Specifically, a principled geometry formula with projective modeling of 2D and 3D depth predictions in the monocular 3D object detection network is devised.
Without extra data, our method improves the detection performance of the state-of-the-art monocular method by a remarkable 2.80% on the moderate test setting.
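The projective relation such geometry formulas exploit is the classic pinhole identity: an object of physical height H at depth Z projects to h = f * H / Z pixels, so depth follows directly from a predicted 3D height and an observed 2D box height. A minimal sketch of that identity (illustrative only, not the paper's exact formulation):

```python
def depth_from_height(f_pixels, height_3d_m, height_2d_px):
    """Pinhole relation h = f * H / Z, solved for the depth Z.

    f_pixels:     focal length in pixels
    height_3d_m:  predicted physical object height (meters)
    height_2d_px: observed 2D bounding-box height (pixels)
    """
    return f_pixels * height_3d_m / height_2d_px
```

For example, a 1.5 m tall object spanning 70 pixels under a 700-pixel focal length sits roughly 15 m away.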
arXiv Detail & Related papers (2021-07-29T12:30:39Z)
- Virtual Normal: Enforcing Geometric Constraints for Accurate and Robust Depth Prediction [87.08227378010874]
We show the importance of the high-order 3D geometric constraints for depth prediction.
By designing a loss term that enforces a simple geometric constraint, we significantly improve the accuracy and robustness of monocular depth estimation.
We show state-of-the-art results of learning metric depth on NYU Depth-V2 and KITTI.
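A high-order constraint of this kind can be sketched concretely: back-project three sampled pixels with their predicted depths, take the normal of the plane they span, and penalize its deviation from the normal computed with ground-truth depths. A hedged NumPy sketch of the normal computation (names hypothetical, not the paper's code):

```python
import numpy as np

def virtual_normal(depths, uvs, K_inv):
    """Unit normal of the plane through three back-projected pixels.

    depths: (3,) predicted depths for three sampled pixels
    uvs:    (3, 2) pixel coordinates
    K_inv:  (3, 3) inverse camera intrinsics
    """
    rays = (K_inv @ np.c_[uvs, np.ones(3)].T).T   # (3, 3) back-projected rays
    pts = depths[:, None] * rays                  # 3D points in camera frame
    n = np.cross(pts[1] - pts[0], pts[2] - pts[0])
    return n / np.linalg.norm(n)
```

A loss would then compare this normal against the one obtained from ground-truth depth over many sampled triplets, constraining long-range 3D geometry rather than individual pixels.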
arXiv Detail & Related papers (2021-03-07T00:08:21Z)
- SelfVoxeLO: Self-supervised LiDAR Odometry with Voxel-based Deep Neural Networks [81.64530401885476]
We propose a self-supervised LiDAR odometry method, dubbed SelfVoxeLO, to tackle these two difficulties.
Specifically, we propose a 3D convolution network to process the raw LiDAR data directly, which extracts features that better encode the 3D geometric patterns.
We evaluate our method's performances on two large-scale datasets, i.e., KITTI and Apollo-SouthBay.
arXiv Detail & Related papers (2020-10-19T09:23:39Z)
- Towards Better Generalization: Joint Depth-Pose Learning without PoseNet [36.414471128890284]
We tackle the essential problem of scale inconsistency for self-supervised joint depth-pose learning.
Most existing methods assume that a consistent scale of depth and pose can be learned across all input samples.
We propose a novel system that explicitly disentangles scale from the network estimation.
arXiv Detail & Related papers (2020-04-03T00:28:09Z)
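One common way to remove the scale ambiguity such a system must handle is to normalize each predicted depth map by a robust statistic (for example, its median) before computing losses, so the network learns depth only up to scale while the scale factor is handled separately. A minimal sketch of that normalization (an assumption about the general technique, not this paper's exact mechanism):

```python
import numpy as np

def normalize_depth(depth):
    """Divide a depth map by its median so the network target is
    scale-invariant; the returned scale can be estimated or aligned
    by a separate component."""
    scale = np.median(depth)
    return depth / scale, scale
```

After this step, depth and pose estimates drawn from different samples share a consistent (unit-median) scale, which is the inconsistency the paper sets out to disentangle.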
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences of its use.