Probabilistic and Geometric Depth: Detecting Objects in Perspective
- URL: http://arxiv.org/abs/2107.14160v1
- Date: Thu, 29 Jul 2021 16:30:33 GMT
- Title: Probabilistic and Geometric Depth: Detecting Objects in Perspective
- Authors: Tai Wang, Xinge Zhu, Jiangmiao Pang, Dahua Lin
- Abstract summary: 3D object detection is an important capability needed in various practical applications such as driver assistance systems.
Monocular 3D detection, as an economical solution compared to conventional settings relying on binocular vision or LiDAR, has drawn increasing attention recently but still yields unsatisfactory results.
This paper first presents a systematic study on this problem and observes that the current monocular 3D detection problem can be simplified as an instance depth estimation problem.
- Score: 78.00922683083776
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: 3D object detection is an important capability needed in various practical
applications such as driver assistance systems. Monocular 3D detection, as an
economical solution compared to conventional settings relying on binocular
vision or LiDAR, has drawn increasing attention recently but still yields
unsatisfactory results. This paper first presents a systematic study on this
problem and observes that the current monocular 3D detection problem can be
simplified as an instance depth estimation problem: The inaccurate instance
depth blocks all the other 3D attribute predictions from improving the overall
detection performance. However, recent methods directly estimate the depth
based on isolated instances or pixels while ignoring the geometric relations
across different objects, which can be valuable constraints as the key
information about depth is not directly manifest in the monocular image.
Therefore, we construct geometric relation graphs across predicted objects and
use the graph to facilitate depth estimation. As the preliminary depth
estimation of each instance is usually inaccurate in this ill-posed setting, we
incorporate a probabilistic representation to capture the uncertainty. It
provides an important indicator to identify confident predictions and further
guide the depth propagation. Despite the simplicity of the basic idea, our
method obtains significant improvements on KITTI and nuScenes benchmarks,
achieving the 1st place out of all monocular vision-only methods while still
maintaining real-time efficiency. Code and models will be released at
https://github.com/open-mmlab/mmdetection3d.
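The two ideas named in the abstract, a probabilistic depth representation and confidence-guided depth propagation across objects, can be illustrated with a toy sketch. It is not the authors' implementation. The uniform bin centers, the use of the maximum bin probability as a confidence score, and the `offsets` matrix of pairwise depth differences (standing in for the geometric relation graph) are all assumptions made here for illustration.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def probabilistic_depth(logits, bin_centers):
    """Return (expected depth, confidence) for one instance.

    Depth is the expectation over discrete depth bins; confidence is the
    largest bin probability (an assumed, simplified uncertainty proxy).
    """
    probs = softmax(logits)
    depth = sum(p * c for p, c in zip(probs, bin_centers))
    return depth, max(probs)


def propagate_depth(depths, confidences, offsets):
    """Refine each depth with confidence-weighted estimates from all objects.

    offsets[j][i] is a hypothetical geometric estimate of depth_i - depth_j,
    so object j "votes" that object i lies at depths[j] + offsets[j][i].
    """
    n = len(depths)
    refined = []
    for i in range(n):
        weighted_sum = 0.0
        weight_total = 0.0
        for j in range(n):
            estimate = depths[i] if j == i else depths[j] + offsets[j][i]
            weighted_sum += confidences[j] * estimate
            weight_total += confidences[j]
        refined.append(weighted_sum / weight_total)
    return refined


if __name__ == "__main__":
    bins = [10.0, 20.0, 30.0, 40.0]  # assumed bin centers in meters
    print(probabilistic_depth([0.0, 0.0, 0.0, 0.0], bins))
    # Object 0 is confident, object 1 is not; geometry says object 1 is 10 m farther.
    print(propagate_depth([10.0, 30.0], [0.9, 0.1], [[0.0, 10.0], [-10.0, 0.0]]))
```

In this toy setup the confident object barely moves, while the uncertain one is pulled most of the way toward the depth implied by its confident neighbor, which is the intuition behind using uncertainty to guide propagation.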
Related papers
- GUPNet++: Geometry Uncertainty Propagation Network for Monocular 3D Object Detection [95.8940731298518]
We propose a novel Geometry Uncertainty Propagation Network (GUPNet++).
It models the uncertainty propagation relationship of the geometry projection during training, improving the stability and efficiency of the end-to-end model learning.
Experiments show that the proposed approach not only obtains state-of-the-art (SOTA) performance in image-based monocular 3D detection but also demonstrates superiority in efficacy with a simplified framework.
arXiv Detail & Related papers (2023-10-24T08:45:15Z)
- Self-supervised 3D Object Detection from Monocular Pseudo-LiDAR [9.361704310981196]
We propose a method for predicting absolute depth and detecting 3D objects using only monocular image sequences.
As a result, the proposed method surpasses other existing methods in performance on the KITTI 3D dataset.
arXiv Detail & Related papers (2022-09-20T05:55:49Z)
- Monocular 3D Object Detection with Depth from Motion [74.29588921594853]
We take advantage of camera ego-motion for accurate object depth estimation and detection.
Our framework, named Depth from Motion (DfM), then uses the established geometry to lift 2D image features to the 3D space and detects 3D objects thereon.
Our framework outperforms state-of-the-art methods by a large margin on the KITTI benchmark.
arXiv Detail & Related papers (2022-07-26T15:48:46Z)
- Learning Geometry-Guided Depth via Projective Modeling for Monocular 3D Object Detection [70.71934539556916]
We learn geometry-guided depth estimation with projective modeling to advance monocular 3D object detection.
Specifically, a principled geometry formula with projective modeling of 2D and 3D depth predictions in the monocular 3D object detection network is devised.
Our method remarkably improves the detection performance of the state-of-the-art monocular-based method without extra data by 2.80% on the moderate test setting.
arXiv Detail & Related papers (2021-07-29T12:30:39Z)
- Geometry-aware data augmentation for monocular 3D object detection [18.67567745336633]
This paper focuses on monocular 3D object detection, one of the essential modules in autonomous driving systems.
A key challenge is that the depth recovery problem is ill-posed in monocular data.
We conduct a thorough analysis to reveal how existing methods fail to robustly estimate depth when different geometry shifts occur.
We convert the aforementioned manipulations into four corresponding 3D-aware data augmentation techniques.
arXiv Detail & Related papers (2021-04-12T23:12:48Z)
- Virtual Normal: Enforcing Geometric Constraints for Accurate and Robust Depth Prediction [87.08227378010874]
We show the importance of the high-order 3D geometric constraints for depth prediction.
By designing a loss term that enforces a simple geometric constraint, we significantly improve the accuracy and robustness of monocular depth estimation.
We show state-of-the-art results of learning metric depth on NYU Depth-V2 and KITTI.
arXiv Detail & Related papers (2021-03-07T00:08:21Z)
- Categorical Depth Distribution Network for Monocular 3D Object Detection [7.0405916639906785]
A key challenge in monocular 3D detection is accurately predicting object depth.
Many methods attempt to directly estimate depth to assist in 3D detection, but show limited performance as a result of depth inaccuracy.
We propose Categorical Depth Distribution Network (CaDDN) to project rich contextual feature information to the appropriate depth interval in 3D space.
We validate our approach on the KITTI 3D object detection benchmark, where we rank 1st among published monocular methods.
arXiv Detail & Related papers (2021-03-01T16:08:29Z)
- Occlusion-Aware Depth Estimation with Adaptive Normal Constraints [85.44842683936471]
We present a new learning-based method for multi-frame depth estimation from a color video.
Our method outperforms the state-of-the-art in terms of depth estimation accuracy.
arXiv Detail & Related papers (2020-04-02T07:10:45Z)
This list is automatically generated from the titles and abstracts of the papers in this site.