Diversity Matters: Fully Exploiting Depth Clues for Reliable Monocular 3D Object Detection
- URL: http://arxiv.org/abs/2205.09373v1
- Date: Thu, 19 May 2022 08:12:55 GMT
- Title: Diversity Matters: Fully Exploiting Depth Clues for Reliable Monocular 3D Object Detection
- Authors: Zhuoling Li, Zhan Qu, Yang Zhou, Jianzhuang Liu, Haoqian Wang, Lihui Jiang
- Abstract summary: We propose a depth solving system that fully explores the visual clues from the subtasks of monocular 3D object detection (M3OD).
Our method surpasses the current best method by a relative margin of more than 20% on the Moderate level of the KITTI 3D object detection test split.
- Score: 37.37316176663782
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: As an inherently ill-posed problem, depth estimation from single images is
the most challenging part of monocular 3D object detection (M3OD). Many
existing methods rely on preconceived assumptions to bridge the missing spatial
information in monocular images, and predict a sole depth value for every
object of interest. However, these assumptions do not always hold in practical
applications. To tackle this problem, we propose a depth solving system that
fully explores the visual clues from the subtasks in M3OD and generates
multiple estimations for the depth of each target. Since the depth estimations
rely on different assumptions in essence, they present diverse distributions.
Even if some assumptions collapse, the estimations established on the remaining
assumptions are still reliable. In addition, we develop a depth selection and
combination strategy. This strategy is able to remove abnormal estimations
caused by collapsed assumptions, and adaptively combine the remaining
estimations into a single one. In this way, our depth solving system becomes
more precise and robust. Exploiting the clues from multiple subtasks of M3OD
and without introducing any extra information, our method surpasses the current
best method by a relative margin of more than 20% on the Moderate level of the
test split of the KITTI 3D object detection benchmark, while still maintaining
real-time efficiency.
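The selection-and-combination step described in the abstract can be sketched roughly as follows. This is a hypothetical illustration, not the paper's exact formulation: the robust z-score outlier test and inverse-uncertainty weighting are assumptions standing in for the paper's actual selection and combination rules.

```python
import statistics

def combine_depth_estimates(depths, uncertainties, z_thresh=2.0):
    """Fuse multiple per-object depth estimates (illustrative sketch).

    depths: candidate depth values from different subtasks (metres).
    uncertainties: a positive confidence proxy for each estimate.
    Estimates far from the median (by a robust z-score built on the
    median absolute deviation) are treated as products of collapsed
    assumptions and discarded; the survivors are averaged with
    inverse-uncertainty weights.
    """
    med = statistics.median(depths)
    mad = statistics.median(abs(d - med) for d in depths) or 1e-6
    kept = [(d, u) for d, u in zip(depths, uncertainties)
            if abs(d - med) / (1.4826 * mad) < z_thresh]
    if not kept:  # everything rejected: fall back to the robust median
        return med
    weights = [1.0 / u for _, u in kept]
    return sum(d * (1.0 / u) for d, u in kept) / sum(weights)
```

With estimates [10.0, 10.2, 9.8, 30.0] and equal uncertainties, the 30 m outlier is discarded and the remaining three average to 10.0 m, which matches the abstract's intuition: even if one assumption collapses, the estimates built on the remaining assumptions stay reliable.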
Related papers
- MonoCD: Monocular 3D Object Detection with Complementary Depths [9.186673054867866]
Depth estimation is an essential but challenging subtask of monocular 3D object detection.
We propose to increase the complementarity of depths with two novel designs.
Experiments on the KITTI benchmark demonstrate that our method achieves state-of-the-art performance without introducing extra data.
arXiv Detail & Related papers (2024-04-04T03:30:49Z) - Unveiling the Depths: A Multi-Modal Fusion Framework for Challenging Scenarios [103.72094710263656]
This paper presents a novel approach that identifies and integrates dominant cross-modality depth features with a learning-based framework.
We propose a novel confidence loss steering a confidence predictor network to yield a confidence map specifying latent potential depth areas.
With the resulting confidence map, we propose a multi-modal fusion network that produces the final depth in an end-to-end manner.
arXiv Detail & Related papers (2024-02-19T04:39:16Z) - Densely Constrained Depth Estimator for Monocular 3D Object Detection [48.12271792836015]
Estimating accurate 3D locations of objects from monocular images is a challenging problem because of the lack of depth information.
We propose a method that utilizes dense projection constraints from edges of any direction.
The proposed method achieves state-of-the-art performance on the KITTI and WOD benchmarks.
arXiv Detail & Related papers (2022-07-20T17:24:22Z) - Probabilistic and Geometric Depth: Detecting Objects in Perspective [78.00922683083776]
3D object detection is an important capability needed in various practical applications such as driver assistance systems.
Monocular 3D detection, as an economical solution compared to conventional settings relying on binocular vision or LiDAR, has drawn increasing attention recently but still yields unsatisfactory results.
This paper first presents a systematic study on this problem and observes that the current monocular 3D detection problem can be simplified as an instance depth estimation problem.
arXiv Detail & Related papers (2021-07-29T16:30:33Z) - Geometry Uncertainty Projection Network for Monocular 3D Object Detection [138.24798140338095]
We propose a Geometry Uncertainty Projection Network (GUP Net) to tackle the error amplification problem at both inference and training stages.
Specifically, a GUP module is proposed to obtain the geometry-guided uncertainty of the inferred depth.
At the training stage, we propose a Hierarchical Task Learning strategy to reduce the instability caused by error amplification.
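The geometry projection underlying such uncertainty estimates is the pinhole relation depth ≈ f · H3D / h2D (focal length times 3D object height over 2D box height). A minimal sketch of propagating height uncertainty into depth uncertainty follows; the function name and first-order propagation are illustrative assumptions, not GUP Net's exact mechanism.

```python
def projected_depth(f_pixels, h3d, h2d, sigma_h3d):
    """Pinhole depth from object height (illustrative sketch).

    f_pixels: camera focal length in pixels.
    h3d, sigma_h3d: estimated 3D object height (m) and its std dev.
    h2d: observed 2D bounding-box height in pixels.
    depth = f * H3d / h2d; first-order error propagation turns the
    3D-height standard deviation into a depth standard deviation,
    showing how height errors are amplified for small (distant) boxes.
    """
    depth = f_pixels * h3d / h2d
    sigma_depth = f_pixels * sigma_h3d / h2d
    return depth, sigma_depth
```

For example, with f = 700 px, a 1.5 m tall object spanning 50 px, and a 0.1 m height uncertainty, the depth is 21 m with a 1.4 m standard deviation; halving the box height doubles both, which is the error amplification the GUP Net summary refers to.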
arXiv Detail & Related papers (2021-07-29T06:59:07Z) - Objects are Different: Flexible Monocular 3D Object Detection [87.82253067302561]
We propose a flexible framework for monocular 3D object detection which explicitly decouples the truncated objects and adaptively combines multiple approaches for object depth estimation.
Experiments demonstrate that our method outperforms the state-of-the-art method by a relative 27% on the moderate level and 30% on the hard level of the KITTI benchmark test set.
arXiv Detail & Related papers (2021-04-06T07:01:28Z) - Categorical Depth Distribution Network for Monocular 3D Object Detection [7.0405916639906785]
A key challenge in monocular 3D detection is accurately predicting object depth.
Many methods attempt to directly estimate depth to assist in 3D detection, but show limited performance as a result of depth inaccuracy.
We propose Categorical Depth Distribution Network (CaDDN) to project rich contextual feature information to the appropriate depth interval in 3D space.
We validate our approach on the KITTI 3D object detection benchmark, where we rank 1st among published monocular methods.
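The categorical-depth idea in this summary can be sketched as follows: the network predicts a probability distribution over discrete depth bins, and a continuous depth is read out as the expectation over bin centres. The uniform bin layout, depth range, and function name below are assumptions for illustration only.

```python
import numpy as np

def expected_depth(logits, d_min=2.0, d_max=46.0):
    """Hypothetical readout of a categorical depth distribution.

    logits: (..., K) unnormalised scores over K uniform depth bins
    covering [d_min, d_max].  A numerically stable softmax gives a
    per-location distribution; the expectation over bin centres
    yields a continuous depth value.
    """
    k = logits.shape[-1]
    centres = d_min + (np.arange(k) + 0.5) * (d_max - d_min) / k
    p = np.exp(logits - logits.max(axis=-1, keepdims=True))
    p /= p.sum(axis=-1, keepdims=True)
    return (p * centres).sum(axis=-1)
```

A uniform distribution over the bins returns the midpoint of the depth range (24 m for the defaults above), while sharper logits pull the expectation toward the dominant bin.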
arXiv Detail & Related papers (2021-03-01T16:08:29Z) - Occlusion-Aware Depth Estimation with Adaptive Normal Constraints [85.44842683936471]
We present a new learning-based method for multi-frame depth estimation from a color video.
Our method outperforms the state-of-the-art in terms of depth estimation accuracy.
arXiv Detail & Related papers (2020-04-02T07:10:45Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.