Depth-conditioned Dynamic Message Propagation for Monocular 3D Object
Detection
- URL: http://arxiv.org/abs/2103.16470v1
- Date: Tue, 30 Mar 2021 16:20:24 GMT
- Title: Depth-conditioned Dynamic Message Propagation for Monocular 3D Object
Detection
- Authors: Li Wang, Liang Du, Xiaoqing Ye, Yanwei Fu, Guodong Guo, Xiangyang Xue,
Jianfeng Feng, Li Zhang
- Abstract summary: We learn context- and depth-aware feature representation to solve the problem of monocular 3D object detection.
We show state-of-the-art results among the monocular-based approaches on the KITTI benchmark dataset.
- Score: 86.25022248968908
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The objective of this paper is to learn context- and depth-aware feature
representation to solve the problem of monocular 3D object detection. We make
following contributions: (i) rather than appealing to the complicated
pseudo-LiDAR based approach, we propose a depth-conditioned dynamic message
propagation (DDMP) network to effectively integrate the multi-scale depth
information with the image context;(ii) this is achieved by first adaptively
sampling context-aware nodes in the image context and then dynamically
predicting hybrid depth-dependent filter weights and affinity matrices for
propagating information; (iii) by augmenting a center-aware depth encoding
(CDE) task, our method successfully alleviates the inaccurate depth prior; (iv)
we thoroughly demonstrate the effectiveness of our proposed approach and show
state-of-the-art results among the monocular-based approaches on the KITTI
benchmark dataset. Particularly, we rank $1^{st}$ in the highly competitive
KITTI monocular 3D object detection track on the submission day (November 16th,
2020). Code and models are released at \url{https://github.com/fudan-zvg/DDMP}
Related papers
- PMPNet: Pixel Movement Prediction Network for Monocular Depth Estimation in Dynamic Scenes [7.736445799116692]
We propose a novel method for monocular depth estimation in dynamic scenes.
We first explore the arbitrariness of object's movement trajectory in dynamic scenes theoretically.
To overcome the depth inconsistency problem around the edges, we propose a deformable support window module.
arXiv Detail & Related papers (2024-11-04T03:42:29Z) - RadOcc: Learning Cross-Modality Occupancy Knowledge through Rendering
Assisted Distillation [50.35403070279804]
3D occupancy prediction is an emerging task that aims to estimate the occupancy states and semantics of 3D scenes using multi-view images.
We propose RadOcc, a Rendering assisted distillation paradigm for 3D Occupancy prediction.
arXiv Detail & Related papers (2023-12-19T03:39:56Z) - Depth Estimation Matters Most: Improving Per-Object Depth Estimation for
Monocular 3D Detection and Tracking [47.59619420444781]
Approaches to monocular 3D perception including detection and tracking often yield inferior performance when compared to LiDAR-based techniques.
We propose a multi-level fusion method that combines different representations (RGB and pseudo-LiDAR) and temporal information across multiple frames for objects (tracklets) to enhance per-object depth estimation.
arXiv Detail & Related papers (2022-06-08T03:37:59Z) - Depth-Cooperated Trimodal Network for Video Salient Object Detection [13.727763221832532]
We propose a depth-operated triOD network called DCTNet for video salient object detection (VS)
To this end, we first generate depth from RGB frames, and then propose an approach to treat the three modalities unequally.
We also introduce a refinement fusion module (RFM) to suppress noises in each modality and select useful information dynamically for further feature refinement.
arXiv Detail & Related papers (2022-02-12T13:04:16Z) - MDS-Net: A Multi-scale Depth Stratification Based Monocular 3D Object
Detection Algorithm [4.958840734249869]
This paper proposes a one-stage monocular 3D object detection algorithm based on multi-scale depth stratification.
Experiments on the KITTI benchmark show that the MDS-Net outperforms the existing monocular 3D detection methods in 3D detection and BEV detection tasks.
arXiv Detail & Related papers (2022-01-12T07:11:18Z) - Learning Geometry-Guided Depth via Projective Modeling for Monocular 3D Object Detection [70.71934539556916]
We learn geometry-guided depth estimation with projective modeling to advance monocular 3D object detection.
Specifically, a principled geometry formula with projective modeling of 2D and 3D depth predictions in the monocular 3D object detection network is devised.
Our method remarkably improves the detection performance of the state-of-the-art monocular-based method without extra data by 2.80% on the moderate test setting.
arXiv Detail & Related papers (2021-07-29T12:30:39Z) - M3DSSD: Monocular 3D Single Stage Object Detector [82.25793227026443]
We propose a Monocular 3D Single Stage object Detector (M3DSSD) with feature alignment and asymmetric non-local attention.
The proposed M3DSSD achieves significantly better performance than the monocular 3D object detection methods on the KITTI dataset.
arXiv Detail & Related papers (2021-03-24T13:09:11Z) - Monocular 3D Object Detection with Sequential Feature Association and
Depth Hint Augmentation [12.55603878441083]
FADNet is presented to address the task of monocular 3D object detection.
A dedicated depth hint module is designed to generate row-wise features named as depth hints.
The contributions of this work are validated by conducting experiments and ablation study on the KITTI benchmark.
arXiv Detail & Related papers (2020-11-30T07:19:14Z) - Reinforced Axial Refinement Network for Monocular 3D Object Detection [160.34246529816085]
Monocular 3D object detection aims to extract the 3D position and properties of objects from a 2D input image.
Conventional approaches sample 3D bounding boxes from the space and infer the relationship between the target object and each of them, however, the probability of effective samples is relatively small in the 3D space.
We propose to start with an initial prediction and refine it gradually towards the ground truth, with only one 3d parameter changed in each step.
This requires designing a policy which gets a reward after several steps, and thus we adopt reinforcement learning to optimize it.
arXiv Detail & Related papers (2020-08-31T17:10:48Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.