IDMS: Instance Depth for Multi-scale Monocular 3D Object Detection
- URL: http://arxiv.org/abs/2212.01528v1
- Date: Sat, 3 Dec 2022 04:02:31 GMT
- Title: IDMS: Instance Depth for Multi-scale Monocular 3D Object Detection
- Authors: Chao Hu, Liqiang Zhu, Weibing Qiu, Weijie Wu
- Abstract summary: A multi-scale perception module based on dilated convolution is designed to enhance the model's ability to process targets of different scales.
Experiments on the KITTI test and evaluation sets show that the proposed method improves AP40 by 5.27% for the car category compared with the baseline method.
- Score: 1.7710335706046505
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Monocular 3D object detection suffers from the lack of depth
information in images and consequently poor detection accuracy, so we propose
an instance-depth method for multi-scale monocular 3D object detection.
Firstly, to enhance the model's ability to process targets of different
scales, we design a multi-scale perception module based on dilated
convolution; considering the inconsistency between feature maps of different
scales, the depth features containing multi-scale information are re-refined
along both the spatial and channel directions. Secondly, to give the model
better 3D perception, we use instance depth information as an auxiliary
learning task to enhance the spatial depth features of 3D targets, and
supervise the auxiliary task with sparse instance depth. Finally, evaluating
the proposed algorithm on the KITTI test and evaluation sets shows that,
compared with the baseline method, it improves AP40 by 5.27% for the car
category, effectively improving the detection performance of the monocular 3D
object detection algorithm.
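The two ideas in the abstract can be illustrated with a minimal, hypothetical sketch in plain Python (the function names and the 1-D setting are illustrative assumptions, not the paper's implementation): (1) a dilated convolution covers a wider receptive field as the dilation rate grows, which is how a multi-scale perception module can attend to targets of different sizes with the same kernel; (2) a sparse instance-depth auxiliary loss is computed only at positions where a depth label exists.

```python
# Illustrative sketch only: 1-D dilated convolution and a sparse-label loss.
# The real method operates on 2-D CNN feature maps; names are hypothetical.

def dilated_conv1d(signal, kernel, dilation):
    """Valid-mode 1-D convolution with a dilated kernel.

    The effective receptive field is (len(kernel) - 1) * dilation + 1,
    so larger dilation rates 'see' wider context with the same weights.
    """
    k = len(kernel)
    span = (k - 1) * dilation + 1
    return [
        sum(kernel[j] * signal[i + j * dilation] for j in range(k))
        for i in range(len(signal) - span + 1)
    ]

def multi_scale_features(signal, kernel, rates=(1, 2, 4)):
    """One feature sequence per dilation rate, i.e. one per scale."""
    return {r: dilated_conv1d(signal, kernel, r) for r in rates}

def sparse_depth_loss(pred, target, mask):
    """Mean L1 loss restricted to positions with an instance depth label."""
    pairs = [(p, t) for p, t, m in zip(pred, target, mask) if m]
    if not pairs:
        return 0.0
    return sum(abs(p - t) for p, t in pairs) / len(pairs)

# A unit impulse makes the receptive-field growth visible:
impulse = [0, 0, 1, 0, 0, 0, 0, 0]
feats = multi_scale_features(impulse, [1, 1, 1])
```

With dilation 1 the impulse response spans 3 adjacent outputs; with dilation 2 the same 3-tap kernel responds at outputs two positions apart, mimicking how higher dilation rates capture larger-scale structure without extra parameters.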
Related papers
- Towards Unified 3D Object Detection via Algorithm and Data Unification [70.27631528933482]
We build the first unified multi-modal 3D object detection benchmark MM-Omni3D and extend the aforementioned monocular detector to its multi-modal version.
We name the designed monocular and multi-modal detectors as UniMODE and MM-UniMODE, respectively.
arXiv Detail & Related papers (2024-02-28T18:59:31Z)
- Depth-discriminative Metric Learning for Monocular 3D Object Detection [14.554132525651868]
We introduce a novel metric learning scheme that encourages the model to extract depth-discriminative features regardless of the visual attributes.
Our method consistently improves the performance of various baselines by 23.51% and 5.78% on average.
arXiv Detail & Related papers (2024-01-02T07:34:09Z)
- Depth Estimation Matters Most: Improving Per-Object Depth Estimation for Monocular 3D Detection and Tracking [47.59619420444781]
Approaches to monocular 3D perception including detection and tracking often yield inferior performance when compared to LiDAR-based techniques.
We propose a multi-level fusion method that combines different representations (RGB and pseudo-LiDAR) and temporal information across multiple frames for objects (tracklets) to enhance per-object depth estimation.
arXiv Detail & Related papers (2022-06-08T03:37:59Z)
- MDS-Net: A Multi-scale Depth Stratification Based Monocular 3D Object Detection Algorithm [4.958840734249869]
This paper proposes a one-stage monocular 3D object detection algorithm based on multi-scale depth stratification.
Experiments on the KITTI benchmark show that the MDS-Net outperforms the existing monocular 3D detection methods in 3D detection and BEV detection tasks.
arXiv Detail & Related papers (2022-01-12T07:11:18Z)
- Learning Geometry-Guided Depth via Projective Modeling for Monocular 3D Object Detection [70.71934539556916]
We learn geometry-guided depth estimation with projective modeling to advance monocular 3D object detection.
Specifically, a principled geometry formula with projective modeling of 2D and 3D depth predictions in the monocular 3D object detection network is devised.
Our method remarkably improves the detection performance of the state-of-the-art monocular-based method without extra data by 2.80% on the moderate test setting.
arXiv Detail & Related papers (2021-07-29T12:30:39Z)
- Aug3D-RPN: Improving Monocular 3D Object Detection by Synthetic Images with Virtual Depth [64.29043589521308]
We propose a rendering module to augment the training data by synthesizing images with virtual-depths.
The rendering module takes as input the RGB image and its corresponding sparse depth image, outputs a variety of photo-realistic synthetic images.
Besides, we introduce an auxiliary module to improve the detection model by jointly optimizing it through a depth estimation task.
arXiv Detail & Related papers (2021-07-28T11:00:47Z)
- M3DSSD: Monocular 3D Single Stage Object Detector [82.25793227026443]
We propose a Monocular 3D Single Stage object Detector (M3DSSD) with feature alignment and asymmetric non-local attention.
The proposed M3DSSD achieves significantly better performance than the monocular 3D object detection methods on the KITTI dataset.
arXiv Detail & Related papers (2021-03-24T13:09:11Z)
- IAFA: Instance-aware Feature Aggregation for 3D Object Detection from a Single Image [37.83574424518901]
3D object detection from a single image is an important task in Autonomous Driving.
We propose an instance-aware approach to aggregate useful information for improving the accuracy of 3D object detection.
arXiv Detail & Related papers (2021-03-05T05:47:52Z)
- PLUME: Efficient 3D Object Detection from Stereo Images [95.31278688164646]
Existing methods tackle the problem in two steps: first, depth estimation is performed and a pseudo-LiDAR point cloud representation is computed from the depth estimates; then object detection is performed in 3D space.
We propose a model that unifies these two tasks in the same metric space.
Our approach achieves state-of-the-art performance on the challenging KITTI benchmark, with significantly reduced inference time compared with existing methods.
arXiv Detail & Related papers (2021-01-17T05:11:38Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.