BSH-Det3D: Improving 3D Object Detection with BEV Shape Heatmap
- URL: http://arxiv.org/abs/2303.02000v1
- Date: Fri, 3 Mar 2023 15:13:11 GMT
- Title: BSH-Det3D: Improving 3D Object Detection with BEV Shape Heatmap
- Authors: You Shen, Yunzhou Zhang, Yanmin Wu, Zhenyu Wang, Linghao Yang, Sonya
Coleman, Dermot Kerr
- Abstract summary: We propose a novel LiDAR-based 3D object detection model named BSH-Det3D.
It enhances spatial features by estimating complete object shapes from a bird's eye view.
Experiments on the KITTI benchmark show state-of-the-art (SOTA) performance in terms of accuracy and speed.
- Score: 10.060577111347152
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The progress of LiDAR-based 3D object detection has significantly enhanced
developments in autonomous driving and robotics. However, due to the
limitations of LiDAR sensors, object shapes suffer from deterioration in
occluded and distant areas, which creates a fundamental challenge to 3D
perception. Existing methods estimate specific 3D shapes and achieve remarkable
performance. However, these methods rely on extensive computation and memory,
causing imbalances between accuracy and real-time performance. To tackle this
challenge, we propose a novel LiDAR-based 3D object detection model named
BSH-Det3D, which enhances spatial features by estimating complete object
shapes from a bird's eye view (BEV). Specifically, we
design the Pillar-based Shape Completion (PSC) module to predict the occupancy
probability of each pillar, i.e., whether it contains part of an object's shape. The PSC
module generates a BEV shape heatmap for each scene. After integrating with
heatmaps, BSH-Det3D can provide additional information in shape deterioration
areas and generate high-quality 3D proposals. We also design an attention-based
densification fusion module (ADF) to adaptively associate the sparse features
with heatmaps and raw points. The ADF module integrates the advantages of
point and shape knowledge with negligible overhead. Extensive experiments on
the KITTI benchmark show state-of-the-art (SOTA) performance in terms of
accuracy and speed, demonstrating the efficiency and flexibility of BSH-Det3D.
The source code is available on https://github.com/mystorm16/BSH-Det3D.
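The core PSC idea, rating each BEV pillar by occupancy, can be illustrated with a minimal numpy sketch. All grid sizes and the function name are hypothetical, and this version only rasterizes the observed points, whereas the actual PSC module *predicts* occupancy (including occluded pillars) with a learned network:

```python
import numpy as np

def bev_occupancy_heatmap(points, x_range=(0.0, 70.4), y_range=(-40.0, 40.0),
                          pillar_size=0.16):
    """Count LiDAR points per BEV pillar and normalize to a [0, 1] heatmap.

    points: (N, 3) array of x, y, z coordinates. Hypothetical sketch, not
    the authors' implementation: the real PSC module predicts occupancy
    for pillars whose points are missing due to occlusion or distance.
    """
    # Grid resolution along x and y (round to dodge float division noise).
    nx = int(round((x_range[1] - x_range[0]) / pillar_size))
    ny = int(round((y_range[1] - y_range[0]) / pillar_size))
    # Pillar index of each point; z is ignored (pillars are full-height).
    ix = ((points[:, 0] - x_range[0]) / pillar_size).astype(int)
    iy = ((points[:, 1] - y_range[0]) / pillar_size).astype(int)
    mask = (ix >= 0) & (ix < nx) & (iy >= 0) & (iy < ny)
    heatmap = np.zeros((nx, ny), dtype=np.float32)
    # Unbuffered accumulation so repeated indices all count.
    np.add.at(heatmap, (ix[mask], iy[mask]), 1.0)
    if heatmap.max() > 0:
        heatmap /= heatmap.max()
    return heatmap
```

A heatmap like this, once completed by the learned network, is what the ADF module fuses back with the sparse point features.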
Related papers
- STONE: A Submodular Optimization Framework for Active 3D Object Detection [20.54906045954377]
A key requirement for training an accurate 3D object detector is the availability of a large amount of LiDAR-based point cloud data.
This paper proposes a unified active 3D object detection framework that greatly reduces the labeling cost of training 3D object detectors.
arXiv Detail & Related papers (2024-10-04T20:45:33Z)
- VFMM3D: Releasing the Potential of Image by Vision Foundation Model for Monocular 3D Object Detection [80.62052650370416]
Monocular 3D object detection holds significant importance across various applications, including autonomous driving and robotics.
In this paper, we present VFMM3D, an innovative framework that leverages the capabilities of Vision Foundation Models (VFMs) to accurately transform single-view images into LiDAR point cloud representations.
arXiv Detail & Related papers (2024-04-15T03:12:12Z)
- Robust 3D Tracking with Quality-Aware Shape Completion [67.9748164949519]
We propose a synthetic target representation for robust 3D tracking: dense, complete point clouds, obtained via shape completion, that precisely depict the target shape.
Specifically, we design a voxelized 3D tracking framework with shape completion, in which we propose a quality-aware shape completion mechanism to alleviate the adverse effect of noisy historical predictions.
arXiv Detail & Related papers (2023-12-17T04:50:24Z)
- SOGDet: Semantic-Occupancy Guided Multi-view 3D Object Detection [19.75965521357068]
We propose a novel approach called SOGDet (Semantic-Occupancy Guided Multi-view 3D Object Detection) to improve the accuracy of 3D object detection.
Our results show that SOGDet consistently enhances the performance of three baseline methods in terms of nuScenes Detection Score (NDS) and mean Average Precision (mAP).
This indicates that combining 3D object detection and 3D semantic occupancy leads to a more comprehensive perception of the 3D environment, thereby helping to build more robust autonomous driving systems.
arXiv Detail & Related papers (2023-08-26T07:38:21Z)
- BEV-IO: Enhancing Bird's-Eye-View 3D Detection with Instance Occupancy [58.92659367605442]
We present BEV-IO, a new 3D detection paradigm to enhance BEV representation with instance occupancy information.
We show that BEV-IO can outperform state-of-the-art methods while only adding a negligible increase in parameters and computational overhead.
arXiv Detail & Related papers (2023-05-26T11:16:12Z)
- RBGNet: Ray-based Grouping for 3D Object Detection [104.98776095895641]
We propose the RBGNet framework, a voting-based 3D detector for accurate 3D object detection from point clouds.
We propose a ray-based feature grouping module, which aggregates the point-wise features on object surfaces using a group of determined rays.
Our model achieves state-of-the-art 3D detection performance on ScanNet V2 and SUN RGB-D with remarkable performance gains.
arXiv Detail & Related papers (2022-04-05T14:42:57Z)
- Improved Pillar with Fine-grained Feature for 3D Object Detection [23.348710029787068]
3D object detection with LiDAR point clouds plays an important role in the perception module of autonomous driving.
Existing point-based methods struggle to meet speed requirements because they must process too many raw points.
The 2D grid-based methods, such as PointPillar, can easily achieve a stable and efficient speed based on simple 2D convolution.
arXiv Detail & Related papers (2021-10-12T14:53:14Z)
- Aug3D-RPN: Improving Monocular 3D Object Detection by Synthetic Images with Virtual Depth [64.29043589521308]
We propose a rendering module to augment the training data by synthesizing images with virtual-depths.
The rendering module takes as input the RGB image and its corresponding sparse depth image, outputs a variety of photo-realistic synthetic images.
Besides, we introduce an auxiliary module to improve the detection model by jointly optimizing it through a depth estimation task.
arXiv Detail & Related papers (2021-07-28T11:00:47Z)
- HVPR: Hybrid Voxel-Point Representation for Single-stage 3D Object Detection [39.64891219500416]
3D object detection methods exploit either voxel-based or point-based features to represent 3D objects in a scene.
In this paper, we introduce a novel single-stage 3D detection method that combines the merits of both voxel-based and point-based features.
arXiv Detail & Related papers (2021-04-02T06:34:49Z)
- Accurate 3D Object Detection using Energy-Based Models [46.05502630457458]
Regressing 3D bounding boxes in cluttered environments based on sparse LiDAR data is a highly challenging problem.
We explore recent advances in conditional energy-based models (EBMs) for probabilistic regression.
Our proposed approach consistently outperforms the SA-SSD baseline across all 3DOD metrics.
arXiv Detail & Related papers (2020-12-08T18:53:42Z)
- Reinforced Axial Refinement Network for Monocular 3D Object Detection [160.34246529816085]
Monocular 3D object detection aims to extract the 3D position and properties of objects from a 2D input image.
Conventional approaches sample 3D bounding boxes from the space and infer the relationship between the target object and each of them; however, the probability of effective samples is relatively small in 3D space.
We propose to start with an initial prediction and refine it gradually towards the ground truth, changing only one 3D parameter in each step.
This requires designing a policy which gets a reward after several steps, and thus we adopt reinforcement learning to optimize it.
arXiv Detail & Related papers (2020-08-31T17:10:48Z)
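The one-parameter-per-step refinement described for Reinforced Axial Refinement can be sketched with a toy coordinate-descent loop. The function name, the 3-parameter box state, and the greedy axis selection are all hypothetical stand-ins for the learned RL policy in the paper:

```python
import numpy as np

def refine_box(init, target, step=0.1, max_steps=100):
    """Greedily adjust one box parameter per step toward the target.

    init/target: arrays of box parameters (e.g. x, y, z). A greedy
    stand-in for the learned policy: pick the axis with the largest
    residual, move one step along it, snap when within one step.
    """
    box = np.asarray(init, dtype=float).copy()
    target = np.asarray(target, dtype=float)
    for _ in range(max_steps):
        residual = target - box
        i = int(np.argmax(np.abs(residual)))       # worst parameter first
        if abs(residual[i]) < step:
            box[i] = target[i]                     # snap when close enough
        else:
            box[i] += step * np.sign(residual[i])  # one step along axis i
        if np.allclose(box, target):
            break
    return box
```

In the actual method the "target" is unknown at test time, so the per-axis move is chosen by a policy trained with a delayed reward rather than by reading off the residual.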
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.