Point Density-Aware Voxels for LiDAR 3D Object Detection
- URL: http://arxiv.org/abs/2203.05662v1
- Date: Thu, 10 Mar 2022 22:11:06 GMT
- Title: Point Density-Aware Voxels for LiDAR 3D Object Detection
- Authors: Jordan S. K. Hu, Tianshu Kuai, Steven L. Waslander
- Abstract summary: Point Density-Aware Voxel network (PDV) is an end-to-end two-stage LiDAR 3D object detection architecture.
PDV efficiently localizes voxel features from the 3D sparse convolution backbone through voxel point centroids.
PDV outperforms all state-of-the-art methods on the Waymo Open Dataset.
- Score: 8.136649838488042
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: LiDAR has become one of the primary 3D object detection sensors in autonomous
driving. However, LiDAR's diverging point pattern with increasing distance
results in a non-uniformly sampled point cloud ill-suited to discretized
volumetric feature extraction. Current methods either rely on voxelized point
clouds or use inefficient farthest point sampling to mitigate detrimental
effects caused by density variation but largely ignore point density as a
feature and its predictable relationship with distance from the LiDAR sensor.
Our proposed solution, Point Density-Aware Voxel network (PDV), is an
end-to-end two-stage LiDAR 3D object detection architecture that is designed to
account for these point density variations. PDV efficiently localizes voxel
features from the 3D sparse convolution backbone through voxel point centroids.
The spatially localized voxel features are then aggregated through a
density-aware RoI grid pooling module using kernel density estimation (KDE) and
self-attention with point density positional encoding. Finally, we exploit
LiDAR's point density-to-distance relationship to refine our final bounding box
confidences. PDV outperforms all state-of-the-art methods on the Waymo Open
Dataset and achieves competitive results on the KITTI dataset. We provide a
code release for PDV which is available at https://github.com/TRAILab/PDV.
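The density mechanisms named in the abstract lend themselves to a compact illustration. Below is a minimal NumPy sketch of the first two ideas, voxel point centroids and KDE-based density features evaluated at query (e.g. RoI grid) points, with assumed array shapes and an illustrative bandwidth; it is not the released PDV code, which is linked above.

```python
import numpy as np

def voxel_point_centroids(points, voxel_size=0.1):
    """Group points by voxel index and return the mean (centroid) per occupied voxel.

    points: (N, 3) array of LiDAR points.
    Returns (V, 3) centroids and the (N,) voxel id of each point.
    """
    idx = np.floor(points / voxel_size).astype(np.int64)      # (N, 3) voxel grid coords
    _, inverse = np.unique(idx, axis=0, return_inverse=True)  # voxel id per point
    n_voxels = inverse.max() + 1
    sums = np.zeros((n_voxels, 3))
    counts = np.zeros(n_voxels)
    np.add.at(sums, inverse, points)
    np.add.at(counts, inverse, 1.0)
    return sums / counts[:, None], inverse

def kde_density(queries, points, bandwidth=0.5):
    """Gaussian kernel density estimate of local point density at each query
    location, usable as an extra pooled feature."""
    d2 = ((queries[:, None, :] - points[None, :, :]) ** 2).sum(-1)  # (Q, N) squared distances
    kernel = np.exp(-0.5 * d2 / bandwidth**2)
    norm = points.shape[0] * (bandwidth * np.sqrt(2 * np.pi)) ** 3
    return kernel.sum(axis=1) / norm                                # (Q,) density estimates

# Toy usage: centroids localize voxel features; KDE scores grid points by local density.
pts = np.random.randn(1000, 3) * np.array([10.0, 10.0, 1.0])
centroids, _ = voxel_point_centroids(pts, voxel_size=0.5)
grid = np.random.randn(16, 3)
density = kde_density(grid, pts)
```

Per the abstract, the centroid locations carry the sparse backbone's voxel features into RoI grid pooling, where the density estimates enter as additional features alongside self-attention with density positional encoding.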
Related papers
- Sparse-to-Dense LiDAR Point Generation by LiDAR-Camera Fusion for 3D Object Detection [9.076003184833557]
We propose the LiDAR-Camera Augmentation Network (LCANet), a novel framework that reconstructs LiDAR point cloud data by fusing 2D image features.
LCANet fuses data from LiDAR sensors by projecting image features into 3D space, integrating semantic information into the point cloud data.
This fusion effectively compensates for LiDAR's weakness in detecting objects at long distances, which are often represented by sparse points.
arXiv Detail & Related papers (2024-09-23T13:03:31Z)
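The projection step underlying this kind of LiDAR-camera fusion is the standard pinhole model: transform points into the camera frame, project with the intrinsics, then sample image features at the resulting pixels. A minimal sketch, with a hypothetical intrinsic matrix and an identity extrinsic standing in for real calibration; LCANet's actual fusion network is more involved.

```python
import numpy as np

def project_points_to_image(points, K, T_cam_lidar):
    """Project LiDAR points into the image plane so per-pixel image features
    (e.g. semantics) can be sampled back onto the point cloud.

    points: (N, 3) LiDAR points; K: (3, 3) camera intrinsics;
    T_cam_lidar: (4, 4) LiDAR-to-camera extrinsic transform.
    Returns (N, 2) pixel coordinates and an (N,) validity mask.
    """
    homo = np.hstack([points, np.ones((points.shape[0], 1))])  # (N, 4) homogeneous coords
    cam = (T_cam_lidar @ homo.T).T[:, :3]                      # points in camera frame
    valid = cam[:, 2] > 0.1                                    # keep points in front of camera
    uvw = (K @ cam.T).T
    uv = uvw[:, :2] / uvw[:, 2:3]                              # perspective divide
    return uv, valid

# Hypothetical calibration; real values come from the dataset's calibration files.
K = np.array([[720.0, 0.0, 640.0], [0.0, 720.0, 360.0], [0.0, 0.0, 1.0]])
T = np.eye(4)
uv, valid = project_points_to_image(np.random.rand(100, 3) * 20 + [0, 0, 1], K, T)
```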
- PointOcc: Cylindrical Tri-Perspective View for Point-based 3D Semantic Occupancy Prediction [72.75478398447396]
We propose a cylindrical tri-perspective view to represent point clouds effectively and comprehensively.
Considering the distance distribution of LiDAR point clouds, we construct the tri-perspective view in the cylindrical coordinate system.
We employ spatial group pooling to maintain structural details during projection and adopt 2D backbones to efficiently process each TPV plane.
arXiv Detail & Related papers (2023-08-31T17:57:17Z)
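The cylindrical coordinate system this entry relies on is straightforward to compute. A minimal sketch of binning points into a cylindrical grid, with illustrative grid sizes; pooling such bins along each of the three axes would yield the TPV planes the paper describes.

```python
import numpy as np

def cylindrical_voxel_index(points, n_rho=128, n_phi=360, n_z=32,
                            rho_max=50.0, z_min=-3.0, z_max=5.0):
    """Convert Cartesian LiDAR points to cylindrical coordinates and bin them.

    In a cylindrical grid, distant (sparse) regions get larger cells, which matches
    the radial thinning of LiDAR returns better than a uniform Cartesian grid.
    Returns an (N, 3) array of (rho, phi, z) bin indices.
    """
    rho = np.sqrt(points[:, 0] ** 2 + points[:, 1] ** 2)  # radial distance
    phi = np.arctan2(points[:, 1], points[:, 0])          # azimuth in [-pi, pi)
    i_rho = np.clip((rho / rho_max * n_rho).astype(int), 0, n_rho - 1)
    i_phi = np.clip(((phi + np.pi) / (2 * np.pi) * n_phi).astype(int), 0, n_phi - 1)
    i_z = np.clip(((points[:, 2] - z_min) / (z_max - z_min) * n_z).astype(int), 0, n_z - 1)
    return np.stack([i_rho, i_phi, i_z], axis=1)
```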
- Semantic Segmentation on 3D Point Clouds with High Density Variations [44.467561618769714]
HDVNet contains a nested set of encoder-decoder pathways, each handling a specific point density range.
By effectively handling input density variations, HDVNet outperforms state-of-the-art models in segmentation accuracy on real point clouds with inconsistent density.
arXiv Detail & Related papers (2023-07-04T05:44:13Z)
- Improving LiDAR 3D Object Detection via Range-based Point Cloud Density Optimization [13.727464375608765]
Existing 3D object detectors tend to perform better on point cloud regions close to the LiDAR sensor than on regions farther away.
We observe a learning bias in detection models towards the densely sampled objects near the sensor and show that detection performance can be improved simply by manipulating the input point cloud density at different distance ranges.
arXiv Detail & Related papers (2023-06-09T04:11:43Z)
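The density manipulation described here can be pictured as a range-binned resampling pass over the input cloud. A minimal sketch with hypothetical bin edges and keep ratios; the paper's actual schedule and augmentation strategy differ.

```python
import numpy as np

def resample_by_range(points, edges=(0.0, 20.0, 40.0, np.inf),
                      keep_ratios=(0.5, 1.0, 1.0), seed=0):
    """Randomly drop points per distance band to rebalance near/far density.

    edges define range bins by distance from the sensor; keep_ratios give the
    fraction of points retained in each bin (values here are illustrative).
    """
    rng = np.random.default_rng(seed)
    dist = np.linalg.norm(points[:, :3], axis=1)
    keep = np.zeros(len(points), dtype=bool)
    for lo, hi, ratio in zip(edges[:-1], edges[1:], keep_ratios):
        in_bin = (dist >= lo) & (dist < hi)
        keep |= in_bin & (rng.random(len(points)) < ratio)
    return points[keep]
```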
- Voxel or Pillar: Exploring Efficient Point Cloud Representation for 3D Object Detection [49.324070632356296]
We develop a sparse voxel-pillar encoder that encodes point clouds into voxel and pillar features through 3D and 2D sparse convolutions respectively.
Our efficient, fully sparse method can be seamlessly integrated into both dense and sparse detectors.
arXiv Detail & Related papers (2023-04-06T05:00:58Z)
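The voxel/pillar distinction comes down to whether points are binned in 3D or collapsed over height into 2D BEV cells. A toy scatter-mean over both groupings, assuming a fixed grid resolution; the paper's encoder applies 3D and 2D sparse convolutions on top of such bins.

```python
import numpy as np

def scatter_mean(feats, cell_index):
    """Average point features into the cells given by integer indices."""
    _, inv = np.unique(cell_index, axis=0, return_inverse=True)
    out = np.zeros((inv.max() + 1, feats.shape[1]))
    cnt = np.zeros(inv.max() + 1)
    np.add.at(out, inv, feats)
    np.add.at(cnt, inv, 1.0)
    return out / cnt[:, None]

pts = np.random.rand(500, 3) * [80, 80, 4]
feats = np.random.rand(500, 8)
voxel_idx = np.floor(pts / [0.1, 0.1, 0.2]).astype(int)  # 3D bins -> voxel features
pillar_idx = voxel_idx[:, :2]                            # drop height -> pillar (BEV) features
voxel_feats = scatter_mean(feats, voxel_idx)
pillar_feats = scatter_mean(feats, pillar_idx)
```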
- Sparse2Dense: Learning to Densify 3D Features for 3D Object Detection [85.08249413137558]
LiDAR-produced point clouds are the major source for most state-of-the-art 3D object detectors.
Small, distant, and incomplete objects with sparse or few points are often hard to detect.
We present Sparse2Dense, a new framework to efficiently boost 3D detection performance by learning to densify point clouds in latent space.
arXiv Detail & Related papers (2022-11-23T16:01:06Z)
- On Robust Cross-View Consistency in Self-Supervised Monocular Depth Estimation [56.97699793236174]
We study two kinds of robust cross-view consistency in this paper.
We exploit the temporal coherence in both depth feature space and 3D voxel space for self-supervised monocular depth estimation.
Experimental results on several outdoor benchmarks show that our method outperforms current state-of-the-art techniques.
arXiv Detail & Related papers (2022-09-19T03:46:13Z)
- Fully Sparse 3D Object Detection [57.05834683261658]
We build a fully sparse 3D object detector (FSD) for long-range LiDAR-based object detection.
FSD is built upon the general sparse voxel encoder and a novel sparse instance recognition (SIR) module.
SIR avoids the time-consuming neighbor queries in previous point-based methods by grouping points into instances.
arXiv Detail & Related papers (2022-07-20T17:01:33Z)
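One way to picture instance grouping without neighbor queries is center voting: each point predicts an offset to its object center, and points whose voted centers coincide are grouped together. The sketch below quantizes voted centers to a coarse grid as a stand-in for FSD's actual clustering, which this summary does not detail; the offsets would come from a network head.

```python
import numpy as np

def group_by_voted_center(points, center_offsets, cell=1.0):
    """Group points into instances by quantizing their voted object centers.

    center_offsets: (N, 3) per-point predicted offset to its object center
    (a learned prediction in practice; random values could stand in for a demo).
    Returns an (N,) instance id per point; no pairwise neighbor search needed.
    """
    centers = points + center_offsets               # each point votes for a center
    cells = np.floor(centers / cell).astype(int)    # coarse grid over voted centers
    _, instance_id = np.unique(cells, axis=0, return_inverse=True)
    return instance_id
```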
- RAANet: Range-Aware Attention Network for LiDAR-based 3D Object Detection with Auxiliary Density Level Estimation [11.180128679075716]
Range-Aware Attention Network (RAANet) is developed for 3D object detection from LiDAR data for autonomous driving.
RAANet extracts more powerful BEV features and generates superior 3D object detections.
Experiments on the nuScenes dataset demonstrate that our proposed approach outperforms the state-of-the-art methods for LiDAR-based 3D object detection.
arXiv Detail & Related papers (2021-11-18T04:20:13Z)
- DV-Det: Efficient 3D Point Cloud Object Detection with Dynamic Voxelization [0.0]
We propose a novel two-stage framework for the efficient 3D point cloud object detection.
We parse the raw point cloud data directly in the 3D space yet achieve impressive efficiency and accuracy.
We highlight 75 FPS inference on the KITTI 3D object detection dataset and 25 FPS on the Waymo Open dataset, with satisfactory accuracy.
arXiv Detail & Related papers (2021-07-27T10:07:39Z)
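Dynamic voxelization, in contrast to "hard" voxelization with a fixed points-per-voxel buffer, assigns every point to its voxel and pools features on the fly, so nothing is dropped or padded. A minimal sketch under an assumed voxel size:

```python
import numpy as np

def dynamic_voxelize(points, voxel_size=0.2):
    """Assign every point to a voxel id; no point is dropped and no buffer is padded.

    Contrast with 'hard' voxelization, which caps points per voxel (dropping the
    rest) and pads empty slots so all voxels share a fixed tensor shape.
    """
    idx = np.floor(points / voxel_size).astype(np.int64)
    voxels, point_to_voxel = np.unique(idx, axis=0, return_inverse=True)
    return voxels, point_to_voxel  # features are then pooled per voxel id on the fly

pts = np.random.rand(1000, 3) * 50
voxels, p2v = dynamic_voxelize(pts)
assert p2v.shape[0] == pts.shape[0]  # every point retained
```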
- Cylindrical and Asymmetrical 3D Convolution Networks for LiDAR Segmentation [81.02742110604161]
State-of-the-art methods for large-scale driving-scene LiDAR segmentation often project the point clouds to 2D space and then process them via 2D convolution.
We propose a new framework for outdoor LiDAR segmentation, where cylindrical partition and asymmetrical 3D convolution networks are designed to explore the 3D geometric pattern.
Our method achieves 1st place on the SemanticKITTI leaderboard and outperforms existing methods on nuScenes by a noticeable margin, about 4%.
arXiv Detail & Related papers (2020-11-19T18:53:11Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.