PV-RCNN: Point-Voxel Feature Set Abstraction for 3D Object Detection
- URL: http://arxiv.org/abs/1912.13192v2
- Date: Fri, 9 Apr 2021 06:37:15 GMT
- Title: PV-RCNN: Point-Voxel Feature Set Abstraction for 3D Object Detection
- Authors: Shaoshuai Shi, Chaoxu Guo, Li Jiang, Zhe Wang, Jianping Shi, Xiaogang
Wang, Hongsheng Li
- Abstract summary: We present a novel and high-performance 3D object detection framework, named PointVoxel-RCNN (PV-RCNN).
Our proposed method deeply integrates both 3D voxel Convolutional Neural Network (CNN) and PointNet-based set abstraction.
It takes advantage of the efficient learning and high-quality proposals of the 3D voxel CNN and the flexible receptive fields of the PointNet-based networks.
- Score: 76.30585706811993
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: We present a novel and high-performance 3D object detection framework, named
PointVoxel-RCNN (PV-RCNN), for accurate 3D object detection from point clouds.
Our proposed method deeply integrates both 3D voxel Convolutional Neural
Network (CNN) and PointNet-based set abstraction to learn more discriminative
point cloud features. It takes advantage of the efficient learning and
high-quality proposals of the 3D voxel CNN and the flexible receptive fields of
the PointNet-based networks. Specifically, the proposed framework summarizes
the 3D scene with a 3D voxel CNN into a small set of keypoints via a novel
voxel set abstraction module to save follow-up computations and also to encode
representative scene features. Given the high-quality 3D proposals generated by
the voxel CNN, the RoI-grid pooling is proposed to abstract proposal-specific
features from the keypoints to the RoI-grid points via keypoint set abstraction
with multiple receptive fields. Compared with conventional pooling operations,
the RoI-grid feature points encode much richer context information for
accurately estimating object confidences and locations. Extensive experiments
on both the KITTI dataset and the Waymo Open dataset show that our proposed
PV-RCNN surpasses state-of-the-art 3D detection methods with remarkable margins
by using only point clouds. Code is available at
https://github.com/open-mmlab/OpenPCDet.
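The two core operations the abstract describes — sampling a small set of keypoints that summarize the scene, then aggregating neighborhood features at multiple receptive fields via set abstraction — can be illustrated with a minimal NumPy sketch. This is a simplified illustration of the general idea, not the OpenPCDet implementation: function names are hypothetical, farthest point sampling stands in for the paper's keypoint sampling, and raw point features stand in for the multi-scale voxel CNN features that PV-RCNN actually aggregates.

```python
import numpy as np

def farthest_point_sampling(points, n_keypoints):
    """Greedily pick n_keypoints mutually distant points from (N, 3) array.
    Returns the indices of the chosen keypoints."""
    n = points.shape[0]
    chosen = np.zeros(n_keypoints, dtype=int)
    dist = np.full(n, np.inf)          # distance to nearest chosen point
    chosen[0] = 0                      # start from an arbitrary point
    for i in range(1, n_keypoints):
        d = np.linalg.norm(points - points[chosen[i - 1]], axis=1)
        dist = np.minimum(dist, d)
        chosen[i] = int(np.argmax(dist))  # farthest from all chosen so far
    return chosen

def set_abstraction(keypoints, points, feats, radii):
    """For each keypoint, max-pool features of neighbors within each radius,
    then concatenate across radii (multiple receptive fields).
    keypoints: (K, 3); points: (N, 3); feats: (N, C); radii: list of floats.
    Returns (K, C * len(radii))."""
    out = []
    for r in radii:
        pooled = np.zeros((keypoints.shape[0], feats.shape[1]))
        for k, kp in enumerate(keypoints):
            mask = np.linalg.norm(points - kp, axis=1) <= r
            if mask.any():
                pooled[k] = feats[mask].max(axis=0)
        out.append(pooled)
    return np.concatenate(out, axis=1)
```

In PV-RCNN the same grouping-and-pooling pattern is reused twice: once in the voxel set abstraction module (keypoints query voxel features) and once in RoI-grid pooling (RoI-grid points query keypoint features), which is why a small keypoint set can carry scene context into the refinement stage.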
Related papers
- Pillar R-CNN for Point Cloud 3D Object Detection [4.169126928311421]
We devise a conceptually simple yet effective two-stage 3D detection architecture, named Pillar R-CNN.
Our Pillar R-CNN performs favorably against state-of-the-art 3D detectors on the large-scale Waymo Open Dataset.
It should be highlighted that the effective and elegant Pillar R-CNN architecture now enables further exploration of BEV perception for autonomous-driving applications.
arXiv Detail & Related papers (2023-02-26T12:07:25Z)
- From Voxel to Point: IoU-guided 3D Object Detection for Point Cloud with Voxel-to-Point Decoder [79.39041453836793]
We present an Intersection-over-Union (IoU) guided two-stage 3D object detector with a voxel-to-point decoder.
We propose a residual voxel-to-point decoder to extract the point features in addition to the map-view features from the voxel-based Region Proposal Network (RPN).
We propose a simple and efficient method to align the estimated IoUs to the refined proposal boxes as a more relevant localization confidence.
arXiv Detail & Related papers (2021-08-08T14:30:13Z)
- From Multi-View to Hollow-3D: Hallucinated Hollow-3D R-CNN for 3D Object Detection [101.20784125067559]
We propose a new architecture, namely Hallucinated Hollow-3D R-CNN, to address the problem of 3D object detection.
In our approach, we first extract the multi-view features by sequentially projecting the point clouds into the perspective view and the bird's-eye view.
The 3D objects are detected via a box refinement module with a novel Hierarchical Voxel RoI Pooling operation.
arXiv Detail & Related papers (2021-07-30T02:00:06Z)
- PV-RCNN++: Point-Voxel Feature Set Abstraction With Local Vector Representation for 3D Object Detection [100.60209139039472]
We propose the PointVoxel Region-based Convolutional Neural Networks (PV-RCNNs) for accurate 3D detection from point clouds.
Our proposed PV-RCNNs significantly outperform previous state-of-the-art 3D detection methods on both the Waymo Open Dataset and the highly competitive KITTI benchmark.
arXiv Detail & Related papers (2021-01-31T14:51:49Z)
- Voxel R-CNN: Towards High Performance Voxel-based 3D Object Detection [99.16162624992424]
We devise a simple but effective voxel-based framework, named Voxel R-CNN.
By taking full advantage of voxel features in a two-stage approach, our method achieves detection accuracy comparable with state-of-the-art point-based models.
Our results show that Voxel R-CNN delivers a higher detection accuracy while maintaining a real-time frame processing rate, i.e., at a speed of 25 FPS on an NVIDIA 2080 Ti GPU.
arXiv Detail & Related papers (2020-12-31T17:02:46Z)
- Local Grid Rendering Networks for 3D Object Detection in Point Clouds [98.02655863113154]
CNNs are powerful, but directly applying convolutions after voxelizing an entire point cloud into a dense regular 3D grid is computationally costly.
We propose a novel and principled Local Grid Rendering (LGR) operation to render the small neighborhood of a subset of input points into a low-resolution 3D grid independently.
We validate LGR-Net for 3D object detection on the challenging ScanNet and SUN RGB-D datasets.
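The LGR idea — voxelizing only the small neighborhood of a sampled point instead of the whole scene — can be sketched in a few lines of NumPy. This is a hypothetical illustration of local-grid rendering under assumed conventions (an axis-aligned cube around the center, an occupancy-count grid), not the LGR-Net operation itself:

```python
import numpy as np

def render_local_grid(points, center, half_extent, res):
    """Render the neighborhood of `center` into a res x res x res grid.
    Points outside the cube [center - half_extent, center + half_extent]
    are ignored; each occupied cell counts the points falling inside it."""
    grid = np.zeros((res, res, res))
    # Map neighborhood coordinates into [0, 1) relative to the cube.
    rel = (points - center + half_extent) / (2.0 * half_extent)
    inside = np.all((rel >= 0) & (rel < 1), axis=1)
    cells = np.floor(rel[inside] * res).astype(int)
    for x, y, z in cells:
        grid[x, y, z] += 1
    return grid
```

Because each grid covers only a small neighborhood at low resolution, convolutions over these local grids stay cheap compared with convolving a dense grid over the full scene.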
arXiv Detail & Related papers (2020-07-04T13:57:43Z)
- Stereo RGB and Deeper LIDAR Based Network for 3D Object Detection [40.34710686994996]
3D object detection is an emerging task in autonomous driving scenarios.
Previous works process 3D point clouds using either projection-based or voxel-based models.
We propose the Stereo RGB and Deeper LIDAR framework which can utilize semantic and spatial information simultaneously.
arXiv Detail & Related papers (2020-06-09T11:19:24Z)
- Pointwise Attention-Based Atrous Convolutional Neural Networks [15.499267533387039]
A pointwise attention-based atrous convolutional neural network architecture is proposed to efficiently deal with a large number of points.
The proposed model has been evaluated on the two most important 3D point cloud datasets for the 3D semantic segmentation task.
It achieves a reasonable performance compared to state-of-the-art models in terms of accuracy, with a much smaller number of parameters.
arXiv Detail & Related papers (2019-12-27T13:12:58Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.