3D Object Detection Combining Semantic and Geometric Features from Point
Clouds
- URL: http://arxiv.org/abs/2110.04704v1
- Date: Sun, 10 Oct 2021 04:43:27 GMT
- Title: 3D Object Detection Combining Semantic and Geometric Features from Point
Clouds
- Authors: Hao Peng, Guofeng Tong, Zheng Li, Yaqi Wang, Yuyuan Shao
- Abstract summary: We propose a novel end-to-end two-stage 3D object detector named SGNet for point cloud scenes.
The voxel-to-point module (VTPM) is a voxel-point-based module that performs the final 3D object detection in point space.
As of September 19, 2021, on the KITTI dataset, SGNet ranked 1st in 3D and BEV detection of cyclists at the easy difficulty level, and 2nd in 3D detection of cyclists at the moderate level.
- Score: 19.127930862527666
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we investigate the combination of voxel-based and
point-based methods, and propose a novel end-to-end two-stage 3D object
detector named SGNet for point cloud scenes. Voxel-based methods voxelize
the scene into regular grids, which can be processed with current advanced
convolutional feature learning frameworks for semantic feature
learning, whereas point-based methods can better extract the geometric
features of points because the coordinates are preserved. Combining the
two is an effective solution for 3D object detection from point clouds.
However, most current methods use a voxel-based detection head with anchors for
final classification and localization. Although the preset anchors cover the
entire scene, they are not suitable for point cloud detection tasks with larger
scenes and multiple categories because of the limitation of voxel size. In this
paper, we propose a voxel-to-point module (VTPM) that captures semantic and
geometric features. The VTPM is a voxel-point-based module that performs the
final 3D object detection in point space, which is more conducive to the
detection of small objects and avoids preset anchors in the inference
stage. In addition, a Confidence Adjustment Module (CAM) with
center-boundary-aware confidence attention is proposed to resolve the
misalignment between the predicted confidence and the proposals during
region of interest (RoI) selection. SGNet achieves
state-of-the-art results for 3D object detection on the KITTI dataset,
especially in the detection of small objects such as cyclists.
As of September 19, 2021, on the KITTI dataset, SGNet ranked 1st in 3D and BEV
detection of cyclists at the easy difficulty level, and 2nd in 3D detection
of cyclists at the moderate level.
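The abstract's idea of voxelizing a scene and then returning to point space can be illustrated with a minimal sketch. This is not SGNet's VTPM: the function names (`voxelize`, `voxel_centroids`, `voxel_to_point`), the voxel size, and the use of a per-voxel centroid as a stand-in for a learned semantic feature are all assumptions made for illustration only.

```python
from collections import defaultdict
from math import floor


def voxelize(points, voxel_size):
    """Group point indices by the integer index of the voxel containing them."""
    voxels = defaultdict(list)
    for i, (x, y, z) in enumerate(points):
        key = (floor(x / voxel_size), floor(y / voxel_size), floor(z / voxel_size))
        voxels[key].append(i)
    return voxels


def voxel_centroids(points, voxels):
    """Hypothetical voxel feature: the mean coordinate of the points in each voxel."""
    feats = {}
    for key, idxs in voxels.items():
        n = len(idxs)
        feats[key] = tuple(sum(points[i][d] for i in idxs) / n for d in range(3))
    return feats


def voxel_to_point(points, voxels, voxel_feats):
    """Attach to each point its voxel's feature and its own offset from that feature.

    The voxel feature plays the role of coarse (grid-level) information, while
    the offset keeps the exact point coordinates that voxelization discards.
    """
    per_point = [None] * len(points)
    for key, idxs in voxels.items():
        feat = voxel_feats[key]
        for i in idxs:
            offset = tuple(points[i][d] - feat[d] for d in range(3))
            per_point[i] = (feat, offset)
    return per_point


# Toy scene: two points share a voxel, the third falls in a different one.
points = [(0.1, 0.1, 0.1), (0.3, 0.1, 0.1), (1.2, 0.0, 0.0)]
voxels = voxelize(points, voxel_size=0.5)
features = voxel_to_point(points, voxels, voxel_centroids(points, voxels))
```

In this toy scene the first two points fall into the same voxel and share one voxel-level feature, yet each keeps a distinct offset, which is the kind of fine geometric detail a purely voxel-based head would lose.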
Related papers
- 3D Cascade RCNN: High Quality Object Detection in Point Clouds [122.42455210196262]
We present 3D Cascade RCNN, which allocates multiple detectors based on the voxelized point clouds in a cascade paradigm.
We validate the superiority of our proposed 3D Cascade RCNN in comparison with state-of-the-art 3D object detection techniques.
arXiv Detail & Related papers (2022-11-15T15:58:36Z)
- CAGroup3D: Class-Aware Grouping for 3D Object Detection on Point Clouds [55.44204039410225]
We present a novel two-stage fully sparse convolutional 3D object detection framework, named CAGroup3D.
Our proposed method first generates some high-quality 3D proposals by leveraging the class-aware local group strategy on the object surface voxels.
To recover the features of missed voxels due to incorrect voxel-wise segmentation, we build a fully sparse convolutional RoI pooling module.
arXiv Detail & Related papers (2022-10-09T13:38:48Z)
- RBGNet: Ray-based Grouping for 3D Object Detection [104.98776095895641]
We propose the RBGNet framework, a voting-based 3D detector for accurate 3D object detection from point clouds.
We propose a ray-based feature grouping module, which aggregates the point-wise features on object surfaces using a group of determined rays.
Our model achieves state-of-the-art 3D detection performance on ScanNet V2 and SUN RGB-D with remarkable performance gains.
arXiv Detail & Related papers (2022-04-05T14:42:57Z)
- SASA: Semantics-Augmented Set Abstraction for Point-based 3D Object Detection [78.90102636266276]
We propose a novel set abstraction method named Semantics-Augmented Set Abstraction (SASA).
Based on the estimated point-wise foreground scores, we then propose a semantics-guided point sampling algorithm to help retain more important foreground points during down-sampling.
In practice, SASA proves effective at identifying valuable points related to foreground objects and at improving feature learning for point-based 3D detection.
arXiv Detail & Related papers (2022-01-06T08:54:47Z)
- Anchor-free 3D Single Stage Detector with Mask-Guided Attention for Point Cloud [79.39041453836793]
We develop a novel single-stage 3D detector for point clouds in an anchor-free manner.
We overcome this by converting the voxel-based sparse 3D feature volumes into the sparse 2D feature maps.
We propose an IoU-based detection confidence re-calibration scheme to improve the correlation between the detection confidence score and the accuracy of the bounding box regression.
arXiv Detail & Related papers (2021-08-08T13:42:13Z)
- InfoFocus: 3D Object Detection for Autonomous Driving with Dynamic Information Modeling [65.47126868838836]
We propose a novel 3D object detection framework with dynamic information modeling.
Coarse predictions are generated in the first stage via a voxel-based region proposal network.
Experiments are conducted on the large-scale nuScenes 3D detection benchmark.
arXiv Detail & Related papers (2020-07-16T18:27:08Z)
- 1st Place Solution for Waymo Open Dataset Challenge -- 3D Detection and Domain Adaptation [7.807118356899879]
We propose a one-stage, anchor-free and NMS-free 3D point cloud object detector AFDet.
AFDet serves as a strong baseline in our winning solution.
We design stronger networks and enhance the point cloud data using densification and point painting.
arXiv Detail & Related papers (2020-06-28T04:49:39Z)
- Stereo RGB and Deeper LIDAR Based Network for 3D Object Detection [40.34710686994996]
3D object detection has become an emerging task in autonomous driving scenarios.
Previous works process 3D point clouds using either projection-based or voxel-based models.
We propose the Stereo RGB and Deeper LIDAR framework which can utilize semantic and spatial information simultaneously.
arXiv Detail & Related papers (2020-06-09T11:19:24Z)
- Object as Hotspots: An Anchor-Free 3D Object Detection Approach via Firing of Hotspots [37.16690737208046]
We argue for an approach opposite to existing methods using object-level anchors.
Inspired by compositional models, we represent an object as a composition of its interior non-empty voxels, termed hotspots.
Based on Object as Hotspots (OHS), we propose an anchor-free detection head with a novel ground truth assignment strategy.
arXiv Detail & Related papers (2019-12-30T03:02:22Z)
This list is automatically generated from the titles and abstracts of the papers in this site.