Boundary-Aware 3D Object Detection from Point Clouds
- URL: http://arxiv.org/abs/2104.10330v1
- Date: Wed, 21 Apr 2021 03:10:33 GMT
- Title: Boundary-Aware 3D Object Detection from Point Clouds
- Authors: Rui Qian, Xin Lai, Xirong Li
- Abstract summary: We propose BANet for 3D object detection from point clouds.
We represent each proposal as a node for graph construction within a given cut-off threshold.
Our BANet achieves on-par performance on the KITTI 3D detection leaderboard.
- Score: 14.772968858398043
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Currently, existing state-of-the-art 3D object detectors follow a two-stage
paradigm. These methods typically comprise two steps: 1) use a region
proposal network to generate a small set of high-quality proposals in a bottom-up
fashion; 2) resize and pool the semantic features from the proposed regions to
summarize RoI-wise representations for further refinement. Note that the
RoI-wise representations in step 2) are treated as individual, uncorrelated
entries when fed to the subsequent detection heads. Nevertheless, we
observe that the proposals generated in step 1) deviate from the ground truth to
varying degrees, clustering densely in local neighborhoods with an underlying
probability distribution. Challenges arise when a proposal largely loses its boundary
information due to coordinate offsets, while existing networks lack a corresponding
compensation mechanism. In this paper, we propose BANet for 3D
object detection from point clouds. Specifically, instead of refining each
proposal independently as previous works do, we represent each proposal as a
node for graph construction within a given cut-off threshold, associating
proposals in the form of a local neighborhood graph so that the boundary correlations
of an object are explicitly exploited. Besides, we devise a lightweight
Region Feature Aggregation Network to fully exploit voxel-wise, pixel-wise, and
point-wise features with expanding receptive fields for more informative
RoI-wise representations. As of Apr. 17th, 2021, our BANet achieves on-par
performance on the KITTI 3D detection leaderboard and ranks 1st on the Moderate
difficulty of the Car category on the KITTI BEV detection leaderboard. The source
code will be released once the paper is accepted.
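A minimal sketch of the graph-construction step described above: each proposal becomes a node, and edges connect proposals whose box centers lie within a cut-off distance. The cut-off value and the use of Euclidean distance between box centers are assumptions for illustration, not the paper's released implementation.

```python
import numpy as np

def build_proposal_graph(centers, cutoff=2.0):
    """Connect proposals whose centers lie within `cutoff` meters.

    centers: (N, 3) array of proposal box centers.
    Returns an (N, N) boolean adjacency matrix without self-loops.
    """
    diff = centers[:, None, :] - centers[None, :, :]  # (N, N, 3) pairwise offsets
    dist = np.linalg.norm(diff, axis=-1)              # (N, N) Euclidean distances
    adj = dist < cutoff
    np.fill_diagonal(adj, False)                      # a node is not its own neighbor
    return adj

# Toy usage: the first two proposals are neighbors, the third is isolated.
centers = np.array([[0.0, 0.0, 0.0],
                    [1.0, 0.5, 0.0],
                    [9.0, 9.0, 0.0]])
print(build_proposal_graph(centers, cutoff=2.0))
```

Message passing over this adjacency would let densely clustered proposals exchange boundary cues before the final refinement.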
Related papers
- PG-RCNN: Semantic Surface Point Generation for 3D Object Detection [19.341260543105548]
Point Generation R-CNN (PG-RCNN) is a novel end-to-end detector for 3D object detection.
Uses a jointly trained RoI point generation module to process contextual information of RoIs.
For every generated point, PG-RCNN assigns a semantic feature that indicates the estimated foreground probability.
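The per-point foreground probability can be pictured as a small classifier head over each generated point's semantic feature. The single linear layer and feature dimensions below are illustrative assumptions, not PG-RCNN's actual head:

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Hypothetical inputs: M generated points, each with a C-dim semantic feature.
M, C = 128, 64
point_feats = rng.standard_normal((M, C))

# A single linear layer standing in for the foreground-scoring head.
W, b = 0.1 * rng.standard_normal((C, 1)), 0.0
fg_prob = sigmoid(point_feats @ W + b)  # (M, 1) estimated foreground probability

# Points above a threshold would be kept as foreground surface points.
print(int((fg_prob > 0.5).sum()), "of", M, "points scored as foreground")
```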
arXiv Detail & Related papers (2023-07-24T09:22:09Z)
- A Unified BEV Model for Joint Learning of 3D Local Features and Overlap Estimation [12.499361832561634]
We present a unified bird's-eye view (BEV) model for joint learning of 3D local features and overlap estimation.
Our method significantly outperforms existing methods on overlap prediction, especially in scenes with small overlaps.
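Overlap between two scans is commonly defined as the fraction of points in one cloud that have a neighbor in the other within some radius. The brute-force computation below illustrates that definition; the radius value is an assumption, and the paper's contribution is a learned estimator rather than this ground-truth recipe:

```python
import numpy as np

def overlap_ratio(src, dst, radius=0.1):
    """Fraction of `src` points with a `dst` neighbor within `radius`.

    src: (N, 3), dst: (M, 3). Brute force; fine for small clouds.
    """
    dist = np.linalg.norm(src[:, None, :] - dst[None, :, :], axis=-1)  # (N, M)
    return float((dist.min(axis=1) < radius).mean())

# Toy usage: two clouds sharing 120 near-identical points.
src = np.random.default_rng(1).uniform(0, 1, (200, 3))
dst = src[:120] + 0.01
print(round(overlap_ratio(src, dst), 2))
```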
arXiv Detail & Related papers (2023-02-28T12:01:16Z)
- Exploring Active 3D Object Detection from a Generalization Perspective [58.597942380989245]
Uncertainty-based active learning policies fail to balance the trade-off between point cloud informativeness and box-level annotation costs.
We propose CRB, which hierarchically filters out point clouds with redundant 3D bounding box labels.
Experiments show that the proposed approach outperforms existing active learning strategies.
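One way to picture the informativeness-versus-cost trade-off is a greedy selection that ranks unlabeled scans by uncertainty per expected box label, under a total annotation budget. This is a generic active-learning sketch with made-up scores, not CRB's actual hierarchical filtering criteria:

```python
import numpy as np

def select_for_labeling(uncertainty, num_boxes, budget):
    """Greedily pick scans with the best uncertainty per annotation cost.

    uncertainty: (N,) model uncertainty per unlabeled scan.
    num_boxes:   (N,) expected number of boxes to annotate per scan.
    budget:      total number of box labels we can afford.
    """
    value = uncertainty / np.maximum(num_boxes, 1)  # informativeness per label
    chosen, spent = [], 0
    for i in np.argsort(-value):                    # best value first
        if spent + num_boxes[i] <= budget:
            chosen.append(int(i))
            spent += int(num_boxes[i])
    return chosen

# Scan 0 is informative but needs 10 box labels; scans 1 and 2 are cheaper.
print(select_for_labeling(np.array([0.9, 0.8, 0.3]),
                          np.array([10, 2, 1]), budget=5))  # -> [1, 2]
```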
arXiv Detail & Related papers (2023-01-23T02:43:03Z)
- Flattening-Net: Deep Regular 2D Representation for 3D Point Cloud Analysis [66.49788145564004]
We present an unsupervised deep neural architecture called Flattening-Net to represent irregular 3D point clouds of arbitrary geometry and topology.
Our methods perform favorably against the current state-of-the-art competitors.
arXiv Detail & Related papers (2022-12-17T15:05:25Z)
- CAGroup3D: Class-Aware Grouping for 3D Object Detection on Point Clouds [55.44204039410225]
We present a novel two-stage fully sparse convolutional 3D object detection framework, named CAGroup3D.
Our proposed method first generates a set of high-quality 3D proposals by leveraging a class-aware local grouping strategy on the object surface voxels.
To recover the features of missed voxels due to incorrect voxel-wise segmentation, we build a fully sparse convolutional RoI pooling module.
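Class-aware grouping can be pictured as clustering surface voxels separately within each predicted class, so a pedestrian voxel is never grouped with a nearby car voxel. The radius-graph BFS below is a generic stand-in, heavily simplified relative to CAGroup3D:

```python
import numpy as np
from collections import deque

def class_aware_groups(centers, labels, radius=0.5):
    """Cluster voxel centers without mixing predicted classes.

    centers: (N, 3) voxel center coordinates.
    labels:  (N,)  predicted semantic class per voxel.
    Returns a list of index arrays, one per cluster.
    """
    groups = []
    for cls in np.unique(labels):
        idx = np.flatnonzero(labels == cls)
        pts = centers[idx]
        dist = np.linalg.norm(pts[:, None] - pts[None, :], axis=-1)
        seen = np.zeros(len(idx), dtype=bool)
        for s in range(len(idx)):
            if seen[s]:
                continue
            queue, members = deque([s]), []
            seen[s] = True
            while queue:  # BFS over the within-class radius graph
                u = queue.popleft()
                members.append(u)
                for v in np.flatnonzero(dist[u] < radius):
                    if not seen[v]:
                        seen[v] = True
                        queue.append(v)
            groups.append(idx[members])
    return groups

centers = np.array([[0.0, 0, 0], [0.3, 0, 0], [5.0, 5, 0], [0.1, 0.1, 0]])
labels = np.array([1, 1, 1, 2])  # the last voxel belongs to another class
print(class_aware_groups(centers, labels, radius=0.5))
```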
arXiv Detail & Related papers (2022-10-09T13:38:48Z)
- ProposalContrast: Unsupervised Pre-training for LiDAR-based 3D Object Detection [114.54835359657707]
ProposalContrast is an unsupervised point cloud pre-training framework.
It learns robust 3D representations by contrasting region proposals.
ProposalContrast is verified on various 3D detectors.
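Contrasting region proposals typically means pulling two augmented views of the same proposal together in embedding space while pushing other proposals apart, e.g. with an InfoNCE-style loss. The loss below is that generic pattern and may differ from ProposalContrast's exact formulation:

```python
import numpy as np

def info_nce(z1, z2, temperature=0.1):
    """InfoNCE loss over two views of the same N proposals.

    z1, z2: (N, D) L2-normalized embeddings; row i of z1 and z2
    come from two augmentations of the same region proposal.
    """
    logits = (z1 @ z2.T) / temperature            # (N, N) similarities
    logits -= logits.max(axis=1, keepdims=True)   # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_prob)))     # matched pairs are positives

rng = np.random.default_rng(0)
z = rng.standard_normal((8, 32))
z /= np.linalg.norm(z, axis=1, keepdims=True)
view2 = z + 0.05 * rng.standard_normal(z.shape)   # slightly perturbed view
view2 /= np.linalg.norm(view2, axis=1, keepdims=True)
print(info_nce(z, view2))  # small loss: views of the same proposal agree
```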
arXiv Detail & Related papers (2022-07-26T04:45:49Z)
- 3D Object Detection Combining Semantic and Geometric Features from Point Clouds [19.127930862527666]
We propose a novel end-to-end two-stage 3D object detector named SGNet for point clouds scenes.
The VTPM is a Voxel-Point-Based Module that performs the final 3D object detection in point space.
As of September 19, 2021, for KITTI dataset, SGNet ranked 1st in 3D and BEV detection on cyclists with easy difficulty level, and 2nd in the 3D detection of moderate cyclists.
arXiv Detail & Related papers (2021-10-10T04:43:27Z)
- From Voxel to Point: IoU-guided 3D Object Detection for Point Cloud with Voxel-to-Point Decoder [79.39041453836793]
We present an Intersection-over-Union (IoU) guided two-stage 3D object detector with a voxel-to-point decoder.
We propose a residual voxel-to-point decoder to extract point features in addition to the map-view features from the voxel-based Region Proposal Network (RPN).
We propose a simple and efficient method to align the estimated IoUs to the refined proposal boxes as a more relevant localization confidence.
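Using IoU as a localization confidence presupposes a way to score box overlap. For brevity the sketch below computes axis-aligned 3D IoU, ignoring yaw, which a real LiDAR detector would not:

```python
import numpy as np

def iou_3d_axis_aligned(a, b):
    """IoU of two axis-aligned 3D boxes given as (x1, y1, z1, x2, y2, z2)."""
    lo = np.maximum(a[:3], b[:3])                 # intersection lower corner
    hi = np.minimum(a[3:], b[3:])                 # intersection upper corner
    inter = np.prod(np.maximum(hi - lo, 0.0))
    vol_a = np.prod(a[3:] - a[:3])
    vol_b = np.prod(b[3:] - b[:3])
    return float(inter / (vol_a + vol_b - inter))

box_a = np.array([0, 0, 0, 2, 2, 2], dtype=float)
box_b = np.array([1, 1, 1, 3, 3, 3], dtype=float)
print(iou_3d_axis_aligned(box_a, box_b))  # 1 / (8 + 8 - 1) ~= 0.067
```

The IoU estimated for each refined box can then stand in for the raw classification score when ranking boxes at NMS time.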
arXiv Detail & Related papers (2021-08-08T14:30:13Z)
- Cross-Modality 3D Object Detection [63.29935886648709]
We present a novel two-stage multi-modal fusion network for 3D object detection.
The whole architecture facilitates two-stage fusion.
Our experiments on the KITTI dataset show that the proposed multi-stage fusion helps the network to learn better representations.
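Feature-level fusion of image and LiDAR streams is often as simple as concatenating the two per-RoI feature vectors and mixing them with a small MLP. The sketch below shows that generic pattern with placeholder weights; it is not the paper's specific two-stage design:

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(x):
    return np.maximum(x, 0.0)

# Hypothetical per-RoI features from the two modalities.
N, C_img, C_pts, C_out = 16, 256, 128, 128
img_feats = rng.standard_normal((N, C_img))
pts_feats = rng.standard_normal((N, C_pts))

# Concatenate, then mix with one hidden layer (random placeholder weights).
W1 = 0.05 * rng.standard_normal((C_img + C_pts, C_out))
W2 = 0.05 * rng.standard_normal((C_out, C_out))
fused = relu(np.concatenate([img_feats, pts_feats], axis=1) @ W1) @ W2
print(fused.shape)  # (16, 128) fused RoI representation per proposal
```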
arXiv Detail & Related papers (2020-08-16T11:01:20Z)