Boundary-Aware 3D Object Detection from Point Clouds
- URL: http://arxiv.org/abs/2104.10330v1
- Date: Wed, 21 Apr 2021 03:10:33 GMT
- Title: Boundary-Aware 3D Object Detection from Point Clouds
- Authors: Rui Qian, Xin Lai, Xirong Li
- Abstract summary: We propose BANet for 3D object detection from point clouds.
We represent each proposal as a node for graph construction within a given cut-off threshold.
Our BANet achieves on-par performance on the KITTI 3D detection leaderboard.
- Score: 14.772968858398043
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Currently, existing state-of-the-art 3D object detectors follow a two-stage
paradigm. These methods typically comprise two steps: 1) use a region
proposal network to generate a small set of high-quality proposals in a bottom-up
fashion; 2) resize and pool the semantic features from the proposed regions to
summarize RoI-wise representations for further refinement. Note that the
RoI-wise representations in step 2) are treated as individual, uncorrelated
entries when fed to the subsequent detection heads. Nevertheless, we
observe that the proposals generated in step 1) deviate from the ground truth to
varying degrees, clustering densely in local neighborhoods with an underlying
probability distribution. Challenges arise when a proposal largely loses its boundary
information due to coordinate offsets, while existing networks lack a corresponding
compensation mechanism. In this paper, we propose BANet for 3D
object detection from point clouds. Specifically, instead of refining each
proposal independently as previous works do, we represent each proposal as a
node for graph construction within a given cut-off threshold, associating
proposals in the form of a local neighborhood graph so that the boundary correlations
of an object are explicitly exploited. Besides, we devise a lightweight
Region Feature Aggregation Network to fully exploit voxel-wise, pixel-wise, and
point-wise features with expanding receptive fields for more informative
RoI-wise representations. As of Apr. 17th, 2021, our BANet achieves on-par
performance on the KITTI 3D detection leaderboard and ranks 1st on the Moderate
difficulty of the Car category on the KITTI BEV detection leaderboard. The source
code will be released once the paper is accepted.
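A minimal sketch of the graph-construction step described above: each proposal becomes a node, and edges connect proposals whose box centers lie within a cut-off distance. The cut-off value and the use of Euclidean distance between box centers are assumptions for illustration, not the paper's released implementation.

```python
import numpy as np

def build_proposal_graph(centers, cutoff=2.0):
    """Connect proposals whose centers lie within `cutoff` meters.

    centers: (N, 3) array of proposal box centers.
    Returns an (N, N) boolean adjacency matrix without self-loops.
    """
    diff = centers[:, None, :] - centers[None, :, :]  # (N, N, 3) pairwise offsets
    dist = np.linalg.norm(diff, axis=-1)              # (N, N) Euclidean distances
    adj = dist < cutoff
    np.fill_diagonal(adj, False)                      # a node is not its own neighbor
    return adj

# Toy usage: the first two proposals are neighbors, the third is isolated.
centers = np.array([[0.0, 0.0, 0.0],
                    [1.0, 0.5, 0.0],
                    [9.0, 9.0, 0.0]])
print(build_proposal_graph(centers, cutoff=2.0))
```

Message passing over this adjacency would let densely clustered proposals exchange boundary cues before the final refinement.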
Related papers
- PG-RCNN: Semantic Surface Point Generation for 3D Object Detection [19.341260543105548]
Point Generation R-CNN (PG-RCNN) is a novel end-to-end detector for 3D object detection.
Uses a jointly trained RoI point generation module to process contextual information of RoIs.
For every generated point, PG-RCNN assigns a semantic feature that indicates the estimated foreground probability.
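The per-point foreground probability can be pictured as a small classifier head over each generated point's semantic feature. The single linear layer and feature dimensions below are illustrative assumptions, not PG-RCNN's actual head:

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Hypothetical inputs: M generated points, each with a C-dim semantic feature.
M, C = 128, 64
point_feats = rng.standard_normal((M, C))

# A single linear layer standing in for the foreground-scoring head.
W, b = 0.1 * rng.standard_normal((C, 1)), 0.0
fg_prob = sigmoid(point_feats @ W + b)  # (M, 1) estimated foreground probability

# Points above a threshold would be kept as foreground surface points.
print(int((fg_prob > 0.5).sum()), "of", M, "points scored as foreground")
```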
arXiv Detail & Related papers (2023-07-24T09:22:09Z)
- A Unified BEV Model for Joint Learning of 3D Local Features and Overlap Estimation [12.499361832561634]
We present a unified bird's-eye view (BEV) model for joint learning of 3D local features and overlap estimation.
Our method significantly outperforms existing methods on overlap prediction, especially in scenes with small overlaps.
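Overlap between two scans is commonly defined as the fraction of points in one cloud that have a neighbor in the other within some radius. The brute-force computation below illustrates that definition; the radius value is an assumption, and the paper's contribution is a learned estimator rather than this ground-truth recipe:

```python
import numpy as np

def overlap_ratio(src, dst, radius=0.1):
    """Fraction of `src` points with a `dst` neighbor within `radius`.

    src: (N, 3), dst: (M, 3). Brute force; fine for small clouds.
    """
    dist = np.linalg.norm(src[:, None, :] - dst[None, :, :], axis=-1)  # (N, M)
    return float((dist.min(axis=1) < radius).mean())

# Toy usage: two clouds sharing 120 near-identical points.
src = np.random.default_rng(1).uniform(0, 1, (200, 3))
dst = src[:120] + 0.01
print(round(overlap_ratio(src, dst), 2))
```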
arXiv Detail & Related papers (2023-02-28T12:01:16Z)
- Exploring Active 3D Object Detection from a Generalization Perspective [58.597942380989245]
Uncertainty-based active learning policies fail to balance the trade-off between point cloud informativeness and box-level annotation costs.
We propose CRB, which hierarchically filters out point clouds with redundant 3D bounding box labels.
Experiments show that the proposed approach outperforms existing active learning strategies.
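One way to picture the informativeness-versus-cost trade-off is a greedy selection that ranks unlabeled scans by uncertainty per expected box label, under a total annotation budget. This is a generic active-learning sketch with made-up scores, not CRB's actual hierarchical filtering criteria:

```python
import numpy as np

def select_for_labeling(uncertainty, num_boxes, budget):
    """Greedily pick scans with the best uncertainty per annotation cost.

    uncertainty: (N,) model uncertainty per unlabeled scan.
    num_boxes:   (N,) expected number of boxes to annotate per scan.
    budget:      total number of box labels we can afford.
    """
    value = uncertainty / np.maximum(num_boxes, 1)  # informativeness per label
    chosen, spent = [], 0
    for i in np.argsort(-value):                    # best value first
        if spent + num_boxes[i] <= budget:
            chosen.append(int(i))
            spent += int(num_boxes[i])
    return chosen

# Scan 0 is informative but needs 10 box labels; scans 1 and 2 are cheaper.
print(select_for_labeling(np.array([0.9, 0.8, 0.3]),
                          np.array([10, 2, 1]), budget=5))  # -> [1, 2]
```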
arXiv Detail & Related papers (2023-01-23T02:43:03Z)
- Flattening-Net: Deep Regular 2D Representation for 3D Point Cloud Analysis [66.49788145564004]
We present an unsupervised deep neural architecture called Flattening-Net to represent irregular 3D point clouds of arbitrary geometry and topology.
Our methods perform favorably against the current state-of-the-art competitors.
arXiv Detail & Related papers (2022-12-17T15:05:25Z)
- CAGroup3D: Class-Aware Grouping for 3D Object Detection on Point Clouds [55.44204039410225]
We present a novel two-stage fully sparse convolutional 3D object detection framework, named CAGroup3D.
Our proposed method first generates a set of high-quality 3D proposals by leveraging a class-aware local grouping strategy on the object surface voxels.
To recover the features of missed voxels due to incorrect voxel-wise segmentation, we build a fully sparse convolutional RoI pooling module.
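Class-aware grouping can be pictured as clustering surface voxels separately within each predicted class, so a pedestrian voxel is never grouped with a nearby car voxel. The radius-graph BFS below is a generic stand-in, heavily simplified relative to CAGroup3D:

```python
import numpy as np
from collections import deque

def class_aware_groups(centers, labels, radius=0.5):
    """Cluster voxel centers without mixing predicted classes.

    centers: (N, 3) voxel center coordinates.
    labels:  (N,)  predicted semantic class per voxel.
    Returns a list of index arrays, one per cluster.
    """
    groups = []
    for cls in np.unique(labels):
        idx = np.flatnonzero(labels == cls)
        pts = centers[idx]
        dist = np.linalg.norm(pts[:, None] - pts[None, :], axis=-1)
        seen = np.zeros(len(idx), dtype=bool)
        for s in range(len(idx)):
            if seen[s]:
                continue
            queue, members = deque([s]), []
            seen[s] = True
            while queue:  # BFS over the within-class radius graph
                u = queue.popleft()
                members.append(u)
                for v in np.flatnonzero(dist[u] < radius):
                    if not seen[v]:
                        seen[v] = True
                        queue.append(v)
            groups.append(idx[members])
    return groups

centers = np.array([[0.0, 0, 0], [0.3, 0, 0], [5.0, 5, 0], [0.1, 0.1, 0]])
labels = np.array([1, 1, 1, 2])  # the last voxel belongs to another class
print(class_aware_groups(centers, labels, radius=0.5))
```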
arXiv Detail & Related papers (2022-10-09T13:38:48Z)
- ProposalContrast: Unsupervised Pre-training for LiDAR-based 3D Object Detection [114.54835359657707]
ProposalContrast is an unsupervised point cloud pre-training framework.
It learns robust 3D representations by contrasting region proposals.
ProposalContrast is verified on various 3D detectors.
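Contrasting region proposals typically means pulling two augmented views of the same proposal together in embedding space while pushing other proposals apart, e.g. with an InfoNCE-style loss. The loss below is that generic pattern and may differ from ProposalContrast's exact formulation:

```python
import numpy as np

def info_nce(z1, z2, temperature=0.1):
    """InfoNCE loss over two views of the same N proposals.

    z1, z2: (N, D) L2-normalized embeddings; row i of z1 and z2
    come from two augmentations of the same region proposal.
    """
    logits = (z1 @ z2.T) / temperature            # (N, N) similarities
    logits -= logits.max(axis=1, keepdims=True)   # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_prob)))     # matched pairs are positives

rng = np.random.default_rng(0)
z = rng.standard_normal((8, 32))
z /= np.linalg.norm(z, axis=1, keepdims=True)
view2 = z + 0.05 * rng.standard_normal(z.shape)   # slightly perturbed view
view2 /= np.linalg.norm(view2, axis=1, keepdims=True)
print(info_nce(z, view2))  # small loss: views of the same proposal agree
```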
arXiv Detail & Related papers (2022-07-26T04:45:49Z)
- 3D Object Detection Combining Semantic and Geometric Features from Point Clouds [19.127930862527666]
We propose a novel end-to-end two-stage 3D object detector named SGNet for point clouds scenes.
The VTPM is a Voxel-Point-Based Module that performs the final 3D object detection in point space.
As of September 19, 2021, for KITTI dataset, SGNet ranked 1st in 3D and BEV detection on cyclists with easy difficulty level, and 2nd in the 3D detection of moderate cyclists.
arXiv Detail & Related papers (2021-10-10T04:43:27Z)
- From Voxel to Point: IoU-guided 3D Object Detection for Point Cloud with Voxel-to-Point Decoder [79.39041453836793]
We present an Intersection-over-Union (IoU) guided two-stage 3D object detector with a voxel-to-point decoder.
We propose a residual voxel-to-point decoder to extract point features in addition to the map-view features from the voxel-based Region Proposal Network (RPN).
We propose a simple and efficient method to align the estimated IoUs to the refined proposal boxes as a more relevant localization confidence.
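Using IoU as a localization confidence presupposes a way to score box overlap. For brevity the sketch below computes axis-aligned 3D IoU, ignoring yaw, which a real LiDAR detector would not:

```python
import numpy as np

def iou_3d_axis_aligned(a, b):
    """IoU of two axis-aligned 3D boxes given as (x1, y1, z1, x2, y2, z2)."""
    lo = np.maximum(a[:3], b[:3])                 # intersection lower corner
    hi = np.minimum(a[3:], b[3:])                 # intersection upper corner
    inter = np.prod(np.maximum(hi - lo, 0.0))
    vol_a = np.prod(a[3:] - a[:3])
    vol_b = np.prod(b[3:] - b[:3])
    return float(inter / (vol_a + vol_b - inter))

box_a = np.array([0, 0, 0, 2, 2, 2], dtype=float)
box_b = np.array([1, 1, 1, 3, 3, 3], dtype=float)
print(iou_3d_axis_aligned(box_a, box_b))  # 1 / (8 + 8 - 1) ~= 0.067
```

The IoU estimated for each refined box can then stand in for the raw classification score when ranking boxes at NMS time.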
arXiv Detail & Related papers (2021-08-08T14:30:13Z)
- Cross-Modality 3D Object Detection [63.29935886648709]
We present a novel two-stage multi-modal fusion network for 3D object detection.
The whole architecture facilitates two-stage fusion.
Our experiments on the KITTI dataset show that the proposed multi-stage fusion helps the network to learn better representations.
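Feature-level fusion of image and LiDAR streams is often as simple as concatenating the two per-RoI feature vectors and mixing them with a small MLP. The sketch below shows that generic pattern with placeholder weights; it is not the paper's specific two-stage design:

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(x):
    return np.maximum(x, 0.0)

# Hypothetical per-RoI features from the two modalities.
N, C_img, C_pts, C_out = 16, 256, 128, 128
img_feats = rng.standard_normal((N, C_img))
pts_feats = rng.standard_normal((N, C_pts))

# Concatenate, then mix with one hidden layer (random placeholder weights).
W1 = 0.05 * rng.standard_normal((C_img + C_pts, C_out))
W2 = 0.05 * rng.standard_normal((C_out, C_out))
fused = relu(np.concatenate([img_feats, pts_feats], axis=1) @ W1) @ W2
print(fused.shape)  # (16, 128) fused RoI representation per proposal
```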
arXiv Detail & Related papers (2020-08-16T11:01:20Z)