PointOcc: Cylindrical Tri-Perspective View for Point-based 3D Semantic Occupancy Prediction
- URL: http://arxiv.org/abs/2308.16896v1
- Date: Thu, 31 Aug 2023 17:57:17 GMT
- Title: PointOcc: Cylindrical Tri-Perspective View for Point-based 3D Semantic Occupancy Prediction
- Authors: Sicheng Zuo, Wenzhao Zheng, Yuanhui Huang, Jie Zhou, Jiwen Lu
- Abstract summary: We propose a cylindrical tri-perspective view to represent point clouds effectively and comprehensively.
Considering the distance distribution of LiDAR point clouds, we construct the tri-perspective view in the cylindrical coordinate system.
We employ spatial group pooling to maintain structural details during projection and adopt 2D backbones to efficiently process each TPV plane.
- Score: 72.75478398447396
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Semantic segmentation in autonomous driving has been undergoing an evolution
from sparse point segmentation to dense voxel segmentation, where the objective
is to predict the semantic occupancy of each voxel in the concerned 3D space.
The dense nature of the prediction space has rendered existing efficient
2D-projection-based methods (e.g., bird's eye view, range view, etc.)
ineffective, as they can only describe a subspace of the 3D scene. To address
this, we propose a cylindrical tri-perspective view to represent point clouds
effectively and comprehensively and a PointOcc model to process them
efficiently. Considering the distance distribution of LiDAR point clouds, we
construct the tri-perspective view in the cylindrical coordinate system for
more fine-grained modeling of nearer areas. We employ spatial group pooling to
maintain structural details during projection and adopt 2D backbones to
efficiently process each TPV plane. Finally, we obtain the features of each
point by aggregating its projected features on each of the processed TPV planes
without the need for any post-processing. Extensive experiments on both 3D
occupancy prediction and LiDAR segmentation benchmarks demonstrate that the
proposed PointOcc achieves state-of-the-art performance with much faster speed.
Specifically, despite only using LiDAR, PointOcc outperforms all other methods,
including multi-modal ones, by a large margin on the OpenOccupancy benchmark.
Code: https://github.com/wzzheng/PointOcc.
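As a rough sketch of the data flow the abstract describes (Cartesian-to-cylindrical conversion, projection onto three TPV planes, and per-point aggregation), the NumPy snippet below uses illustrative grid sizes and a simple max reduction in place of the paper's spatial group pooling. The function names are hypothetical and this is not the released implementation, which is available in the repository linked above.

```python
import numpy as np

def to_cylindrical(points_xyz):
    """(N, 3) Cartesian LiDAR points -> (N, 3) cylindrical (rho, phi, z)."""
    x, y, z = points_xyz[:, 0], points_xyz[:, 1], points_xyz[:, 2]
    return np.stack([np.sqrt(x**2 + y**2), np.arctan2(y, x), z], axis=1)

def project_to_plane(cyl, feats, axes, grid=(128, 128)):
    """Scatter per-point features onto one TPV plane spanned by two cylindrical
    axes; a max over the collapsed axis stands in for spatial group pooling."""
    coords = cyl[:, list(axes)]
    lo, hi = coords.min(0), coords.max(0)
    idx = ((coords - lo) / (hi - lo + 1e-6) * (np.array(grid) - 1)).astype(int)
    plane = np.full((grid[0], grid[1], feats.shape[1]), -np.inf)
    np.maximum.at(plane, (idx[:, 0], idx[:, 1]), feats)
    plane[~np.isfinite(plane)] = 0.0          # cells with no points -> zeros
    return plane, idx

def pointocc_features(points_xyz, feats):
    """Project onto the rho-phi, rho-z and phi-z planes and sum what each
    point gathers back from the three planes."""
    cyl = to_cylindrical(points_xyz)
    planes = [project_to_plane(cyl, feats, ax) for ax in [(0, 1), (0, 2), (1, 2)]]
    return sum(plane[idx[:, 0], idx[:, 1]] for plane, idx in planes)
```

In the actual model, each TPV plane is further processed by a 2D image backbone before the per-point gathering step, and the group pooling is more structured than a plain max; the sketch only captures the project-then-aggregate pattern.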
Related papers
- Dynamic 3D Point Cloud Sequences as 2D Videos [81.46246338686478]
3D point cloud sequences serve as one of the most common and practical representation modalities of real-world environments.
We propose a novel generic representation called Structured Point Cloud Videos (SPCVs).
SPCVs re-organize a point cloud sequence as a 2D video with spatial smoothness and temporal consistency, where the pixel values correspond to the 3D coordinates of the points.
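To make the "pixel values correspond to 3D coordinates" idea concrete, here is a toy sketch that packs one LiDAR sweep into an H x W geometry image indexed by spherical angles. It is not the SPCV construction itself, which additionally learns spatial smoothness and temporal consistency; the resolution values are illustrative.

```python
import numpy as np

def frame_to_geometry_image(points, H=64, W=1024):
    """Store each point's xyz in the pixel addressed by its elevation/azimuth;
    colliding points simply overwrite each other in this toy version."""
    x, y, z = points[:, 0], points[:, 1], points[:, 2]
    azimuth = np.arctan2(y, x)                          # [-pi, pi]
    elevation = np.arctan2(z, np.sqrt(x**2 + y**2))
    u = ((azimuth + np.pi) / (2 * np.pi) * (W - 1)).astype(int)
    span = elevation.max() - elevation.min() + 1e-6
    v = ((elevation - elevation.min()) / span * (H - 1)).astype(int)
    image = np.zeros((H, W, 3), dtype=points.dtype)
    image[v, u] = points                                # pixel value = xyz
    return image
```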
arXiv Detail & Related papers (2024-03-02T08:18:57Z)
- Flattening-Net: Deep Regular 2D Representation for 3D Point Cloud Analysis [66.49788145564004]
We present an unsupervised deep neural architecture called Flattening-Net to represent irregular 3D point clouds of arbitrary geometry and topology.
Our methods perform favorably against the current state-of-the-art competitors.
arXiv Detail & Related papers (2022-12-17T15:05:25Z)
- Cylindrical and Asymmetrical 3D Convolution Networks for LiDAR-based Perception [122.53774221136193]
State-of-the-art methods for driving-scene LiDAR-based perception often project the point clouds to 2D space and then process them via 2D convolution.
A natural remedy is to use 3D voxelization and 3D convolution networks.
We propose a new framework for outdoor LiDAR segmentation, in which a cylindrical partition and asymmetrical 3D convolution networks are designed to explore the 3D geometric pattern.
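A minimal sketch of the "asymmetrical" part of the idea, assuming dense tensors and a factorized 3x1x3 / 1x3x3 kernel pair in place of a full 3x3x3 kernel (the published networks operate on sparse cylindrical voxels, which this toy block does not reproduce):

```python
import torch
import torch.nn as nn

class AsymmetricResidualBlock3D(nn.Module):
    """Two factorized 3D convolutions plus a residual connection, illustrating
    how asymmetric kernels cut parameters relative to a full 3x3x3 kernel."""
    def __init__(self, channels: int):
        super().__init__()
        self.conv1 = nn.Conv3d(channels, channels, kernel_size=(3, 1, 3),
                               padding=(1, 0, 1))
        self.conv2 = nn.Conv3d(channels, channels, kernel_size=(1, 3, 3),
                               padding=(0, 1, 1))
        self.act = nn.ReLU(inplace=True)

    def forward(self, x):                 # x: (B, C, R, PHI, Z) voxel features
        return self.act(x + self.conv2(self.act(self.conv1(x))))
```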
arXiv Detail & Related papers (2021-09-12T06:25:11Z)
- Multi Projection Fusion for Real-time Semantic Segmentation of 3D LiDAR Point Clouds [2.924868086534434]
This paper introduces a novel approach for 3D point cloud semantic segmentation that exploits multiple projections of the point cloud.
Our Multi-Projection Fusion framework analyzes spherical and bird's-eye view projections using two separate highly-efficient 2D fully convolutional models.
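As an illustration of the two projections being fused, the hedged sketch below (with assumed sensor parameters, not the authors' code) computes the spherical range-view and bird's-eye-view pixel coordinates of each point; in the framework itself, two 2D fully convolutional networks then predict labels in each view and the per-point results are fused.

```python
import numpy as np

def range_view_indices(points, H=64, W=2048, fov_up_deg=3.0, fov_down_deg=-25.0):
    """Spherical-projection pixel coordinates (row, col) for each point,
    assuming an illustrative vertical field of view."""
    x, y, z = points[:, 0], points[:, 1], points[:, 2]
    depth = np.linalg.norm(points[:, :3], axis=1) + 1e-6
    yaw, pitch = np.arctan2(y, x), np.arcsin(z / depth)
    fov_up, fov_down = np.radians(fov_up_deg), np.radians(fov_down_deg)
    col = (((np.pi - yaw) / (2 * np.pi)) * W).astype(int) % W
    row = np.clip(((fov_up - pitch) / (fov_up - fov_down) * H).astype(int),
                  0, H - 1)
    return row, col

def bev_indices(points, grid=512, extent=50.0):
    """Bird's-eye-view pixel coordinates on a +/- `extent` meter square grid."""
    ij = (points[:, :2] + extent) / (2 * extent) * (grid - 1)
    return np.clip(ij, 0, grid - 1).astype(int)
```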
arXiv Detail & Related papers (2020-11-03T19:40:43Z)
- Cylinder3D: An Effective 3D Framework for Driving-scene LiDAR Semantic Segmentation [87.54570024320354]
State-of-the-art methods for large-scale driving-scene LiDAR semantic segmentation often project and process the point clouds in the 2D space.
A straightforward solution to tackle the issue of 3D-to-2D projection is to keep the 3D representation and process the points in the 3D space.
We develop a framework based on 3D cylinder partition and 3D cylinder convolution, termed Cylinder3D, which exploits the 3D topological relations and structures of driving-scene point clouds.
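A hedged sketch of what the cylinder partition does, with assumed grid size and ranges (the real framework pairs this partition with sparse 3D cylinder convolutions):

```python
import numpy as np

def cylinder_partition(points, grid=(480, 360, 32),
                       rho_range=(0.0, 50.0), z_range=(-4.0, 2.0)):
    """Assign each Cartesian point an integer (radius, angle, height) voxel
    index; cells cover less physical area near the sensor, matching the
    uneven density of driving-scene LiDAR."""
    x, y, z = points[:, 0], points[:, 1], points[:, 2]
    rho = np.clip(np.sqrt(x**2 + y**2), *rho_range)
    phi = np.arctan2(y, x)
    zc = np.clip(z, *z_range)
    i_rho = ((rho - rho_range[0]) / (rho_range[1] - rho_range[0])
             * (grid[0] - 1)).astype(int)
    i_phi = ((phi + np.pi) / (2 * np.pi) * (grid[1] - 1)).astype(int)
    i_z = ((zc - z_range[0]) / (z_range[1] - z_range[0])
           * (grid[2] - 1)).astype(int)
    return np.stack([i_rho, i_phi, i_z], axis=1)
```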
arXiv Detail & Related papers (2020-08-04T13:56:19Z)
- Leveraging Planar Regularities for Point Line Visual-Inertial Odometry [13.51108336267342]
With a monocular Visual-Inertial Odometry (VIO) system, the 3D point cloud and camera motion can be estimated simultaneously.
We propose PLP-VIO, which exploits point features and line features as well as plane regularities.
The effectiveness of the proposed method is verified on both synthetic data and public datasets.
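As a small, hedged illustration of what a "plane regularity" can mean here (a toy least-squares fit, not the estimator's actual parameterization or constraints):

```python
import numpy as np

def fit_plane(points):
    """Least-squares plane n . p + d = 0 through an (N, 3) set of points via
    SVD; co-planar landmarks could then be constrained to satisfy it."""
    centroid = points.mean(axis=0)
    _, _, vt = np.linalg.svd(points - centroid, full_matrices=False)
    normal = vt[-1]                       # direction of smallest variance
    return normal, float(-normal @ centroid)
```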
arXiv Detail & Related papers (2020-04-16T18:20:00Z)
- Cylindrical Convolutional Networks for Joint Object Detection and Viewpoint Estimation [76.21696417873311]
We introduce a learnable module, cylindrical convolutional networks (CCNs), that exploits a cylindrical representation of a convolutional kernel defined in 3D space.
CCNs extract a view-specific feature through a view-specific convolutional kernel to predict object category scores at each viewpoint.
Our experiments demonstrate the effectiveness of the cylindrical convolutional networks on joint object detection and viewpoint estimation.
arXiv Detail & Related papers (2020-03-25T10:24:58Z)
- OccuSeg: Occupancy-aware 3D Instance Segmentation [39.71517989569514]
"3D occupancy size" is the number of voxels occupied by each instance.
"OccuSeg" is an occupancy-aware 3D instance segmentation scheme.
"State-of-the-art performance" on 3 real-world datasets.
arXiv Detail & Related papers (2020-03-14T02:48:55Z)
- Pointwise Attention-Based Atrous Convolutional Neural Networks [15.499267533387039]
A pointwise attention-based atrous convolutional neural network architecture is proposed to efficiently deal with a large number of points.
The proposed model has been evaluated on the two most important 3D point cloud datasets for the 3D semantic segmentation task.
It achieves a reasonable performance compared to state-of-the-art models in terms of accuracy, with a much smaller number of parameters.
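The exact block design is not given in the summary above, so the following is only a guess at the flavor of the idea: a per-point sigmoid gate ("pointwise attention") followed by a dilated (atrous) convolution over the point axis.

```python
import torch
import torch.nn as nn

class PointwiseAttentionAtrousBlock(nn.Module):
    """Gate each point's features, then mix neighbors with a dilated 1D conv;
    a hypothetical sketch, not the architecture from the paper."""
    def __init__(self, channels: int, dilation: int = 2):
        super().__init__()
        self.gate = nn.Sequential(nn.Linear(channels, channels), nn.Sigmoid())
        self.atrous = nn.Conv1d(channels, channels, kernel_size=3,
                                dilation=dilation, padding=dilation)

    def forward(self, feats):             # feats: (B, N, C)
        gated = feats * self.gate(feats)  # pointwise attention weights
        return self.atrous(gated.transpose(1, 2)).transpose(1, 2)
```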
arXiv Detail & Related papers (2019-12-27T13:12:58Z)
This list is automatically generated from the titles and abstracts of the papers in this site.