Cylindrical and Asymmetrical 3D Convolution Networks for LiDAR-based
Perception
- URL: http://arxiv.org/abs/2109.05441v1
- Date: Sun, 12 Sep 2021 06:25:11 GMT
- Title: Cylindrical and Asymmetrical 3D Convolution Networks for LiDAR-based
Perception
- Authors: Xinge Zhu, Hui Zhou, Tai Wang, Fangzhou Hong, Wei Li, Yuexin Ma,
Hongsheng Li, Ruigang Yang, Dahua Lin
- Abstract summary: State-of-the-art methods for driving-scene LiDAR-based perception often project the point clouds to 2D space and then process them via 2D convolution.
A natural remedy is to utilize the 3D voxelization and 3D convolution network.
We propose a new framework for the outdoor LiDAR segmentation, where cylindrical partition and asymmetrical 3D convolution networks are designed to explore the 3D geometric pattern.
- Score: 122.53774221136193
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: State-of-the-art methods for driving-scene LiDAR-based perception
(including point cloud semantic segmentation, panoptic segmentation and 3D
detection, etc.) often project the point clouds to 2D space and then process
them via 2D convolution. Although this approach is competitive on point
clouds, it inevitably alters and abandons the 3D topology and geometric
relations. A natural remedy is to utilize 3D voxelization and a 3D
convolution network. However, we found that in the outdoor point cloud, the
improvement obtained in this way is quite limited. An important reason is a
property of the outdoor point cloud, namely its sparsity and varying density.
Motivated by this investigation, we propose a new framework for the outdoor
LiDAR segmentation, where cylindrical partition and asymmetrical 3D convolution
networks are designed to explore the 3D geometric pattern while maintaining
these inherent properties. The proposed model acts as a backbone and the
learned features from this model can be used for downstream tasks such as point
cloud semantic and panoptic segmentation or 3D detection. In this paper, we
benchmark our model on these three tasks. For semantic segmentation, we
evaluate the proposed model on several large-scale datasets, i.e.,
SemanticKITTI, nuScenes and A2D2. Our method achieves state-of-the-art
results on the SemanticKITTI leaderboard (both single-scan and multi-scan
challenges), and significantly outperforms existing methods on the nuScenes
and A2D2 datasets.
Furthermore, the proposed 3D framework also shows strong performance and good
generalization on LiDAR panoptic segmentation and LiDAR 3D detection.
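As a rough illustration of the cylindrical partition described above, a minimal NumPy sketch that maps Cartesian LiDAR points to cylindrical voxel indices might look as follows. The grid size and coordinate ranges are illustrative assumptions, not the paper's exact configuration:

```python
import numpy as np

def cylindrical_voxel_indices(points,
                              grid_size=(480, 360, 32),
                              rho_range=(0.0, 50.0),
                              z_range=(-4.0, 2.0)):
    """Map (N, 3) Cartesian points (x, y, z) to cylindrical voxel indices.

    Returns an (N, 3) integer array of (rho_bin, phi_bin, z_bin).
    Grid size and ranges are assumed values for illustration.
    """
    x, y, z = points[:, 0], points[:, 1], points[:, 2]
    rho = np.sqrt(x ** 2 + y ** 2)   # radial distance from the sensor
    phi = np.arctan2(y, x)           # azimuth angle in [-pi, pi]

    # Normalize each cylindrical coordinate into [0, 1) over its range.
    rho_n = (rho - rho_range[0]) / (rho_range[1] - rho_range[0])
    phi_n = (phi + np.pi) / (2 * np.pi)
    z_n = (z - z_range[0]) / (z_range[1] - z_range[0])

    norm = np.stack([rho_n, phi_n, z_n], axis=1)
    idx = np.floor(norm * np.asarray(grid_size)).astype(np.int64)
    # Clip out-of-range points into the boundary cells.
    return np.clip(idx, 0, np.asarray(grid_size) - 1)

pts = np.array([[10.0, 0.0, -1.0], [0.0, 10.0, 0.5]])
print(cylindrical_voxel_indices(pts))  # bin indices for the two sample points
```

Unlike a uniform Cartesian grid, these cylindrical cells grow with radius, so far-away sparse regions are covered by larger voxels, which matches the varying density of outdoor LiDAR scans.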
Related papers
- Dynamic 3D Point Cloud Sequences as 2D Videos [81.46246338686478]
3D point cloud sequences serve as one of the most common and practical representation modalities of real-world environments.
We propose a novel generic representation called Structured Point Cloud Videos (SPCVs).
SPCVs re-organize a point cloud sequence as a 2D video with spatial smoothness and temporal consistency, where the pixel values correspond to the 3D coordinates of points.
arXiv Detail & Related papers (2024-03-02T08:18:57Z) - DatasetNeRF: Efficient 3D-aware Data Factory with Generative Radiance Fields [68.94868475824575]
This paper introduces a novel approach capable of generating infinite, high-quality 3D-consistent 2D annotations alongside 3D point cloud segmentations.
We leverage the strong semantic prior within a 3D generative model to train a semantic decoder.
Once trained, the decoder efficiently generalizes across the latent space, enabling the generation of infinite data.
arXiv Detail & Related papers (2023-11-18T21:58:28Z) - Leveraging Large-Scale Pretrained Vision Foundation Models for
Label-Efficient 3D Point Cloud Segmentation [67.07112533415116]
We present a novel framework that adapts various foundational models for the 3D point cloud segmentation task.
Our approach involves making initial predictions of 2D semantic masks using different large vision models.
To generate robust 3D semantic pseudo labels, we introduce a semantic label fusion strategy that effectively combines all the results via voting.
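The voting-based fusion described above can be sketched as a simple majority vote over per-point class predictions; the array shapes and values below are hypothetical:

```python
import numpy as np

# Hypothetical predictions: M = 3 models each assign a class to N = 3 points.
preds = np.array([[0, 1, 2],
                  [0, 1, 1],
                  [0, 2, 1]])

def majority_vote(preds, num_classes):
    """Fuse (M, N) per-model class predictions into (N,) pseudo labels."""
    counts = np.zeros((preds.shape[1], num_classes), dtype=int)
    for row in preds:
        # Each model casts one vote per point for its predicted class.
        counts[np.arange(preds.shape[1]), row] += 1
    return counts.argmax(axis=1)

print(majority_vote(preds, 3))  # point-wise majority classes
```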
arXiv Detail & Related papers (2023-11-03T15:41:15Z) - S3CNet: A Sparse Semantic Scene Completion Network for LiDAR Point
Clouds [0.16799377888527683]
We present S3CNet, a sparse convolution based neural network that predicts the semantically completed scene from a single, unified LiDAR point cloud.
We show that our proposed method outperforms all counterparts on the 3D task, achieving state-of-the-art results on the SemanticKITTI benchmark.
arXiv Detail & Related papers (2020-12-16T20:14:41Z) - Exploring Deep 3D Spatial Encodings for Large-Scale 3D Scene
Understanding [19.134536179555102]
We propose an alternative approach that overcomes the limitations of CNN-based approaches by encoding the spatial features of raw 3D point clouds into undirected graph models.
The proposed method achieves on par state-of-the-art accuracy with improved training time and model stability thus indicating strong potential for further research.
arXiv Detail & Related papers (2020-11-29T12:56:19Z) - Cylindrical and Asymmetrical 3D Convolution Networks for LiDAR
Segmentation [81.02742110604161]
State-of-the-art methods for large-scale driving-scene LiDAR segmentation often project the point clouds to 2D space and then process them via 2D convolution.
We propose a new framework for the outdoor LiDAR segmentation, where cylindrical partition and asymmetrical 3D convolution networks are designed to explore the 3D geometric pattern.
Our method achieves 1st place on the SemanticKITTI leaderboard and outperforms existing methods on nuScenes by a noticeable margin of about 4%.
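The asymmetrical-kernel idea (splitting a full 3x3x3 convolution into horizontally oriented 3x1x3 and 1x3x3 kernels) can be sketched with dense NumPy operations; the paper itself operates on sparse voxels, so this is illustrative only:

```python
import numpy as np

def conv3d_same(x, k):
    """'Same'-padded 3D cross-correlation (deep-learning convolution)."""
    pads = [(s // 2, s // 2) for s in k.shape]
    xp = np.pad(x, pads)
    win = np.lib.stride_tricks.sliding_window_view(xp, k.shape)
    return np.einsum('abcijk,ijk->abc', win, k)

rng = np.random.default_rng(0)
voxels = rng.standard_normal((16, 16, 8))  # toy (rho, phi, z) feature grid

# A full 3x3x3 kernel has 27 weights; the asymmetric pair 3x1x3 + 1x3x3
# has 9 + 9 = 18, concentrating capacity on the horizontal directions.
k_313 = rng.standard_normal((3, 1, 3))
k_133 = rng.standard_normal((1, 3, 3))

out = conv3d_same(voxels, k_313) + conv3d_same(voxels, k_133)
print(out.shape, k_313.size + k_133.size)
```

The output keeps the input grid shape while using fewer weights than a full 3x3x3 kernel, which is the trade-off the asymmetric design exploits.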
arXiv Detail & Related papers (2020-11-19T18:53:11Z) - Cylinder3D: An Effective 3D Framework for Driving-scene LiDAR Semantic
Segmentation [87.54570024320354]
State-of-the-art methods for large-scale driving-scene LiDAR semantic segmentation often project and process the point clouds in the 2D space.
A straightforward solution to tackle the issue of 3D-to-2D projection is to keep the 3D representation and process the points in the 3D space.
We develop a 3D cylinder partition and a 3D cylinder convolution based framework, termed Cylinder3D, which exploits the 3D topology relations and structures of driving-scene point clouds.
arXiv Detail & Related papers (2020-08-04T13:56:19Z) - Pointwise Attention-Based Atrous Convolutional Neural Networks [15.499267533387039]
A pointwise attention-based atrous convolutional neural network architecture is proposed to efficiently deal with a large number of points.
The proposed model has been evaluated on the two most important 3D point cloud datasets for the 3D semantic segmentation task.
It achieves a reasonable performance compared to state-of-the-art models in terms of accuracy, with a much smaller number of parameters.
arXiv Detail & Related papers (2019-12-27T13:12:58Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information and is not responsible for any consequences.