Spherical Frustum Sparse Convolution Network for LiDAR Point Cloud Semantic Segmentation
- URL: http://arxiv.org/abs/2311.17491v2
- Date: Mon, 28 Oct 2024 13:54:38 GMT
- Title: Spherical Frustum Sparse Convolution Network for LiDAR Point Cloud Semantic Segmentation
- Authors: Yu Zheng, Guangming Wang, Jiuming Liu, Marc Pollefeys, Hesheng Wang,
- Abstract summary: LiDAR point cloud semantic segmentation enables the robots to obtain fine-grained semantic information of the surrounding environment.
Many works project the point cloud onto the 2D image and adopt the 2D Convolutional Neural Networks (CNNs) or vision transformer for LiDAR point cloud semantic segmentation.
In this paper, we propose a novel spherical frustum structure to avoid quantized information loss.
- Score: 62.258256483231484
- License:
- Abstract: LiDAR point cloud semantic segmentation enables the robots to obtain fine-grained semantic information of the surrounding environment. Recently, many works project the point cloud onto the 2D image and adopt the 2D Convolutional Neural Networks (CNNs) or vision transformer for LiDAR point cloud semantic segmentation. However, since more than one point can be projected onto the same 2D position but only one point can be preserved, the previous 2D image-based segmentation methods suffer from inevitable quantized information loss. To avoid quantized information loss, in this paper, we propose a novel spherical frustum structure. The points projected onto the same 2D position are preserved in the spherical frustums. Moreover, we propose a memory-efficient hash-based representation of spherical frustums. Through the hash-based representation, we propose the Spherical Frustum sparse Convolution (SFC) and Frustum Fast Point Sampling (F2PS) to convolve and sample the points stored in spherical frustums respectively. Finally, we present the Spherical Frustum sparse Convolution Network (SFCNet) to adopt 2D CNNs for LiDAR point cloud semantic segmentation without quantized information loss. Extensive experiments on the SemanticKITTI and nuScenes datasets demonstrate that our SFCNet outperforms the 2D image-based semantic segmentation methods based on conventional spherical projection. Codes will be available at https://github.com/IRMVLab/SFCNet.
Related papers
- TransUPR: A Transformer-based Uncertain Point Refiner for LiDAR Point
Cloud Semantic Segmentation [6.587305905804226]
We propose a transformer-based uncertain point refiner, i.e., TransUPR, to refine selected uncertain points in a learnable manner.
Our TransUPR achieves state-of-the-art performance, i.e., 68.2% mean Intersection over Union (mIoU) on the Semantic KITTI benchmark.
arXiv Detail & Related papers (2023-02-16T21:38:36Z) - Point Cloud Semantic Segmentation using Multi Scale Sparse Convolution
Neural Network [0.0]
We propose a feature extraction module based on multi-scale ultra-sparse convolution and a feature selection module based on channel attention.
By introducing multi-scale sparse convolution, network could capture richer feature information based on convolution kernels of different sizes.
arXiv Detail & Related papers (2022-05-03T15:01:20Z) - Cylindrical and Asymmetrical 3D Convolution Networks for LiDAR-based
Perception [122.53774221136193]
State-of-the-art methods for driving-scene LiDAR-based perception often project the point clouds to 2D space and then process them via 2D convolution.
A natural remedy is to utilize the 3D voxelization and 3D convolution network.
We propose a new framework for the outdoor LiDAR segmentation, where cylindrical partition and asymmetrical 3D convolution networks are designed to explore the 3D geometric pattern.
arXiv Detail & Related papers (2021-09-12T06:25:11Z) - LatticeNet: Fast Spatio-Temporal Point Cloud Segmentation Using
Permutohedral Lattices [27.048998326468688]
Deep convolutional neural networks (CNNs) have shown outstanding performance in the task of semantically segmenting images.
Here, we propose LatticeNet, a novel approach for 3D semantic segmentation, which takes raw point clouds as input.
We present results of 3D segmentation on multiple datasets where our method achieves state-of-the-art performance.
arXiv Detail & Related papers (2021-08-09T10:17:27Z) - EllipsoidNet: Ellipsoid Representation for Point Cloud Classification
and Segmentation [24.36469430784483]
Point cloud representation in 2D space has attracted increasing research interest since it exposes the local geometry features in a 2D space.
We propose a novel 2D representation method that projects a point cloud onto an ellipsoid surface space, where local patterns are well exposed in ellipsoid-level and point-level.
A novel convolutional neural network named EllipsoidNet is proposed to utilize those features for point cloud classification and segmentation applications.
arXiv Detail & Related papers (2021-03-03T16:43:08Z) - ParaNet: Deep Regular Representation for 3D Point Clouds [62.81379889095186]
ParaNet is a novel end-to-end deep learning framework for representing 3D point clouds.
It converts an irregular 3D point cloud into a regular 2D color image, named point geometry image (PGI)
In contrast to conventional regular representation modalities based on multi-view projection and voxelization, the proposed representation is differentiable and reversible.
arXiv Detail & Related papers (2020-12-05T13:19:55Z) - Cylindrical and Asymmetrical 3D Convolution Networks for LiDAR
Segmentation [81.02742110604161]
State-of-the-art methods for large-scale driving-scene LiDAR segmentation often project the point clouds to 2D space and then process them via 2D convolution.
We propose a new framework for the outdoor LiDAR segmentation, where cylindrical partition and asymmetrical 3D convolution networks are designed to explore the 3D geometric pat-tern.
Our method achieves the 1st place in the leaderboard of Semantic KITTI and outperforms existing methods on nuScenes with a noticeable margin, about 4%.
arXiv Detail & Related papers (2020-11-19T18:53:11Z) - GRNet: Gridding Residual Network for Dense Point Cloud Completion [54.43648460932248]
Estimating the complete 3D point cloud from an incomplete one is a key problem in many vision and robotics applications.
We propose a novel Gridding Residual Network (GRNet) for point cloud completion.
Experimental results indicate that the proposed GRNet performs favorably against state-of-the-art methods on the ShapeNet, Completion3D, and KITTI benchmarks.
arXiv Detail & Related papers (2020-06-06T02:46:39Z) - Cylindrical Convolutional Networks for Joint Object Detection and
Viewpoint Estimation [76.21696417873311]
We introduce a learnable module, cylindrical convolutional networks (CCNs), that exploit cylindrical representation of a convolutional kernel defined in the 3D space.
CCNs extract a view-specific feature through a view-specific convolutional kernel to predict object category scores at each viewpoint.
Our experiments demonstrate the effectiveness of the cylindrical convolutional networks on joint object detection and viewpoint estimation.
arXiv Detail & Related papers (2020-03-25T10:24:58Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.