CPGNet: Cascade Point-Grid Fusion Network for Real-Time LiDAR Semantic
Segmentation
- URL: http://arxiv.org/abs/2204.09914v1
- Date: Thu, 21 Apr 2022 06:56:30 GMT
- Title: CPGNet: Cascade Point-Grid Fusion Network for Real-Time LiDAR Semantic
Segmentation
- Authors: Xiaoyan Li, Gang Zhang, Hongyu Pan, Zhenhua Wang
- Abstract summary: We propose Cascade Point-Grid Fusion Network (CPGNet), which ensures both effectiveness and efficiency.
CPGNet without ensemble models or TTA is comparable with the state-of-the-art RPVNet, while it runs 4.7 times faster.
- Score: 8.944151935020992
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: LiDAR semantic segmentation, which is essential for advanced
autonomous driving, must be accurate, fast, and easy to deploy on mobile
platforms. Previous point-based and sparse voxel-based methods are far from
real-time because they rely on time-consuming neighbor searching or sparse 3D
convolution. Recent 2D projection-based methods, including range-view and
multi-view fusion, can run in real time but suffer from lower accuracy due to
information loss during the 2D projection. In addition, previous methods
usually adopt test-time augmentation (TTA) to improve performance, which
further slows down inference. To achieve a better speed-accuracy trade-off,
we propose the Cascade Point-Grid Fusion Network (CPGNet), which ensures both
effectiveness and efficiency mainly through two techniques: 1) the novel
Point-Grid (PG) fusion block extracts semantic features mainly on the 2D
projected grid for efficiency, while summarizing both 2D and 3D features on the
3D points for minimal information loss; 2) the proposed transformation
consistency loss narrows the gap between single-pass model inference and TTA.
Experiments on the SemanticKITTI and nuScenes benchmarks demonstrate that
CPGNet, without ensemble models or TTA, is comparable with the state-of-the-art
RPVNet while running 4.7 times faster.
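The two techniques named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the bird's-eye-view cell size, max pooling as the point-to-grid reduction, nearest-cell gathering, and mean absolute difference as the consistency measure are all assumptions chosen for illustration. In a real network, a 2D CNN would process the grid between the projection and the gather step.

```python
import math

def point_to_grid(points, feats, cell=1.0):
    """Project points onto a 2D (x, y) grid, max-pooling features per cell.

    Assumption: a bird's-eye-view grid with square cells of side `cell`.
    """
    grid = {}
    cells = []
    for (x, y, _z), f in zip(points, feats):
        key = (math.floor(x / cell), math.floor(y / cell))
        cells.append(key)
        prev = grid.get(key)
        grid[key] = list(f) if prev is None else [max(a, b) for a, b in zip(prev, f)]
    return grid, cells

def grid_to_point(grid, cells):
    """Gather each point's cell feature back onto the point (nearest cell)."""
    return [list(grid[key]) for key in cells]

def fuse(point_feats, gathered):
    """Concatenate each 3D point feature with its gathered 2D grid feature."""
    return [list(p) + g for p, g in zip(point_feats, gathered)]

def consistency_loss(probs_a, probs_b):
    """Mean absolute difference between per-point class probabilities
    predicted for the original input and for a transformed copy.

    Assumption: a stand-in for the paper's transformation consistency loss.
    """
    total, count = 0.0, 0
    for pa, pb in zip(probs_a, probs_b):
        for a, b in zip(pa, pb):
            total += abs(a - b)
            count += 1
    return total / count

# Three points; the first two fall into the same 1 m x 1 m cell.
points = [(0.2, 0.3, 1.0), (0.8, 0.1, 2.0), (1.5, 0.4, 0.5)]
feats = [[1.0, 5.0], [3.0, 2.0], [4.0, 4.0]]
grid, cells = point_to_grid(points, feats)
fused = fuse(feats, grid_to_point(grid, cells))
print(fused[0])  # [1.0, 5.0, 3.0, 5.0]
```

Keeping the per-point feature next to the pooled cell feature is what lets points that share a cell stay distinguishable after the 2D step, which is the information-loss concern the abstract raises about pure projection methods.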
Related papers
- UltimateDO: An Efficient Framework to Marry Occupancy Prediction with 3D Object Detection via Channel2height [2.975860548186652]
Occupancy and 3D object detection are two standard tasks in modern autonomous driving systems.
We propose a method to achieve fast 3D object detection and occupancy prediction (UltimateDO).
arXiv Detail & Related papers (2024-09-17T13:14:13Z)
- DeCoTR: Enhancing Depth Completion with 2D and 3D Attentions [41.55908366474901]
We introduce a novel approach that harnesses both 2D and 3D attentions to enable highly accurate depth completion.
We evaluate our method, DeCoTR, on established depth completion benchmarks.
arXiv Detail & Related papers (2024-03-18T19:22:55Z)
- ALSTER: A Local Spatio-Temporal Expert for Online 3D Semantic Reconstruction [62.599588577671796]
We propose an online 3D semantic segmentation method that incrementally reconstructs a 3D semantic map from a stream of RGB-D frames.
Unlike offline methods, ours is directly applicable to scenarios with real-time constraints, such as robotics or mixed reality.
arXiv Detail & Related papers (2023-11-29T20:30:18Z)
- PointOcc: Cylindrical Tri-Perspective View for Point-based 3D Semantic Occupancy Prediction [72.75478398447396]
We propose a cylindrical tri-perspective view to represent point clouds effectively and comprehensively.
Considering the distance distribution of LiDAR point clouds, we construct the tri-perspective view in the cylindrical coordinate system.
We employ spatial group pooling to maintain structural details during projection and adopt 2D backbones to efficiently process each TPV plane.
arXiv Detail & Related papers (2023-08-31T17:57:17Z)
- SVNet: Where SO(3) Equivariance Meets Binarization on Point Cloud Representation [65.4396959244269]
The paper tackles the challenge by designing a general framework to construct 3D learning architectures.
The proposed approach can be applied to general backbones like PointNet and DGCNN.
Experiments on ModelNet40, ShapeNet, and the real-world dataset ScanObjectNN demonstrate that the method achieves a great trade-off between efficiency, rotation, and accuracy.
arXiv Detail & Related papers (2022-09-13T12:12:19Z)
- A Conditional Point Diffusion-Refinement Paradigm for 3D Point Cloud Completion [69.32451612060214]
Real-scanned 3D point clouds are often incomplete, and it is important to recover complete point clouds for downstream applications.
Most existing point cloud completion methods use Chamfer Distance (CD) loss for training.
We propose a novel Point Diffusion-Refinement (PDR) paradigm for point cloud completion.
arXiv Detail & Related papers (2021-12-07T06:59:06Z)
- Cylindrical and Asymmetrical 3D Convolution Networks for LiDAR-based Perception [122.53774221136193]
State-of-the-art methods for driving-scene LiDAR-based perception often project the point clouds to 2D space and then process them via 2D convolution.
A natural remedy is to utilize 3D voxelization and 3D convolution networks.
We propose a new framework for outdoor LiDAR segmentation, where cylindrical partition and asymmetrical 3D convolution networks are designed to explore the 3D geometric pattern.
arXiv Detail & Related papers (2021-09-12T06:25:11Z)
- BEVDetNet: Bird's Eye View LiDAR Point Cloud based Real-time 3D Object Detection for Autonomous Driving [6.389322215324224]
We propose a novel semantic segmentation architecture as a single unified model for object center detection using key points, box predictions and orientation prediction.
The proposed architecture can be trivially extended to include semantic segmentation classes like road without any additional computation.
The model is 5X faster than other top accuracy models with a minimal accuracy degradation of 2% in Average Precision at IoU=0.5 on the KITTI dataset.
arXiv Detail & Related papers (2021-04-21T22:06:39Z)
- Multi Projection Fusion for Real-time Semantic Segmentation of 3D LiDAR Point Clouds [2.924868086534434]
This paper introduces a novel approach for 3D point cloud semantic segmentation that exploits multiple projections of the point cloud.
Our Multi-Projection Fusion framework analyzes spherical and bird's-eye view projections using two separate highly-efficient 2D fully convolutional models.
arXiv Detail & Related papers (2020-11-03T19:40:43Z)
- Local Grid Rendering Networks for 3D Object Detection in Point Clouds [98.02655863113154]
CNNs are powerful, but it would be computationally costly to directly apply convolutions on point data after voxelizing the entire point clouds to a dense regular 3D grid.
We propose a novel and principled Local Grid Rendering (LGR) operation to render the small neighborhood of a subset of input points into a low-resolution 3D grid independently.
We validate LGR-Net for 3D object detection on the challenging ScanNet and SUN RGB-D datasets.
arXiv Detail & Related papers (2020-07-04T13:57:43Z)
- Pointwise Attention-Based Atrous Convolutional Neural Networks [15.499267533387039]
A pointwise attention-based atrous convolutional neural network architecture is proposed to efficiently deal with a large number of points.
The proposed model has been evaluated on the two most important 3D point cloud datasets for the 3D semantic segmentation task.
It achieves a reasonable performance compared to state-of-the-art models in terms of accuracy, with a much smaller number of parameters.
arXiv Detail & Related papers (2019-12-27T13:12:58Z)
This list is automatically generated from the titles and abstracts of the papers in this site.