Related papers: CPGNet: Cascade Point-Grid Fusion Network for Real-Time LiDAR Semantic Segmentation

URL: http://arxiv.org/abs/2204.09914v1
Date: Thu, 21 Apr 2022 06:56:30 GMT
Title: CPGNet: Cascade Point-Grid Fusion Network for Real-Time LiDAR Semantic Segmentation
Authors: Xiaoyan Li, Gang Zhang, Hongyu Pan, Zhenhua Wang
Abstract summary: We propose Cascade Point-Grid Fusion Network (CPGNet), which ensures both effectiveness and efficiency. CPGNet without ensemble models or TTA is comparable with the state-of-the-art RPVNet, while it runs 4.7 times faster.
Score: 8.944151935020992
License: http://creativecommons.org/licenses/by/4.0/
Abstract: LiDAR semantic segmentation, essential for advanced autonomous driving, must be accurate, fast, and easy to deploy on mobile platforms. Previous point-based or sparse voxel-based methods are far from real-time because they rely on time-consuming neighbor searching or sparse 3D convolution. Recent 2D projection-based methods, including range view and multi-view fusion, can run in real time, but suffer from lower accuracy due to information loss during the 2D projection. Besides, to improve performance, previous methods usually adopt test-time augmentation (TTA), which further slows down inference. To achieve a better speed-accuracy trade-off, we propose Cascade Point-Grid Fusion Network (CPGNet), which ensures both effectiveness and efficiency mainly through two techniques: 1) the novel Point-Grid (PG) fusion block extracts semantic features mainly on the 2D projected grid for efficiency, while summarizing both 2D and 3D features on the 3D points for minimal information loss; 2) the proposed transformation consistency loss narrows the gap between single-pass model inference and TTA. Experiments on the SemanticKITTI and nuScenes benchmarks demonstrate that CPGNet without ensemble models or TTA is comparable with the state-of-the-art RPVNet, while running 4.7 times faster.
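
Illustration (not from the paper): the abstract describes extracting features on a 2D projected grid and then summarizing 2D and 3D features back on the points. A minimal sketch of that round trip is given below, assuming a bird's-eye-view grid, channel-wise max pooling for the point-to-grid step, and a nearest-cell lookup for the grid-to-point step. The function names `point_to_grid`, `grid_to_point`, and `fuse`, the `cell` parameter, and these pooling choices are all hypothetical and may differ from what CPGNet actually does; the 2D network that would run on the grid is omitted.

```python
import math


def point_to_grid(points, feats, cell=1.0):
    """Scatter per-point features onto a 2D bird's-eye-view grid.

    Points falling in the same (x, y) cell are merged by channel-wise max.
    Returns the sparse grid (dict keyed by cell index) and each point's cell.
    Assumption: max pooling and a BEV projection, chosen for illustration only.
    """
    grid = {}
    index = []
    for (x, y, _z), f in zip(points, feats):
        key = (math.floor(x / cell), math.floor(y / cell))
        index.append(key)
        if key in grid:
            grid[key] = [max(a, b) for a, b in zip(grid[key], f)]
        else:
            grid[key] = list(f)
    return grid, index


def grid_to_point(grid, index):
    """Gather the feature of each point's cell back onto the point.

    Assumption: nearest-cell lookup, a simplification of any interpolation
    a real model might use.
    """
    return [list(grid[key]) for key in index]


def fuse(point_feats, grid_feats):
    """Concatenate each point's own (3D) features with its gathered 2D features."""
    return [list(p) + list(g) for p, g in zip(point_feats, grid_feats)]


# Three points; the first two share a grid cell, the third sits elsewhere.
points = [(0.2, 0.3, 1.0), (0.8, 0.1, 0.5), (2.5, -0.4, 0.0)]
feats = [[1.0, 4.0], [3.0, 2.0], [5.0, 6.0]]

grid, index = point_to_grid(points, feats, cell=1.0)
fused = fuse(feats, grid_to_point(grid, index))
```

In this sketch every point keeps its own features and gains the pooled features of its cell, so points that collapse into one cell during projection remain distinguishable afterwards, which is the intuition behind fusing on the points rather than on the grid.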

Related papers

FLARES: Fast and Accurate LiDAR Multi-Range Semantic Segmentation [52.89847760590189]
3D scene understanding is a critical yet challenging task in autonomous driving. Recent methods leverage the range-view representation to improve processing efficiency. We re-design the workflow for range-view-based LiDAR semantic segmentation.
arXiv Detail & Related papers (2025-02-13T12:39:26Z)
Fast Occupancy Network [15.759329665907229]
An Occupancy Network predicts the category of each voxel in a specified 3D space around the ego vehicle. We present a simple and fast Occupancy Network model, which adopts a deformable 2D convolutional layer to lift BEV features to 3D voxel features. We also present an efficient voxel feature pyramid network (FPN) module to improve performance at little computational cost.
arXiv Detail & Related papers (2024-12-10T03:46:03Z)
UltimateDO: An Efficient Framework to Marry Occupancy Prediction with 3D Object Detection via Channel2height [2.975860548186652]
Occupancy and 3D object detection are two standard tasks in modern autonomous driving systems. We propose a method to achieve fast 3D object detection and occupancy prediction (UltimateDO).
arXiv Detail & Related papers (2024-09-17T13:14:13Z)
DeCoTR: Enhancing Depth Completion with 2D and 3D Attentions [41.55908366474901]
We introduce a novel approach that harnesses both 2D and 3D attentions to enable highly accurate depth completion. We evaluate our method, DeCoTR, on established depth completion benchmarks.
arXiv Detail & Related papers (2024-03-18T19:22:55Z)
ALSTER: A Local Spatio-Temporal Expert for Online 3D Semantic Reconstruction [62.599588577671796]
We propose an online 3D semantic segmentation method that incrementally reconstructs a 3D semantic map from a stream of RGB-D frames. Unlike offline methods, ours is directly applicable to scenarios with real-time constraints, such as robotics or mixed reality.
arXiv Detail & Related papers (2023-11-29T20:30:18Z)
PointOcc: Cylindrical Tri-Perspective View for Point-based 3D Semantic Occupancy Prediction [72.75478398447396]
We propose a cylindrical tri-perspective view to represent point clouds effectively and comprehensively. Considering the distance distribution of LiDAR point clouds, we construct the tri-perspective view in the cylindrical coordinate system. We employ spatial group pooling to maintain structural details during projection and adopt 2D backbones to efficiently process each TPV plane.
arXiv Detail & Related papers (2023-08-31T17:57:17Z)
SVNet: Where SO(3) Equivariance Meets Binarization on Point Cloud Representation [65.4396959244269]
The paper tackles the challenge by designing a general framework to construct 3D learning architectures. The proposed approach can be applied to general backbones like PointNet and DGCNN. Experiments on ModelNet40, ShapeNet, and the real-world dataset ScanObjectNN demonstrate that the method achieves a great trade-off between efficiency, rotation, and accuracy.
arXiv Detail & Related papers (2022-09-13T12:12:19Z)
A Conditional Point Diffusion-Refinement Paradigm for 3D Point Cloud Completion [69.32451612060214]
Real-scanned 3D point clouds are often incomplete, and it is important to recover complete point clouds for downstream applications. Most existing point cloud completion methods use Chamfer Distance (CD) loss for training. We propose a novel Point Diffusion-Refinement (PDR) paradigm for point cloud completion.
arXiv Detail & Related papers (2021-12-07T06:59:06Z)
Cylindrical and Asymmetrical 3D Convolution Networks for LiDAR-based Perception [122.53774221136193]
State-of-the-art methods for driving-scene LiDAR-based perception often project the point clouds to 2D space and then process them via 2D convolution. A natural remedy is to utilize 3D voxelization and 3D convolution networks. We propose a new framework for outdoor LiDAR segmentation, where cylindrical partition and asymmetrical 3D convolution networks are designed to explore the 3D geometric pattern.
arXiv Detail & Related papers (2021-09-12T06:25:11Z)
BEVDetNet: Bird's Eye View LiDAR Point Cloud based Real-time 3D Object Detection for Autonomous Driving [6.389322215324224]
We propose a novel semantic segmentation architecture as a single unified model for object center detection using key points, box predictions and orientation prediction. The proposed architecture can be trivially extended to include semantic segmentation classes like road without any additional computation. The model is 5X faster than other top accuracy models with a minimal accuracy degradation of 2% in Average Precision at IoU=0.5 on the KITTI dataset.
arXiv Detail & Related papers (2021-04-21T22:06:39Z)
Multi Projection Fusion for Real-time Semantic Segmentation of 3D LiDAR Point Clouds [2.924868086534434]
This paper introduces a novel approach for 3D point cloud semantic segmentation that exploits multiple projections of the point cloud. Our Multi-Projection Fusion framework analyzes spherical and bird's-eye view projections using two separate highly-efficient 2D fully convolutional models.
arXiv Detail & Related papers (2020-11-03T19:40:43Z)
Local Grid Rendering Networks for 3D Object Detection in Point Clouds [98.02655863113154]
CNNs are powerful, but it would be computationally costly to directly apply convolutions on point data after voxelizing the entire point cloud to a dense regular 3D grid. We propose a novel and principled Local Grid Rendering (LGR) operation to render the small neighborhood of a subset of input points into a low-resolution 3D grid independently. We validate LGR-Net for 3D object detection on the challenging ScanNet and SUN RGB-D datasets.
arXiv Detail & Related papers (2020-07-04T13:57:43Z)
Pointwise Attention-Based Atrous Convolutional Neural Networks [15.499267533387039]
A pointwise attention-based atrous convolutional neural network architecture is proposed to efficiently deal with a large number of points. The proposed model has been evaluated on the two most important 3D point cloud datasets for the 3D semantic segmentation task. It achieves a reasonable performance compared to state-of-the-art models in terms of accuracy, with a much smaller number of parameters.
arXiv Detail & Related papers (2019-12-27T13:12:58Z)

This list is automatically generated from the titles and abstracts of the papers on this site.