Using a Waffle Iron for Automotive Point Cloud Semantic Segmentation
- URL: http://arxiv.org/abs/2301.10100v2
- Date: Mon, 25 Sep 2023 12:10:42 GMT
- Title: Using a Waffle Iron for Automotive Point Cloud Semantic Segmentation
- Authors: Gilles Puy, Alexandre Boulch, Renaud Marlet
- Abstract summary: Sparse 3D convolutions have become the de-facto tools to construct deep neural networks for semantic segmentation of point clouds in autonomous driving datasets.
We propose an alternative method that reaches the level of state-of-the-art methods without requiring sparse convolutions.
We show that such a level of performance is achievable by relying on tools a priori unfit for large-scale and high-performing 3D perception.
- Score: 66.6890991207065
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Semantic segmentation of point clouds in autonomous driving datasets requires
techniques that can process large numbers of points efficiently. Sparse 3D
convolutions have become the de-facto tools to construct deep neural networks
for this task: they exploit point cloud sparsity to reduce the memory and
computational loads and are at the core of today's best methods. In this paper,
we propose an alternative method that reaches the level of state-of-the-art
methods without requiring sparse convolutions. We actually show that such level
of performance is achievable by relying on tools a priori unfit for large scale
and high-performing 3D perception. In particular, we propose a novel 3D
backbone, WaffleIron, made almost exclusively of MLPs and dense 2D convolutions
and present how to train it to reach high performance on SemanticKITTI and
nuScenes. We believe that WaffleIron is a compelling alternative to backbones
using sparse 3D convolutions, especially in frameworks and on hardware where
those convolutions are not readily available.
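The abstract states only that the backbone is built from MLPs and dense 2D convolutions. As a rough illustration of how those two tools can be combined on a point cloud, the NumPy sketch below averages point features into a dense 2D grid, applies a depthwise 3x3 convolution on that grid, copies the result back to the points, and then applies a per-point MLP. The projection scheme, the function names (`project_to_grid`, `depthwise_conv3x3`, `waffle_like_block`), the grid size, the bounds, and the random weights are all assumptions made for this example; this is not the authors' implementation.

```python
import numpy as np

def project_to_grid(points, feats, drop_axis, grid_size, bounds):
    """Average point features into a dense 2D grid (hypothetical helper)."""
    keep = [a for a in range(3) if a != drop_axis]
    lo, hi = bounds
    # Map the two kept coordinates to integer cell indices in [0, grid_size).
    uv = (points[:, keep] - lo) / (hi - lo) * grid_size
    uv = np.clip(uv.astype(np.int64), 0, grid_size - 1)
    cell = uv[:, 0] * grid_size + uv[:, 1]
    grid = np.zeros((grid_size * grid_size, feats.shape[1]))
    count = np.zeros(grid_size * grid_size)
    np.add.at(grid, cell, feats)   # sum the features falling in each cell
    np.add.at(count, cell, 1.0)    # count the points falling in each cell
    grid /= np.maximum(count, 1.0)[:, None]
    return grid.reshape(grid_size, grid_size, -1), cell

def depthwise_conv3x3(grid, kernel):
    """Dense depthwise 3x3 convolution, zero padded; kernel has shape (3, 3, C)."""
    h, w, _ = grid.shape
    padded = np.pad(grid, ((1, 1), (1, 1), (0, 0)))
    out = np.zeros_like(grid)
    for di in range(3):
        for dj in range(3):
            out += padded[di:di + h, dj:dj + w] * kernel[di, dj]
    return out

def waffle_like_block(points, feats, drop_axis, kernel, w1, w2,
                      grid_size=32, bounds=(-50.0, 50.0)):
    """One spatial-mixing step (2D conv) followed by one channel-mixing step (MLP)."""
    grid, cell = project_to_grid(points, feats, drop_axis, grid_size, bounds)
    mixed = depthwise_conv3x3(grid, kernel).reshape(-1, feats.shape[1])
    feats = feats + mixed[cell]            # back-project to points, residual
    hidden = np.maximum(feats @ w1, 0.0)   # per-point MLP with ReLU
    return feats + hidden @ w2             # residual connection

# Toy usage with random data and random (untrained) weights.
rng = np.random.default_rng(0)
points = rng.uniform(-50.0, 50.0, size=(1000, 3))
feats = rng.normal(size=(1000, 8))
kernel = rng.normal(size=(3, 3, 8)) * 0.1
w1 = rng.normal(size=(8, 16)) * 0.1
w2 = rng.normal(size=(16, 8)) * 0.1

out = feats
for drop_axis in (2, 1, 0):  # alternate the projection plane between blocks
    out = waffle_like_block(points, out, drop_axis, kernel, w1, w2)
```

Every operation here is a dense array operation or a scatter/gather by cell index, so the sketch needs no sparse-convolution library, which is the property the abstract emphasizes.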
Related papers
- MinkUNeXt: Point Cloud-based Large-scale Place Recognition using 3D Sparse Convolutions [1.124958340749622]
MinkUNeXt is an effective and efficient architecture for place-recognition from point clouds entirely based on the new 3D MinkNeXt Block.
A thorough assessment of the proposal has been carried out using the Oxford RobotCar and the In-house datasets.
arXiv Detail & Related papers (2024-03-12T12:25:54Z)
- PartSLIP: Low-Shot Part Segmentation for 3D Point Clouds via Pretrained Image-Language Models [56.324516906160234]
Generalizable 3D part segmentation is important but challenging in vision and robotics.
This paper explores an alternative way for low-shot part segmentation of 3D point clouds by leveraging a pretrained image-language model, GLIP.
We transfer the rich knowledge from 2D to 3D through GLIP-based part detection on point cloud rendering and a novel 2D-to-3D label lifting algorithm.
arXiv Detail & Related papers (2022-12-03T06:59:01Z)
- Spatial Pruned Sparse Convolution for Efficient 3D Object Detection [41.62839541489369]
3D scenes are dominated by a large number of background points, which are redundant for the detection task, since it mainly needs to focus on foreground objects.
In this paper, we analyze major components of existing 3D CNNs and find that 3D CNNs ignore the redundancy of data and further amplify it in the down-sampling process, which brings a huge amount of extra and unnecessary computational overhead.
We propose a new convolution operator named spatial pruned sparse convolution (SPS-Conv), which includes two variants, spatial pruned submanifold sparse convolution (SPSS-Conv) and spatial pruned regular sparse convolution (SPRS
arXiv Detail & Related papers (2022-09-28T16:19:06Z)
- CloudAttention: Efficient Multi-Scale Attention Scheme For 3D Point Cloud Learning [81.85951026033787]
We set transformers in this work and incorporate them into a hierarchical framework for shape classification and part and scene segmentation.
We also compute efficient and dynamic global cross attentions by leveraging sampling and grouping at each iteration.
The proposed hierarchical model achieves state-of-the-art shape classification in mean accuracy and yields results on par with the previous segmentation methods.
arXiv Detail & Related papers (2022-07-31T21:39:15Z)
- Focal Sparse Convolutional Networks for 3D Object Detection [121.45950754511021]
We introduce two new modules to enhance the capability of Sparse CNNs.
They are focal sparse convolution (Focals Conv) and its multi-modal variant of focal sparse convolution with fusion.
For the first time, we show that spatially learnable sparsity in sparse convolution is essential for sophisticated 3D object detection.
arXiv Detail & Related papers (2022-04-26T17:34:10Z)
- Dynamic Convolution for 3D Point Cloud Instance Segmentation [146.7971476424351]
We propose an approach to instance segmentation from 3D point clouds based on dynamic convolution.
We gather homogeneous points that have identical semantic categories and close votes for the geometric centroids.
The proposed approach is proposal-free, and instead exploits a convolution process that adapts to the spatial and semantic characteristics of each instance.
arXiv Detail & Related papers (2021-07-18T09:05:16Z)
- Learning Semantic Segmentation of Large-Scale Point Clouds with Random Sampling [52.464516118826765]
We introduce RandLA-Net, an efficient and lightweight neural architecture to infer per-point semantics for large-scale point clouds.
The key to our approach is to use random point sampling instead of more complex point selection approaches.
Our RandLA-Net can process 1 million points in a single pass up to 200x faster than existing approaches.
arXiv Detail & Related papers (2021-07-06T05:08:34Z)
- SparsePipe: Parallel Deep Learning for 3D Point Clouds [7.181267620981419]
SparsePipe is built to support 3D sparse data such as point clouds.
It exploits intra-batch parallelism that partitions input data into multiple processors.
We show that SparsePipe can parallelize effectively and obtain better performance on current point cloud benchmarks.
arXiv Detail & Related papers (2020-12-27T01:47:09Z)
- DV-ConvNet: Fully Convolutional Deep Learning on Point Clouds with Dynamic Voxelization and 3D Group Convolution [0.7340017786387767]
3D point cloud interpretation is a challenging task due to the randomness and sparsity of the component points.
In this work, we draw attention back to standard 3D convolutions for efficient 3D point cloud interpretation.
Our network runs and converges considerably fast, while yielding on-par or even better performance compared with state-of-the-art methods on several benchmark datasets.
arXiv Detail & Related papers (2020-09-07T07:45:05Z)
This list is automatically generated from the titles and abstracts of the papers in this site.