Sparse Cross-scale Attention Network for Efficient LiDAR Panoptic
Segmentation
- URL: http://arxiv.org/abs/2201.05972v1
- Date: Sun, 16 Jan 2022 05:34:54 GMT
- Title: Sparse Cross-scale Attention Network for Efficient LiDAR Panoptic
Segmentation
- Authors: Shuangjie Xu, Rui Wan, Maosheng Ye, Xiaoyi Zou, Tongyi Cao
- Abstract summary: We present SCAN, a novel sparse cross-scale attention network to align multi-scale sparse features with global voxel-encoded attention to capture the long-range relationship of instance context.
For the surface-aggregated points, SCAN adopts a novel sparse class-agnostic representation of instance centroids, which not only maintains the sparsity of aligned features but also reduces the computation of the network through sparse convolution.
- Score: 12.61753274984776
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Two major challenges of 3D LiDAR Panoptic Segmentation (PS) are that the
points of an object are surface-aggregated, which makes long-range dependencies
hard to model, especially for large instances, and that objects are often too
close together to separate. Recent literature addresses these problems by
time-consuming grouping processes such as dual-clustering, mean-shift offsets,
etc., or by bird-eye-view (BEV) dense centroid representation that downplays
geometry. However, the long-range geometry relationship has not been
sufficiently modeled by local feature learning from the above methods. To this
end, we present SCAN, a novel sparse cross-scale attention network to first
align multi-scale sparse features with global voxel-encoded attention to
capture the long-range relationship of instance context, which can boost the
regression accuracy of the over-segmented large objects. For the
surface-aggregated points, SCAN adopts a novel sparse class-agnostic
representation of instance centroids, which can not only maintain the sparsity
of aligned features to solve the under-segmentation on small objects, but also
reduce the computation amount of the network through sparse convolution. Our
method outperforms previous methods by a large margin in the SemanticKITTI
dataset for the challenging 3D PS task, achieving 1st place with a real-time
inference speed.
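As a rough illustration of what a sparse, class-agnostic centroid representation could look like, the sketch below stores only the voxels that contain an instance centroid instead of filling a dense BEV grid. It is not the authors' implementation: the helper name `sparse_centroid_voxels`, the `voxel_size` value, and the convention that `-1` marks points without an instance are all assumptions made for this example.

```python
import numpy as np

def sparse_centroid_voxels(points, instance_ids, voxel_size=0.5):
    """Return the integer voxel coordinates that hold an instance centroid.

    Hypothetical helper: class labels are ignored (class-agnostic) and only
    occupied voxels are returned (sparse), never a dense grid.
    """
    # Assumed convention: instance id -1 means "no instance" (stuff/background).
    ids = np.unique(instance_ids[instance_ids >= 0])
    if ids.size == 0:
        return np.empty((0, 3), dtype=np.int64)
    # One centroid per instance: the mean of its points.
    centroids = np.stack([points[instance_ids == i].mean(axis=0) for i in ids])
    # Quantize the centroids to voxel indices and drop duplicates.
    coords = np.floor(centroids / voxel_size).astype(np.int64)
    return np.unique(coords, axis=0)

# Toy scene: two instances plus one background point.
points = np.array([
    [0.0, 0.0, 0.0],
    [1.0, 0.0, 0.0],
    [10.0, 10.0, 0.0],
    [10.0, 12.0, 0.0],
    [5.0, 5.0, 5.0],
])
instance_ids = np.array([0, 0, 1, 1, -1])

centroid_voxels = sparse_centroid_voxels(points, instance_ids)
print(centroid_voxels)
```

Only one coordinate row is kept per occupied centroid voxel, so the size of the representation scales with the number of instances rather than with the extent of the scene.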
Related papers
- Boosting Cross-Domain Point Classification via Distilling Relational Priors from 2D Transformers [59.0181939916084]
Traditional 3D networks mainly focus on local geometric details and ignore the topological structure between local geometries.
We propose a novel Relational Priors Distillation (RPD) method to extract relational priors from transformers that are well-trained on massive image collections.
Experiments on the PointDA-10 and the Sim-to-Real datasets verify that the proposed method consistently achieves the state-of-the-art performance of UDA for point cloud classification.
arXiv Detail & Related papers (2024-07-26T06:29:09Z)
- ClusteringSDF: Self-Organized Neural Implicit Surfaces for 3D Decomposition [32.99080359375706]
ClusteringSDF is a novel approach to achieve both segmentation and reconstruction in 3D via the neural implicit surface representation.
We introduce a highly efficient clustering mechanism for lifting the 2D labels to 3D, and experimental results on challenging scenes from the ScanNet and Replica datasets show that ClusteringSDF can achieve competitive performance.
arXiv Detail & Related papers (2024-03-21T17:59:16Z) - Contrastive Lift: 3D Object Instance Segmentation by Slow-Fast
Contrastive Fusion [110.84357383258818]
We propose a novel approach to lift 2D segments to 3D and fuse them by means of a neural field representation.
The core of our approach is a slow-fast clustering objective function, which is scalable and well-suited for scenes with a large number of objects.
Our approach outperforms the state-of-the-art on challenging scenes from the ScanNet, Hypersim, and Replica datasets.
arXiv Detail & Related papers (2023-06-07T17:57:45Z) - Spatial Pruned Sparse Convolution for Efficient 3D Object Detection [41.62839541489369]
3D scenes are dominated by a large number of background points, which are redundant for the detection task that mainly needs to focus on foreground objects.
In this paper, we analyze major components of existing 3D CNNs and find that 3D CNNs ignore the redundancy of data and further amplify it in the down-sampling process, which brings a huge amount of extra and unnecessary computational overhead.
We propose a new convolution operator named spatial pruned sparse convolution (SPS-Conv), which includes two variants, spatial pruned submanifold sparse convolution (SPSS-Conv) and spatial pruned regular sparse convolution (SPRS
arXiv Detail & Related papers (2022-09-28T16:19:06Z) - Collaborative Propagation on Multiple Instance Graphs for 3D Instance
Segmentation with Single-point Supervision [63.429704654271475]
We propose a novel weakly supervised method RWSeg that only requires labeling one object with one point.
With these sparse weak labels, we introduce a unified framework with two branches to propagate semantic and instance information.
Specifically, we propose a Cross-graph Competing Random Walks (CRW) algorithm that encourages competition among different instance graphs.
arXiv Detail & Related papers (2022-08-10T02:14:39Z) - CPSeg: Cluster-free Panoptic Segmentation of 3D LiDAR Point Clouds [2.891413712995641]
We propose a novel real-time end-to-end panoptic segmentation network for LiDAR point clouds, called CPSeg.
CPSeg comprises a shared encoder, a dual decoder, a task-aware attention module (TAM) and a cluster-free instance segmentation head.
arXiv Detail & Related papers (2021-11-02T16:44:06Z) - Learning Semantic Segmentation of Large-Scale Point Clouds with Random
Sampling [52.464516118826765]
We introduce RandLA-Net, an efficient and lightweight neural architecture to infer per-point semantics for large-scale point clouds.
The key to our approach is to use random point sampling instead of more complex point selection approaches.
Our RandLA-Net can process 1 million points in a single pass up to 200x faster than existing approaches.
arXiv Detail & Related papers (2021-07-06T05:08:34Z) - PC-RGNN: Point Cloud Completion and Graph Neural Network for 3D Object
Detection [57.49788100647103]
LiDAR-based 3D object detection is an important task for autonomous driving.
Current approaches suffer from sparse and partial point clouds of distant and occluded objects.
In this paper, we propose a novel two-stage approach, namely PC-RGNN, dealing with such challenges by two specific solutions.
arXiv Detail & Related papers (2020-12-18T18:06:43Z) - DyCo3D: Robust Instance Segmentation of 3D Point Clouds through Dynamic
Convolution [136.7261709896713]
We propose a data-driven approach that generates the appropriate convolution kernels to apply in response to the nature of the instances.
The proposed method achieves promising results on both ScanNetV2 and S3DIS.
It also improves inference speed by more than 25% over the current state-of-the-art.
arXiv Detail & Related papers (2020-11-26T14:56:57Z) - Spatial Semantic Embedding Network: Fast 3D Instance Segmentation with
Deep Metric Learning [5.699350798684963]
We propose a simple, yet efficient algorithm for 3D instance segmentation using deep metric learning.
For high-level intelligent tasks from a large scale scene, 3D instance segmentation recognizes individual instances of objects.
We demonstrate the state-of-the-art performance of our algorithm in the ScanNet 3D instance segmentation benchmark on AP score.
arXiv Detail & Related papers (2020-07-07T02:17:44Z) - Generative Sparse Detection Networks for 3D Single-shot Object Detection [43.91336826079574]
3D object detection has been widely studied due to its potential applicability to many promising areas such as robotics and augmented reality.
Yet, the sparse nature of the 3D data poses unique challenges to this task.
We propose Generative Sparse Detection Network (GSDN), a fully-convolutional single-shot sparse detection network.
arXiv Detail & Related papers (2020-06-22T15:54:24Z)