Position-Guided Point Cloud Panoptic Segmentation Transformer
- URL: http://arxiv.org/abs/2303.13509v1
- Date: Thu, 23 Mar 2023 17:59:02 GMT
- Title: Position-Guided Point Cloud Panoptic Segmentation Transformer
- Authors: Zeqi Xiao, Wenwei Zhang, Tai Wang, Chen Change Loy, Dahua Lin,
Jiangmiao Pang
- Abstract summary: This work begins by applying this appealing paradigm to LiDAR-based point cloud segmentation and obtains a simple yet effective baseline.
We observe that instances in sparse point clouds are small relative to the whole scene and often share similar geometry while lacking the distinctive appearance needed for segmentation, conditions that are rare in the image domain.
The method, named Position-guided Point cloud Panoptic segmentation transFormer (P3Former), outperforms previous state-of-the-art methods by 3.4% and 1.2% PQ on the SemanticKITTI and nuScenes benchmarks, respectively.
- Score: 118.17651196656178
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: DEtection TRansformer (DETR) started a trend that uses a group of learnable
queries for unified visual perception. This work begins by applying this
appealing paradigm to LiDAR-based point cloud segmentation and obtains a simple
yet effective baseline. Although the naive adaptation obtains fair results, the
instance segmentation performance is noticeably inferior to previous works.
Diving into the details, we observe that instances in sparse point clouds are
small relative to the whole scene and often share similar geometry while
lacking the distinctive appearance needed for segmentation, conditions that are
rare in the image domain. Since instances in 3D are characterized more by their
positional information, we emphasize its role during modeling and design a robust
Mixed-parameterized Positional Embedding (MPE) to guide the segmentation
process. It is embedded into backbone features and later guides the mask
prediction and query update processes iteratively, leading to Position-Aware
Segmentation (PA-Seg) and Masked Focal Attention (MFA). All these designs impel
the queries to attend to specific regions and identify various instances. The
method, named Position-guided Point cloud Panoptic segmentation transFormer
(P3Former), outperforms previous state-of-the-art methods by 3.4% and 1.2% PQ
on the SemanticKITTI and nuScenes benchmarks, respectively. The source code and
models are available at https://github.com/SmartBot-PJLab/P3Former .
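The abstract does not spell out how the Mixed-parameterized Positional Embedding is computed; the following is an illustrative sketch only, assuming a mix of Cartesian and polar parameterizations encoded with Fourier features (an assumption, not the paper's actual formulation), plus the general DETR-style step in which queries produce mask logits against position-aware point features.

```python
# Hedged sketch of a mixed-parameterized positional embedding for LiDAR points.
# The exact MPE design in P3Former is not given in the abstract; this sketch
# assumes Fourier features over both Cartesian (x, y, z) and polar (range,
# azimuth) coordinates, which is one plausible "mixed" parameterization.
import numpy as np

def mixed_positional_embedding(points, num_freqs=4):
    """points: (N, 3) array of x, y, z coordinates -> (N, 5 * 2 * num_freqs)."""
    x, y, z = points[:, 0], points[:, 1], points[:, 2]
    # Polar re-parameterization: range and azimuth, natural for spinning LiDAR.
    r = np.sqrt(x**2 + y**2)
    theta = np.arctan2(y, x)
    coords = np.stack([x, y, z, r, theta], axis=1)        # (N, 5)
    freqs = 2.0 ** np.arange(num_freqs)                   # geometric frequency ladder
    angles = coords[:, :, None] * freqs[None, None, :]    # (N, 5, F)
    emb = np.concatenate([np.sin(angles), np.cos(angles)], axis=-1)
    return emb.reshape(len(points), -1)

def predict_masks(queries, point_features):
    """DETR-style mask prediction: each query scores every point.
    queries: (Q, D), point_features: (N, D) -> (Q, N) mask logits."""
    return queries @ point_features.T
```

In this sketch, embedding the positional code into the per-point features is what lets each query's mask logits localize by position rather than by appearance alone, which is the intuition the abstract gives for PA-Seg.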
Related papers
- Rethinking Few-shot 3D Point Cloud Semantic Segmentation [62.80639841429669]
This paper revisits few-shot 3D point cloud semantic segmentation (FS-PCS).
We focus on two significant issues in the state-of-the-art: foreground leakage and sparse point distribution.
To address these issues, we introduce a standardized FS-PCS setting, upon which a new benchmark is built.
arXiv Detail & Related papers (2024-03-01T15:14:47Z)
- Dynamic Prototype Adaptation with Distillation for Few-shot Point Cloud Segmentation [32.494146296437656]
Few-shot point cloud segmentation seeks to generate per-point masks for previously unseen categories.
We present dynamic prototype adaptation (DPA), which explicitly learns task-specific prototypes for each query point cloud.
arXiv Detail & Related papers (2024-01-29T11:00:46Z)
- EipFormer: Emphasizing Instance Positions in 3D Instance Segmentation [51.996943482875366]
We present a novel Transformer-based architecture, EipFormer, which comprises progressive aggregation and dual position embedding.
EipFormer achieves superior or comparable performance compared to state-of-the-art approaches.
arXiv Detail & Related papers (2023-12-09T16:08:47Z)
- CPCM: Contextual Point Cloud Modeling for Weakly-supervised Point Cloud Semantic Segmentation [60.0893353960514]
We study the task of weakly-supervised point cloud semantic segmentation with sparse annotations.
We propose a Contextual Point Cloud Modeling (CPCM) method that consists of two parts: a region-wise masking (RegionMask) strategy and a contextual masked training (CMT) method.
arXiv Detail & Related papers (2023-07-19T04:41:18Z)
- PSGformer: Enhancing 3D Point Cloud Instance Segmentation via Precise Semantic Guidance [11.097083846498581]
PSGformer is a novel 3D instance segmentation network.
It incorporates two key advancements to enhance the performance of 3D instance segmentation.
It exceeds state-of-the-art methods by 2.2% mAP on the ScanNetv2 hidden test set.
arXiv Detail & Related papers (2023-07-15T04:45:37Z)
- Prototype Adaption and Projection for Few- and Zero-shot 3D Point Cloud Semantic Segmentation [30.18333233940194]
We address the challenging task of few-shot and zero-shot 3D point cloud semantic segmentation.
Our proposed method surpasses state-of-the-art algorithms by a considerable 7.90% and 14.82% under the 2-way 1-shot setting on the S3DIS and ScanNet benchmarks, respectively.
arXiv Detail & Related papers (2023-05-23T17:58:05Z)
- Few-Shot 3D Point Cloud Semantic Segmentation via Stratified Class-Specific Attention Based Transformer Network [22.9434434107516]
We develop a new multi-layer transformer network for few-shot point cloud semantic segmentation.
Our method achieves new state-of-the-art performance, with 15% less inference time, over existing few-shot 3D point cloud segmentation models.
arXiv Detail & Related papers (2023-03-28T00:27:54Z)
- Point Cloud Recognition with Position-to-Structure Attention Transformers [24.74805434602145]
Position-to-Structure Attention Transformers (PS-Former) is a Transformer-based algorithm for 3D point cloud recognition.
PS-Former addresses the challenge in 3D point cloud representation that points are not positioned in a fixed grid structure.
PS-Former demonstrates competitive results on three 3D point cloud tasks: classification, part segmentation, and scene segmentation.
arXiv Detail & Related papers (2022-10-05T05:40:33Z)
- SE(3)-Equivariant Attention Networks for Shape Reconstruction in Function Space [50.14426188851305]
We propose the first SE(3)-equivariant coordinate-based network for learning occupancy fields from point clouds.
In contrast to previous shape reconstruction methods that align the input to a regular grid, we operate directly on the irregular, unoriented point cloud.
We show that our method outperforms previous SO(3)-equivariant methods, as well as non-equivariant methods trained on SO(3)-augmented datasets.
arXiv Detail & Related papers (2022-04-05T17:59:15Z)
- UPDesc: Unsupervised Point Descriptor Learning for Robust Registration [54.95201961399334]
UPDesc is an unsupervised method for learning point descriptors for robust point cloud registration.
We show that our learned descriptors yield superior performance over existing unsupervised methods.
arXiv Detail & Related papers (2021-08-05T17:11:08Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.