Dynamic Clustering Transformer Network for Point Cloud Segmentation
- URL: http://arxiv.org/abs/2306.08073v1
- Date: Tue, 30 May 2023 01:11:05 GMT
- Title: Dynamic Clustering Transformer Network for Point Cloud Segmentation
- Authors: Dening Lu, Jun Zhou, Kyle Yilin Gao, Dilong Li, Jing Du, Linlin Xu,
Jonathan Li
- Abstract summary: We propose a novel 3D point cloud representation network, called Dynamic Clustering Transformer Network (DCTNet).
It has an encoder-decoder architecture, allowing for both local and global feature learning.
Our method was evaluated on an object-based dataset (ShapeNet), an urban navigation dataset (Toronto-3D), and a multispectral LiDAR dataset.
- Score: 23.149220817575195
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Point cloud segmentation is one of the most important tasks in computer
vision with widespread scientific, industrial, and commercial applications. The
research thereof has resulted in many breakthroughs in 3D object and scene
understanding. Previous methods typically utilized hierarchical architectures
for feature representation. However, the commonly used sampling and grouping
methods in hierarchical networks are only based on point-wise three-dimensional
coordinates, ignoring local semantic homogeneity of point clusters.
Additionally, the prevalent Farthest Point Sampling (FPS) method is often a
computational bottleneck. To address these issues, we propose a novel 3D point
cloud representation network, called Dynamic Clustering Transformer Network
(DCTNet). It has an encoder-decoder architecture, allowing for both local and
global feature learning. Specifically, we propose novel semantic feature-based
dynamic sampling and clustering methods in the encoder, which enable the model
to be aware of local semantic homogeneity for local feature aggregation.
Furthermore, in the decoder, we propose an efficient semantic feature-guided
upsampling method. Our method was evaluated on an object-based dataset
(ShapeNet), an urban navigation dataset (Toronto-3D), and a multispectral LiDAR
dataset, verifying the performance of DCTNet across a wide variety of practical
engineering applications. The inference speed of DCTNet is 3.8-16.8$\times$
faster than existing State-of-the-Art (SOTA) models on the ShapeNet dataset,
while achieving an instance-wise mIoU of $86.6\%$, the current top score. Our
method similarly outperforms previous methods on the other datasets, verifying
it as the new State-of-the-Art in point cloud segmentation.
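
The abstract names Farthest Point Sampling (FPS) as a frequent computational bottleneck. As a rough illustration of why, here is a minimal pure-Python sketch of the standard greedy FPS algorithm. It is not the DCTNet sampling method, and the function name `farthest_point_sampling` and its parameters are hypothetical, chosen only for this example. Each of the M selection steps scans all N points and depends on the previous step, so the cost grows as O(N * M) and the loop is inherently sequential.

```python
import math


def farthest_point_sampling(points, num_samples, start_index=0):
    """Return indices of `num_samples` points picked by greedy FPS.

    points: list of (x, y, z) tuples.
    Illustrative helper only; not taken from the DCTNet code base.
    """
    if not 0 < num_samples <= len(points):
        raise ValueError("num_samples must be between 1 and len(points)")
    # Distance from each point to its nearest already-selected point.
    min_dist = [math.inf] * len(points)
    selected = [start_index]
    for _ in range(num_samples - 1):
        newest = points[selected[-1]]
        # Only the newest sample can shrink a point's nearest-sample distance.
        for i, p in enumerate(points):
            d = math.dist(p, newest)
            if d < min_dist[i]:
                min_dist[i] = d
        # Greedy step: take the point farthest from the selected set.
        selected.append(max(range(len(points)), key=min_dist.__getitem__))
    return selected


# Five points on a line; start from index 0 and pick three samples.
cloud = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.1, 0.0, 0.0),
         (5.0, 0.0, 0.0), (2.5, 0.0, 0.0)]
print(farthest_point_sampling(cloud, 3))  # prints [0, 3, 4]
```

The sketch relies only on point coordinates, which is the limitation the abstract points out: the selection ignores any semantic features attached to the points.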
Related papers
- Boosting Cross-Domain Point Classification via Distilling Relational Priors from 2D Transformers [59.0181939916084]
Traditional 3D networks mainly focus on local geometric details and ignore the topological structure between local geometries.
We propose a novel Relational Priors Distillation (RPD) method to extract relational priors from transformers well-trained on massive image collections.
Experiments on the PointDA-10 and the Sim-to-Real datasets verify that the proposed method consistently achieves the state-of-the-art performance of UDA for point cloud classification.
arXiv Detail & Related papers (2024-07-26T06:29:09Z)
- PointeNet: A Lightweight Framework for Effective and Efficient Point Cloud Analysis [28.54939134635978]
PointeNet is a network designed specifically for point cloud analysis.
Our method demonstrates flexibility by seamlessly integrating with a classification/segmentation head or embedding into off-the-shelf 3D object detection networks.
Experiments on object-level datasets, including ModelNet40, ScanObjectNN, and ShapeNet, and on the scene-level dataset KITTI, demonstrate the superior performance of PointeNet over state-of-the-art methods in point cloud analysis.
arXiv Detail & Related papers (2023-12-20T03:34:48Z)
- Clustering based Point Cloud Representation Learning for 3D Analysis [80.88995099442374]
We propose a clustering based supervised learning scheme for point cloud analysis.
Unlike the current de facto scene-wise training paradigm, our algorithm conducts within-class clustering on the point embedding space.
Our algorithm shows notable improvements on famous point cloud segmentation datasets.
arXiv Detail & Related papers (2023-07-27T03:42:12Z)
- ISBNet: a 3D Point Cloud Instance Segmentation Network with Instance-aware Sampling and Box-aware Dynamic Convolution [14.88505076974645]
ISBNet is a novel method that represents instances as kernels and decodes instance masks via dynamic convolution.
We set new state-of-the-art results on ScanNetV2 (55.9), S3DIS (60.8), and STPLS3D (49.2) in terms of AP and retain fast inference time (237ms per scene on ScanNetV2).
arXiv Detail & Related papers (2023-03-01T06:06:28Z)
- SemAffiNet: Semantic-Affine Transformation for Point Cloud Segmentation [94.11915008006483]
We propose SemAffiNet for point cloud semantic segmentation.
We conduct extensive experiments on the ScanNetV2 and NYUv2 datasets.
arXiv Detail & Related papers (2022-05-26T17:00:23Z)
- CpT: Convolutional Point Transformer for 3D Point Cloud Processing [10.389972581905]
We present CpT: Convolutional point Transformer - a novel deep learning architecture for dealing with the unstructured nature of 3D point cloud data.
CpT is an improvement over existing attention-based Convolutional Neural Networks as well as previous 3D point cloud processing transformers.
Our model can serve as an effective backbone for various point cloud processing tasks when compared to the existing state-of-the-art approaches.
arXiv Detail & Related papers (2021-11-21T17:45:55Z)
- LatticeNet: Fast Spatio-Temporal Point Cloud Segmentation Using Permutohedral Lattices [27.048998326468688]
Deep convolutional neural networks (CNNs) have shown outstanding performance in the task of semantically segmenting images.
Here, we propose LatticeNet, a novel approach for 3D semantic segmentation, which takes raw point clouds as input.
We present results of 3D segmentation on multiple datasets where our method achieves state-of-the-art performance.
arXiv Detail & Related papers (2021-08-09T10:17:27Z)
- Learning point embedding for 3D data processing [2.12121796606941]
Current point-based methods are essentially spatial relationship processing networks.
Our architecture, PE-Net, learns the representation of point clouds in high-dimensional space.
Experiments show that PE-Net achieves state-of-the-art performance on multiple challenging datasets.
arXiv Detail & Related papers (2021-07-19T00:25:28Z)
- Dynamic Convolution for 3D Point Cloud Instance Segmentation [146.7971476424351]
We propose an approach to instance segmentation from 3D point clouds based on dynamic convolution.
We gather homogeneous points that have identical semantic categories and close votes for the geometric centroids.
The proposed approach is proposal-free, and instead exploits a convolution process that adapts to the spatial and semantic characteristics of each instance.
arXiv Detail & Related papers (2021-07-18T09:05:16Z)
- Learning Semantic Segmentation of Large-Scale Point Clouds with Random Sampling [52.464516118826765]
We introduce RandLA-Net, an efficient and lightweight neural architecture to infer per-point semantics for large-scale point clouds.
The key to our approach is to use random point sampling instead of more complex point selection approaches.
Our RandLA-Net can process 1 million points in a single pass up to 200x faster than existing approaches.
arXiv Detail & Related papers (2021-07-06T05:08:34Z)
- Local Grid Rendering Networks for 3D Object Detection in Point Clouds [98.02655863113154]
CNNs are powerful, but directly applying convolutions to point data after voxelizing an entire point cloud into a dense regular 3D grid is computationally costly.
We propose a novel and principled Local Grid Rendering (LGR) operation to render the small neighborhood of a subset of input points into a low-resolution 3D grid independently.
We validate LGR-Net for 3D object detection on the challenging ScanNet and SUN RGB-D datasets.
arXiv Detail & Related papers (2020-07-04T13:57:43Z)