SWCF-Net: Similarity-weighted Convolution and Local-global Fusion for Efficient Large-scale Point Cloud Semantic Segmentation
- URL: http://arxiv.org/abs/2406.11441v1
- Date: Mon, 17 Jun 2024 11:54:46 GMT
- Title: SWCF-Net: Similarity-weighted Convolution and Local-global Fusion for Efficient Large-scale Point Cloud Semantic Segmentation
- Authors: Zhenchao Lin, Li He, Hongqiang Yang, Xiaoqun Sun, Cuojin Zhang, Weinan Chen, Yisheng Guan, Hong Zhang
- Abstract summary: We propose a Similarity-Weighted Convolution and local-global Fusion Network, named SWCF-Net.
Our method achieves a competitive result with less computational cost, and is able to handle large-scale point clouds efficiently.
- Score: 10.328077317786342
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Large-scale point clouds consist of a multitude of individual objects and thus encompass rich structural and semantic contextual information, which makes segmenting them efficiently a challenging problem. Most existing research focuses on capturing intricate local features without giving due consideration to global ones, thus failing to leverage semantic context. In this paper, we propose a Similarity-Weighted Convolution and local-global Fusion Network, named SWCF-Net, which takes into account both local and global features. We propose a Similarity-Weighted Convolution (SWConv) to effectively extract local features, where similarity weights are incorporated into the convolution operation to enhance generalization. Then, we apply a downsampling operation to the K and V channels within the attention module, reducing the quadratic complexity to linear and enabling the Transformer to deal with large-scale point clouds. Finally, orthogonal components are extracted from the global features and then aggregated with the local features, eliminating redundant information between local and global features and consequently improving efficiency. We evaluate SWCF-Net on the large-scale outdoor datasets SemanticKITTI and Toronto3D. Our experimental results demonstrate the effectiveness of the proposed network. Our method achieves a competitive result with less computational cost, and is able to handle large-scale point clouds efficiently.
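The two global-feature steps in the abstract (attention with downsampled K and V, and fusion of local features with the orthogonal part of the global features) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract does not say how K and V are downsampled or how the orthogonal components are computed, so the random subsampling, the per-point vector projection, the concatenation in `fuse`, and all function names here are assumptions made for illustration.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def downsampled_kv_attention(q, k, v, m, rng):
    """Attention over n queries with K and V reduced to m rows.

    Random subsampling is an assumption; the score matrix is (n, m),
    so the cost grows linearly in n for a fixed m.
    """
    n, d = k.shape
    idx = rng.choice(n, size=m, replace=False)
    k_s, v_s = k[idx], v[idx]
    scores = q @ k_s.T / np.sqrt(d)   # shape (n, m)
    return softmax(scores) @ v_s      # shape (n, d_v)

def orthogonal_component(global_feat, local_feat, eps=1e-8):
    """Per point, remove from the global feature its projection onto the local feature."""
    dot = (global_feat * local_feat).sum(axis=1, keepdims=True)
    norm_sq = (local_feat * local_feat).sum(axis=1, keepdims=True)
    return global_feat - dot / (norm_sq + eps) * local_feat

def fuse(local_feat, global_feat):
    # Concatenation is an assumed aggregation; the abstract only says "aggregated".
    return np.concatenate(
        [local_feat, orthogonal_component(global_feat, local_feat)], axis=1
    )
```

With `n` points and `m` sampled keys, the attention score matrix has `n * m` entries instead of `n * n`, which is the sense in which fixing `m` makes the cost linear in the number of points. The projection step leaves a global residual with (near) zero dot product against the local feature at each point, so the fused vector does not carry the same direction twice.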
Related papers
- Enhanced Semantic Segmentation for Large-Scale and Imbalanced Point Clouds [6.253217784798542]
Small-sized objects are prone to be under-sampled or misclassified due to their low occurrence frequency.
We propose the Multilateral Cascading Network (MCNet) for large-scale and sample-imbalanced point cloud scenes.
arXiv Detail & Related papers (2024-09-21T02:23:01Z)
- ELGC-Net: Efficient Local-Global Context Aggregation for Remote Sensing Change Detection [65.59969454655996]
We propose an efficient change detection framework, ELGC-Net, which leverages rich contextual information to precisely estimate change regions.
Our proposed ELGC-Net sets a new state-of-the-art performance in remote sensing change detection benchmarks.
We also introduce ELGC-Net-LW, a lighter variant with significantly reduced computational complexity, suitable for resource-constrained settings.
arXiv Detail & Related papers (2024-03-26T17:46:25Z)
- APPT: Asymmetric Parallel Point Transformer for 3D Point Cloud Understanding [20.87092793669536]
Transformer-based networks have achieved impressive performance in 3D point cloud understanding.
To tackle these problems, we propose the Asymmetric Parallel Point Transformer (APPT).
APPT is able to capture features globally throughout the entire network while focusing on local-detailed features.
arXiv Detail & Related papers (2023-03-31T06:11:02Z)
- LACV-Net: Semantic Segmentation of Large-Scale Point Cloud Scene via Local Adaptive and Comprehensive VLAD [13.907586081922345]
We propose an end-to-end deep neural network called LACV-Net for large-scale point cloud semantic segmentation.
The proposed network contains three main components: 1) a local adaptive feature augmentation module (LAFA) to adaptively learn the similarity of centroids and neighboring points to augment the local context; 2) a comprehensive VLAD module that fuses local features with multi-layer, multi-scale, and multi-resolution to represent a comprehensive global description vector; and 3) an aggregation loss function to effectively optimize the segmentation boundaries by constraining the adaptive weight from the LAFA module.
arXiv Detail & Related papers (2022-10-12T02:11:00Z)
- CloudAttention: Efficient Multi-Scale Attention Scheme For 3D Point Cloud Learning [81.85951026033787]
We set transformers in this work and incorporate them into a hierarchical framework for shape classification and part and scene segmentation.
We also compute efficient and dynamic global cross attentions by leveraging sampling and grouping at each iteration.
The proposed hierarchical model achieves state-of-the-art shape classification in mean accuracy and yields results on par with the previous segmentation methods.
arXiv Detail & Related papers (2022-07-31T21:39:15Z)
- Multi-scale Network with Attentional Multi-resolution Fusion for Point Cloud Semantic Segmentation [2.964101313270572]
We present a comprehensive point cloud semantic segmentation network that aggregates both local and global multi-scale information.
We introduce an Angle Correlation Point Convolution module to effectively learn the local shapes of points.
Third, inspired by HRNet which has excellent performance on 2D image vision tasks, we build an HRNet customized for point cloud to learn global multi-scale context.
arXiv Detail & Related papers (2022-06-27T21:03:33Z)
- Conformer: Local Features Coupling Global Representations for Visual Recognition [72.9550481476101]
We propose a hybrid network structure, termed Conformer, to take advantage of convolutional operations and self-attention mechanisms for enhanced representation learning.
Experiments show that Conformer, under the comparable parameter complexity, outperforms the visual transformer (DeiT-B) by 2.3% on ImageNet.
arXiv Detail & Related papers (2021-05-09T10:00:03Z)
- Learning to Predict Context-adaptive Convolution for Semantic Segmentation [66.27139797427147]
Long-range contextual information is essential for achieving high-performance semantic segmentation.
We propose a Context-adaptive Convolution Network (CaC-Net) to predict a spatially-varying feature weighting vector.
Our CaC-Net achieves superior segmentation performance on three public datasets.
arXiv Detail & Related papers (2020-04-17T13:09:17Z)
- LRC-Net: Learning Discriminative Features on Point Clouds by Encoding Local Region Contexts [65.79931333193016]
We present a novel Local-Region-Context Network (LRC-Net) to learn discriminative features on point clouds.
LRC-Net encodes fine-grained contexts inside and among local regions simultaneously.
Results show LRC-Net is competitive with state-of-the-art methods in shape classification and shape segmentation applications.
arXiv Detail & Related papers (2020-03-18T14:34:08Z)
- Dense Residual Network: Enhancing Global Dense Feature Flow for Character Recognition [75.4027660840568]
This paper explores how to enhance the local and global dense feature flow by exploiting hierarchical features fully from all the convolution layers.
Technically, we propose an efficient and effective CNN framework, i.e., Fast Dense Residual Network (FDRN) for text recognition.
arXiv Detail & Related papers (2020-01-23T06:55:08Z)
This list is automatically generated from the titles and abstracts of the papers in this site.