Enhanced Semantic Segmentation for Large-Scale and Imbalanced Point Clouds
- URL: http://arxiv.org/abs/2409.13983v1
- Date: Sat, 21 Sep 2024 02:23:01 GMT
- Title: Enhanced Semantic Segmentation for Large-Scale and Imbalanced Point Clouds
- Authors: Haoran Gong, Haodong Wang, Di Wang,
- Abstract summary: Small-sized objects are prone to be under-sampled or misclassified due to their low occurrence frequency.
We propose the Multilateral Cascading Network (MCNet) for large-scale and sample-imbalanced point cloud scenes.
- Score: 6.253217784798542
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Semantic segmentation of large-scale point clouds is of significant importance in environment perception and scene understanding. However, point clouds collected from real-world environments are usually imbalanced and small-sized objects are prone to be under-sampled or misclassified due to their low occurrence frequency, thereby reducing the overall accuracy of semantic segmentation. In this study, we propose the Multilateral Cascading Network (MCNet) for large-scale and sample-imbalanced point cloud scenes. To increase the frequency of small-sized objects, we introduce the semantic-weighted sampling module, which incorporates a probability parameter into the collected data group. To facilitate feature learning, we propose a Multilateral Cascading Attention Enhancement (MCAE) module to learn complex local features through multilateral cascading operations and attention mechanisms. To promote feature fusion, we propose a Point Cross Stage Partial (P-CSP) module to combine global and local features, optimizing the integration of valuable feature information across multiple scales. Finally, we introduce the neighborhood voting module to integrate results at the output layer. Our proposed method demonstrates either competitive or superior performance relative to state-of-the-art approaches across three widely recognized benchmark datasets: S3DIS, Toronto3D, and SensatUrban, with mIoU scores of 74.0%, 82.9% and 64.5%, respectively. Notably, our work yielded consistent optimal results on the under-sampled semantic categories, thereby demonstrating exceptional performance in the recognition of small-sized objects.
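The abstract names three ideas that are easy to illustrate on toy data: sampling that favours rare classes, voting among neighbouring points at the output, and the mIoU metric used to report results. The sketch below is not MCNet's implementation. It assumes inverse-class-frequency weights as a stand-in for the "probability parameter", a brute-force k-nearest-neighbour majority vote as a stand-in for the neighborhood voting module, and the usual per-class intersection-over-union average for mIoU. All function names (`class_sampling_weights`, `weighted_sample`, `neighborhood_vote`, `mean_iou`) are hypothetical.

```python
import random
from collections import Counter


def class_sampling_weights(labels):
    """Give each point a weight inversely proportional to its class frequency.

    Assumption: inverse-frequency weighting, not the paper's exact scheme.
    """
    counts = Counter(labels)
    raw = [1.0 / counts[label] for label in labels]
    total = sum(raw)
    return [w / total for w in raw]  # normalised so the weights sum to 1


def weighted_sample(labels, k, seed=0):
    """Draw k point indices (with replacement) using the class-balanced weights."""
    rng = random.Random(seed)
    weights = class_sampling_weights(labels)
    return rng.choices(range(len(labels)), weights=weights, k=k)


def neighborhood_vote(points, predictions, k=3):
    """Relabel each point with the majority prediction among its k nearest
    points (the point itself included). Brute force, O(n^2) distance checks."""
    voted = []
    for p in points:
        def sq_dist(j):
            return sum((a - b) ** 2 for a, b in zip(p, points[j]))

        nearest = sorted(range(len(points)), key=sq_dist)[:k]
        votes = Counter(predictions[j] for j in nearest)
        voted.append(votes.most_common(1)[0][0])
    return voted


def mean_iou(y_true, y_pred, num_classes):
    """Average of per-class IoU = TP / (TP + FP + FN) over classes that occur."""
    ious = []
    for c in range(num_classes):
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        union = tp + fp + fn
        if union:  # skip classes absent from both truth and prediction
            ious.append(tp / union)
    return sum(ious) / len(ious) if ious else 0.0


if __name__ == "__main__":
    # Toy scene: 8 "ground" points and 2 "pole" points (a rare, small object).
    labels = ["ground"] * 8 + ["pole"] * 2
    print(class_sampling_weights(labels))
    print(weighted_sample(labels, k=5, seed=1))

    # Two spatial clusters with one mislabelled point in the first cluster.
    points = [(0, 0, 0), (1, 0, 0), (2, 0, 0), (10, 0, 0), (11, 0, 0), (12, 0, 0)]
    truth = [0, 0, 0, 1, 1, 1]
    pred = [0, 1, 0, 1, 1, 1]
    print(mean_iou(truth, pred, 2))
    print(mean_iou(truth, neighborhood_vote(points, pred, k=3), 2))
```

With inverse-frequency weights, each class receives the same total sampling probability regardless of how many points it has, so the rare class is no longer drowned out. In the toy scene the isolated wrong label is outvoted by its two neighbours. A practical pipeline would replace the brute-force neighbour search with a spatial index such as a k-d tree.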
Related papers
- PVAFN: Point-Voxel Attention Fusion Network with Multi-Pooling Enhancing for 3D Object Detection [59.355022416218624]
The integration of point and voxel representations is becoming more common in LiDAR-based 3D object detection.
We propose a novel two-stage 3D object detector, called Point-Voxel Attention Fusion Network (PVAFN).
PVAFN uses a multi-pooling strategy to integrate both multi-scale and region-specific information effectively.
arXiv Detail & Related papers (2024-08-26T19:43:01Z) - SWCF-Net: Similarity-weighted Convolution and Local-global Fusion for Efficient Large-scale Point Cloud Semantic Segmentation [10.328077317786342]
We propose a Similarity-Weighted Convolution and local-global Fusion Network, named SWCF-Net.
Our method achieves a competitive result with less computational cost, and is able to handle large-scale point clouds efficiently.
arXiv Detail & Related papers (2024-06-17T11:54:46Z) - LACV-Net: Semantic Segmentation of Large-Scale Point Cloud Scene via Local Adaptive and Comprehensive VLAD [13.907586081922345]
We propose an end-to-end deep neural network called LACV-Net for large-scale point cloud semantic segmentation.
The proposed network contains three main components: 1) a local adaptive feature augmentation module (LAFA) to adaptively learn the similarity of centroids and neighboring points to augment the local context; 2) a comprehensive VLAD module that fuses multi-layer, multi-scale, and multi-resolution local features into a comprehensive global description vector; and 3) an aggregation loss function to effectively optimize the segmentation boundaries by constraining the adaptive weight from the LAFA module.
arXiv Detail & Related papers (2022-10-12T02:11:00Z) - SUNet: Scale-aware Unified Network for Panoptic Segmentation [25.626882426111198]
We propose two lightweight modules to mitigate the problem of segmenting objects of various scales.
We present an end-to-end Scale-aware Unified Network (SUNet) which is more adaptable to multi-scale objects.
arXiv Detail & Related papers (2022-09-07T01:40:41Z) - CloudAttention: Efficient Multi-Scale Attention Scheme For 3D Point Cloud Learning [81.85951026033787]
We use transformers in this work and incorporate them into a hierarchical framework for shape classification and part and scene segmentation.
We also compute efficient and dynamic global cross attentions by leveraging sampling and grouping at each iteration.
The proposed hierarchical model achieves state-of-the-art shape classification in mean accuracy and yields results on par with the previous segmentation methods.
arXiv Detail & Related papers (2022-07-31T21:39:15Z) - Multi-scale Network with Attentional Multi-resolution Fusion for Point Cloud Semantic Segmentation [2.964101313270572]
We present a comprehensive point cloud semantic segmentation network that aggregates both local and global multi-scale information.
We introduce an Angle Correlation Point Convolution module to effectively learn the local shapes of points.
Inspired by HRNet, which performs well on 2D image vision tasks, we build an HRNet customized for point clouds to learn global multi-scale context.
arXiv Detail & Related papers (2022-06-27T21:03:33Z) - Semantic Attention and Scale Complementary Network for Instance Segmentation in Remote Sensing Images [54.08240004593062]
We propose an end-to-end multi-category instance segmentation model, which consists of a Semantic Attention (SEA) module and a Scale Complementary Mask Branch (SCMB).
The SEA module contains a simple fully convolutional semantic segmentation branch with extra supervision to strengthen the activation of interest instances on the feature map.
SCMB extends the original single mask branch to trident mask branches and introduces complementary mask supervision at different scales.
arXiv Detail & Related papers (2021-07-25T08:53:59Z) - Learning Semantic Segmentation of Large-Scale Point Clouds with Random Sampling [52.464516118826765]
We introduce RandLA-Net, an efficient and lightweight neural architecture to infer per-point semantics for large-scale point clouds.
The key to our approach is to use random point sampling instead of more complex point selection approaches.
Our RandLA-Net can process 1 million points in a single pass up to 200x faster than existing approaches.
arXiv Detail & Related papers (2021-07-06T05:08:34Z) - Multi-scale Interactive Network for Salient Object Detection [91.43066633305662]
We propose the aggregate interaction modules to integrate the features from adjacent levels.
To obtain more efficient multi-scale features, the self-interaction modules are embedded in each decoder unit.
Experimental results on five benchmark datasets demonstrate that the proposed method without any post-processing performs favorably against 23 state-of-the-art approaches.
arXiv Detail & Related papers (2020-07-17T15:41:37Z) - Multi-Person Pose Estimation with Enhanced Feature Aggregation and Selection [33.15192824888279]
We propose a novel Enhanced Feature Aggregation and Selection network (EFASNet) for multi-person 2D human pose estimation.
Our method can well handle crowded, cluttered and occluded scenes.
Comprehensive experiments demonstrate that the proposed approach outperforms the state-of-the-art methods.
arXiv Detail & Related papers (2020-03-20T08:33:25Z) - Crowd Counting via Hierarchical Scale Recalibration Network [61.09833400167511]
We propose a novel Hierarchical Scale Recalibration Network (HSRNet) to tackle the task of crowd counting.
HSRNet models rich contextual dependencies and recalibrates multiple scale-associated information.
Our approach can ignore various noises selectively and focus on appropriate crowd scales automatically.
arXiv Detail & Related papers (2020-03-07T10:06:47Z)