Multilateral Cascading Network for Semantic Segmentation of Large-Scale Outdoor Point Clouds
- URL: http://arxiv.org/abs/2409.13983v2
- Date: Sun, 15 Dec 2024 05:18:36 GMT
- Title: Multilateral Cascading Network for Semantic Segmentation of Large-Scale Outdoor Point Clouds
- Authors: Haoran Gong, Haodong Wang, Di Wang,
- Abstract summary: The Multilateral Cascading Network (MCNet) is designed to address the challenge of semantic segmentation of large-scale outdoor point clouds.
MCNet comprises two key components: a Multilateral Cascading Attention Enhancement (MCAE) module, and a Point Cross Stage Partial (P-CSP) module.
Our results surpassed the current best result by 2.1% in overall mIoU and yielded an improvement of 15.9% on average for small-sample object categories.
- Score: 6.253217784798542
- License:
- Abstract: Semantic segmentation of large-scale outdoor point clouds is of significant importance in environment perception and scene understanding. However, this task continues to present a significant research challenge, due to the inherent complexity of outdoor objects and their diverse distributions in real-world environments. In this study, we propose the Multilateral Cascading Network (MCNet) designed to address this challenge. The model comprises two key components: a Multilateral Cascading Attention Enhancement (MCAE) module, which facilitates the learning of complex local features through multilateral cascading operations; and a Point Cross Stage Partial (P-CSP) module, which fuses global and local features, thereby optimizing the integration of valuable feature information across multiple scales. Our proposed method demonstrates superior performance relative to state-of-the-art approaches across two widely recognized benchmark datasets: Toronto3D and SensatUrban. Especially on the city-scale SensatUrban dataset, our results surpassed the current best result by 2.1% in overall mIoU and yielded an improvement of 15.9% on average for small-sample object categories comprising less than 2% of the total samples, in comparison to the baseline method.
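The mIoU figures quoted in the abstract can be made concrete with a small worked example. The following is a minimal Python sketch, assuming the standard per-class intersection-over-union definition, IoU = TP / (TP + FP + FN), averaged over classes; it is not the authors' evaluation code. The function names and the toy labels are hypothetical, and the 2% threshold is taken from the abstract's description of small-sample categories.

```python
from collections import Counter


def per_class_iou(y_true, y_pred, num_classes):
    """Return one IoU per class; None if the class never appears."""
    tp = [0] * num_classes
    fp = [0] * num_classes
    fn = [0] * num_classes
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class p gains a false positive
            fn[t] += 1  # true class t gains a false negative
    ious = []
    for c in range(num_classes):
        denom = tp[c] + fp[c] + fn[c]
        ious.append(tp[c] / denom if denom else None)
    return ious


def mean_iou(ious):
    """Average IoU over the classes that actually occur."""
    valid = [v for v in ious if v is not None]
    return sum(valid) / len(valid)


def small_sample_classes(y_true, num_classes, threshold=0.02):
    """Classes whose share of all points is below `threshold` (assumed 2%)."""
    counts = Counter(y_true)
    total = len(y_true)
    return [c for c in range(num_classes) if 0 < counts[c] / total < threshold]


# Hypothetical toy data: 100 points, 3 classes, class 2 has a single point.
labels = [0] * 60 + [1] * 39 + [2] * 1
preds = [0] * 55 + [1] * 5 + [1] * 39 + [1] * 1

ious = per_class_iou(labels, preds, num_classes=3)
accuracy = sum(t == p for t, p in zip(labels, preds)) / len(labels)

print("per-class IoU:", ious)
print("mIoU:", mean_iou(ious))
print("overall accuracy:", accuracy)
print("small-sample classes:", small_sample_classes(labels, 3))
```

In this toy case 94 of 100 points are labeled correctly, yet the single-point class contributes an IoU of 0 and pulls the mean down to roughly 0.59, which illustrates why gains on small-sample categories matter for overall mIoU.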
Related papers
- PVAFN: Point-Voxel Attention Fusion Network with Multi-Pooling Enhancing for 3D Object Detection [59.355022416218624]
The integration of point and voxel representations is becoming more common in LiDAR-based 3D object detection.
We propose a novel two-stage 3D object detector, called Point-Voxel Attention Fusion Network (PVAFN).
PVAFN uses a multi-pooling strategy to integrate both multi-scale and region-specific information effectively.
arXiv Detail & Related papers (2024-08-26T19:43:01Z)
- SWCF-Net: Similarity-weighted Convolution and Local-global Fusion for Efficient Large-scale Point Cloud Semantic Segmentation [10.328077317786342]
We propose a Similarity-Weighted Convolution and local-global Fusion Network, named SWCF-Net.
Our method achieves a competitive result with less computational cost, and is able to handle large-scale point clouds efficiently.
arXiv Detail & Related papers (2024-06-17T11:54:46Z)
- Cross-City Matters: A Multimodal Remote Sensing Benchmark Dataset for Cross-City Semantic Segmentation using High-Resolution Domain Adaptation Networks [82.82866901799565]
We build a new set of multimodal remote sensing benchmark datasets (including hyperspectral, multispectral, SAR) for studying the cross-city semantic segmentation task.
Beyond the single city, we propose a high-resolution domain adaptation network, HighDAN, to improve the AI model's generalization ability across multi-city environments.
HighDAN is capable of retaining the spatially topological structure of the studied urban scene well in a parallel high-to-low resolution fusion fashion.
arXiv Detail & Related papers (2023-09-26T23:55:39Z)
- Coupling Global Context and Local Contents for Weakly-Supervised Semantic Segmentation [54.419401869108846]
We propose a single-stage Weakly-Supervised Semantic Segmentation (WSSS) model with only image-level class label supervision.
A flexible context aggregation module is proposed to capture the global object context in different granular spaces.
A semantically consistent feature fusion module is proposed in a bottom-up parameter-learnable fashion to aggregate the fine-grained local contents.
arXiv Detail & Related papers (2023-04-18T15:29:23Z)
- DuAT: Dual-Aggregation Transformer Network for Medical Image Segmentation [21.717520350930705]
Transformer-based models have been widely demonstrated to be successful in computer vision tasks.
However, they are often dominated by features of large patterns leading to the loss of local details.
We propose a Dual-Aggregation Transformer Network called DuAT, which is characterized by two innovative designs.
Our proposed model outperforms state-of-the-art methods in the segmentation of skin lesion images, and polyps in colonoscopy images.
arXiv Detail & Related papers (2022-12-21T07:54:02Z)
- LACV-Net: Semantic Segmentation of Large-Scale Point Cloud Scene via Local Adaptive and Comprehensive VLAD [13.907586081922345]
We propose an end-to-end deep neural network called LACV-Net for large-scale point cloud semantic segmentation.
The proposed network contains three main components: 1) a local adaptive feature augmentation module (LAFA) to adaptively learn the similarity of centroids and neighboring points to augment the local context; 2) a comprehensive VLAD module that fuses local features with multi-layer, multi-scale, and multi-resolution to represent a comprehensive global description vector; and 3) an aggregation loss function to effectively optimize the segmentation boundaries by constraining the adaptive weight from the LAFA module.
arXiv Detail & Related papers (2022-10-12T02:11:00Z)
- SUNet: Scale-aware Unified Network for Panoptic Segmentation [25.626882426111198]
We propose two lightweight modules to mitigate the problem of segmenting objects of various scales.
We present an end-to-end Scale-aware Unified Network (SUNet) which is more adaptable to multi-scale objects.
arXiv Detail & Related papers (2022-09-07T01:40:41Z)
- Multi-scale Network with Attentional Multi-resolution Fusion for Point Cloud Semantic Segmentation [2.964101313270572]
We present a comprehensive point cloud semantic segmentation network that aggregates both local and global multi-scale information.
We introduce an Angle Correlation Point Convolution module to effectively learn the local shapes of points.
Inspired by HRNet, which performs excellently on 2D image vision tasks, we build an HRNet customized for point clouds to learn global multi-scale context.
arXiv Detail & Related papers (2022-06-27T21:03:33Z)
- Learning Semantic Segmentation of Large-Scale Point Clouds with Random Sampling [52.464516118826765]
We introduce RandLA-Net, an efficient and lightweight neural architecture to infer per-point semantics for large-scale point clouds.
The key to our approach is to use random point sampling instead of more complex point selection approaches.
Our RandLA-Net can process 1 million points in a single pass up to 200x faster than existing approaches.
arXiv Detail & Related papers (2021-07-06T05:08:34Z)
- CARAFE++: Unified Content-Aware ReAssembly of FEatures [132.49582482421246]
We propose unified Content-Aware ReAssembly of FEatures (CARAFE++), a universal, lightweight and highly effective operator to fulfill this goal.
CARAFE++ generates adaptive kernels on-the-fly to enable instance-specific content-aware handling.
It shows consistent and substantial gains across all the tasks with negligible computational overhead.
arXiv Detail & Related papers (2020-12-07T07:34:57Z)
- Crowd Counting via Hierarchical Scale Recalibration Network [61.09833400167511]
We propose a novel Hierarchical Scale Recalibration Network (HSRNet) to tackle the task of crowd counting.
HSRNet models rich contextual dependencies and recalibrates multiple scale-associated information.
Our approach can ignore various noises selectively and focus on appropriate crowd scales automatically.
arXiv Detail & Related papers (2020-03-07T10:06:47Z)
This list is automatically generated from the titles and abstracts of the papers on this site.