Spatial Layout Consistency for 3D Semantic Segmentation
- URL: http://arxiv.org/abs/2303.00939v1
- Date: Thu, 2 Mar 2023 03:24:21 GMT
- Title: Spatial Layout Consistency for 3D Semantic Segmentation
- Authors: Maryam Jameela, Gunho Sohn
- Abstract summary: We introduce a novel deep convolutional neural network (DCNN) technique for achieving voxel-based semantic segmentation of the ALTM's point clouds.
The suggested deep learning method, the Semantic Utility Network (SUNet), is a multi-dimensional and multi-resolution network.
Our experiments demonstrated that SUNet's spatial layout consistency and a multi-resolution feature aggregation could significantly improve performance.
- Score: 0.7614628596146599
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Due to the aged nature of much of the utility network infrastructure,
developing a robust and trustworthy computer vision system capable of
inspecting it with minimal human intervention has attracted considerable
research attention. The airborne laser terrain mapping (ALTM) system has quickly
become the central data collection system among the numerous available
sensors. Its ability to penetrate foliage with high-powered laser energy provides
wide coverage and achieves survey-grade ranging accuracy. However, the
post-data acquisition process for classifying the ALTM's dense and irregular
point clouds is a critical bottleneck that must be addressed to improve
efficiency and accuracy. We introduce a novel deep convolutional neural network
(DCNN) technique for achieving voxel-based semantic segmentation of the ALTM's
point clouds. The suggested deep learning method, the Semantic Utility Network
(SUNet), is a multi-dimensional and multi-resolution network. SUNet combines two
networks: one classifies point clouds at multi-resolution with object
categories in three dimensions and another predicts two-dimensional regional
labels distinguishing corridor regions from non-corridors. A significant
innovation of the SUNet is that it imposes spatial layout consistency on the
outcomes of voxel-based and regional segmentation results. The proposed
multi-dimensional DCNN combines hierarchical context for spatial layout
embedding with a coarse-to-fine strategy. We conducted a comprehensive ablation
study to test SUNet's performance using 67 km x 67 km of utility corridor data
at a density of 5 points per square meter. Our experiments demonstrated that
SUNet's spatial layout consistency and multi-resolution feature aggregation
significantly improve performance, outperforming the state-of-the-art (SOTA)
baseline network and achieving good per-class F1 scores: 89% for pylon, 99% for
ground, 99% for vegetation, and 98% for powerline.
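The paper ships no code, but the consistency idea can be illustrated concretely. Below is a minimal PyTorch sketch of one way a 2D corridor map could constrain 3D voxel predictions; the tensor shapes, channel conventions, and loss form are our assumptions, not the authors' implementation.

```python
import torch

def layout_consistency_loss(voxel_logits, region_logits, corridor_classes):
    """Hypothetical consistency term: penalize corridor-only classes
    (e.g. pylon, powerline) predicted in columns the 2D head labels
    as non-corridor.

    voxel_logits:  (B, C, D, H, W) per-voxel class scores
    region_logits: (B, 2, H, W)    region scores; channel 1 assumed = non-corridor
    corridor_classes: list of class indices assumed to occur only in corridors
    """
    voxel_prob = voxel_logits.softmax(dim=1)           # (B, C, D, H, W)
    non_corridor = region_logits.softmax(dim=1)[:, 1]  # (B, H, W)

    # Probability mass of corridor-only classes, averaged over each column.
    mass = voxel_prob[:, corridor_classes].sum(dim=1).mean(dim=1)  # (B, H, W)

    # Loss grows when corridor-only mass falls in non-corridor regions.
    return (mass * non_corridor).mean()
```

A term like this would be added to the usual per-voxel cross-entropy, so the 2D regional head steers the 3D head toward a plausible spatial layout.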
Related papers
- PVAFN: Point-Voxel Attention Fusion Network with Multi-Pooling Enhancing for 3D Object Detection [59.355022416218624]
The integration of point and voxel representations is becoming more common in LiDAR-based 3D object detection.
We propose a novel two-stage 3D object detector called the Point-Voxel Attention Fusion Network (PVAFN).
PVAFN uses a multi-pooling strategy to integrate both multi-scale and region-specific information effectively.
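As a rough illustration of a multi-pooling block (our guess at the general flavor, not PVAFN's actual architecture), combining max- and average-pooled region features in PyTorch:

```python
import torch
import torch.nn as nn

class MultiPooling(nn.Module):
    """Hypothetical multi-pooling block: aggregates per-point region
    features with both max- and average-pooling, then fuses them."""

    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.fuse = nn.Sequential(
            nn.Linear(2 * in_dim, out_dim), nn.ReLU(inplace=True))

    def forward(self, feats):                  # feats: (B, N, C) points in one region
        pooled_max = feats.max(dim=1).values   # (B, C) strongest responses
        pooled_avg = feats.mean(dim=1)         # (B, C) region-wide context
        return self.fuse(torch.cat([pooled_max, pooled_avg], dim=-1))
```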
arXiv Detail & Related papers (2024-08-26T19:43:01Z)
- Active search and coverage using point-cloud reinforcement learning [50.741409008225766]
This paper presents an end-to-end deep reinforcement learning solution for target search and coverage.
We show that deep hierarchical feature learning works for RL and that using farthest point sampling (FPS) reduces the number of points that must be processed.
We also show that multi-head attention for point clouds helps the agent learn faster, though it converges to the same outcome.
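Farthest point sampling itself is standard; a textbook greedy O(Nk) version in NumPy, independent of this paper's implementation, looks like this:

```python
import numpy as np

def farthest_point_sampling(points, k):
    """Greedy FPS: repeatedly pick the point farthest from the set
    chosen so far. points: (N, 3) array; returns k indices."""
    n = points.shape[0]
    chosen = np.zeros(k, dtype=np.int64)
    dist = np.full(n, np.inf)
    chosen[0] = 0                       # arbitrary seed point
    for i in range(1, k):
        gap = np.linalg.norm(points - points[chosen[i - 1]], axis=1)
        dist = np.minimum(dist, gap)    # distance to nearest chosen point
        chosen[i] = dist.argmax()       # farthest remaining point
    return chosen
```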
arXiv Detail & Related papers (2023-12-18T18:16:30Z)
- DANet: Density Adaptive Convolutional Network with Interactive Attention for 3D Point Clouds [30.54110361164338]
Local features and contextual dependencies are crucial for 3D point cloud analysis.
We propose density adaptive convolution, coined DAConv.
An interactive attention module (IAM) embeds spatial information into channel attention along different spatial directions.
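A loose PyTorch sketch of coordinate-aware channel attention in this spirit; the per-axis pooling and gating below are our simplification, not DANet's actual IAM:

```python
import torch
import torch.nn as nn

class InteractiveAttention(nn.Module):
    """Sketch: build one descriptor per spatial direction by weighting
    point features with per-axis coordinate weights, then map the three
    descriptors to channel gates. Names and shapes are assumptions."""

    def __init__(self, channels):
        super().__init__()
        self.gate = nn.Sequential(
            nn.Linear(3 * channels, channels), nn.Sigmoid())

    def forward(self, xyz, feats):          # xyz: (B, N, 3), feats: (B, N, C)
        w = torch.softmax(xyz, dim=1)       # (B, N, 3) per-axis point weights
        # One C-dim descriptor per spatial direction: (B, 3, C)
        desc = torch.einsum('bnd,bnc->bdc', w, feats)
        gates = self.gate(desc.flatten(1))  # (B, C) channel attention
        return feats * gates.unsqueeze(1)   # re-weighted features
```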
arXiv Detail & Related papers (2023-03-08T09:46:31Z)
- LACV-Net: Semantic Segmentation of Large-Scale Point Cloud Scene via Local Adaptive and Comprehensive VLAD [13.907586081922345]
We propose an end-to-end deep neural network called LACV-Net for large-scale point cloud semantic segmentation.
The proposed network contains three main components:
1) a local adaptive feature augmentation module (LAFA) that adaptively learns the similarity of centroids and neighboring points to augment the local context;
2) a comprehensive VLAD module that fuses local features at multiple layers, scales, and resolutions into a comprehensive global description vector;
3) an aggregation loss function that optimizes the segmentation boundaries by constraining the adaptive weights from the LAFA module.
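For the second component, a minimal NetVLAD-style aggregation (the general family a comprehensive VLAD module belongs to; not LACV-Net's code) can be sketched as:

```python
import torch
import torch.nn as nn

class VLADPooling(nn.Module):
    """Soft-assign point features to K learned cluster centers and sum
    the residuals to each center into a global descriptor."""

    def __init__(self, channels, num_clusters=8):
        super().__init__()
        self.centers = nn.Parameter(torch.randn(num_clusters, channels))
        self.assign = nn.Linear(channels, num_clusters)

    def forward(self, feats):                    # feats: (B, N, C)
        a = self.assign(feats).softmax(dim=-1)   # (B, N, K) soft assignments
        res = feats.unsqueeze(2) - self.centers  # residuals: (B, N, K, C)
        vlad = (a.unsqueeze(-1) * res).sum(dim=1)                 # (B, K, C)
        return nn.functional.normalize(vlad.flatten(1), dim=-1)  # (B, K*C)
```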
arXiv Detail & Related papers (2022-10-12T02:11:00Z)
- Learning Semantic Segmentation of Large-Scale Point Clouds with Random Sampling [52.464516118826765]
We introduce RandLA-Net, an efficient and lightweight neural architecture to infer per-point semantics for large-scale point clouds.
The key to our approach is to use random point sampling instead of more complex point selection approaches.
Our RandLA-Net can process 1 million points in a single pass up to 200x faster than existing approaches.
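The sampling step really is that simple; a short sketch of the O(N) random downsampling that replaces costlier schemes such as FPS (function name and signature are ours):

```python
import numpy as np

def random_downsample(points, feats, ratio=4):
    """O(N) random downsampling, versus the roughly O(N*k) cost of
    farthest point sampling on large clouds."""
    n = points.shape[0]
    idx = np.random.choice(n, n // ratio, replace=False)
    return points[idx], feats[idx]
```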
arXiv Detail & Related papers (2021-07-06T05:08:34Z)
- An Attention-Fused Network for Semantic Segmentation of Very-High-Resolution Remote Sensing Imagery [26.362854938949923]
We propose a novel convolutional neural network architecture, named the attention-fused network (AFNet).
We achieve state-of-the-art performance with an overall accuracy of 91.7% and a mean F1 score of 90.96% on the ISPRS Vaihingen 2D dataset and the ISPRS Potsdam 2D dataset.
arXiv Detail & Related papers (2021-05-10T06:23:27Z)
- FatNet: A Feature-attentive Network for 3D Point Cloud Processing [1.502579291513768]
We introduce a novel feature-attentive neural network layer, a FAT layer, that combines both global point-based features and local edge-based features in order to generate better embeddings.
Our architecture achieves state-of-the-art results on the task of point cloud classification, as demonstrated on the ModelNet40 dataset.
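A rough sketch of a layer in this spirit, merging a global max-pooled descriptor with local edge features per point; the structure and names are assumptions, and FatNet's attention mechanism is omitted:

```python
import torch
import torch.nn as nn

class FATLayer(nn.Module):
    """Sketch: concatenate each point's feature with an EdgeConv-style
    local edge feature and a cloud-wide global feature."""

    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.mlp = nn.Sequential(nn.Linear(3 * in_dim, out_dim), nn.ReLU())

    def forward(self, feats, neighbor_idx):  # (B, N, C), (B, N, k) long
        B, N, C = feats.shape
        gathered = torch.gather(             # neighbor features: (B, N, k, C)
            feats.unsqueeze(1).expand(B, N, N, C), 2,
            neighbor_idx.unsqueeze(-1).expand(-1, -1, -1, C))
        edge = (gathered - feats.unsqueeze(2)).max(dim=2).values       # local
        glob = feats.max(dim=1, keepdim=True).values.expand(-1, N, -1)  # global
        return self.mlp(torch.cat([feats, edge, glob], dim=-1))
```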
arXiv Detail & Related papers (2021-04-07T23:13:56Z)
- PC-RGNN: Point Cloud Completion and Graph Neural Network for 3D Object Detection [57.49788100647103]
LiDAR-based 3D object detection is an important task for autonomous driving.
Current approaches suffer from sparse and partial point clouds of distant and occluded objects.
In this paper, we propose a novel two-stage approach, PC-RGNN, that addresses these challenges with two targeted solutions.
arXiv Detail & Related papers (2020-12-18T18:06:43Z)
- DiffusionNet: Discretization Agnostic Learning on Surfaces [48.658589779470454]
We introduce a new approach to deep learning on 3D surfaces, based on the insight that a simple diffusion layer is highly effective for spatial communication.
The resulting networks automatically generalize across different samplings and resolutions of a surface.
We focus primarily on triangle mesh surfaces, and demonstrate state-of-the-art results for a variety of tasks including surface classification, segmentation, and non-rigid correspondence.
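The diffusion layer has a compact spectral form: project features onto precomputed Laplacian eigenvectors, attenuate them with a heat kernel using a learned per-channel diffusion time, and project back. The sketch below paraphrases that idea and is not the reference implementation:

```python
import torch
import torch.nn as nn

class SpectralDiffusion(nn.Module):
    """Heat diffusion of per-vertex features in a Laplacian eigenbasis,
    with one learned diffusion time per channel."""

    def __init__(self, channels):
        super().__init__()
        self.log_t = nn.Parameter(torch.zeros(channels))  # learned log-times

    def forward(self, feats, evals, evecs, mass):
        # feats: (V, C) per-vertex features; evals: (K,) eigenvalues;
        # evecs: (V, K) eigenvectors; mass: (V,) lumped mass-matrix diagonal
        spec = evecs.T @ (mass.unsqueeze(-1) * feats)  # project: (K, C)
        t = self.log_t.exp()                           # (C,) diffusion times
        decay = torch.exp(-evals.unsqueeze(-1) * t)    # (K, C) heat kernel
        return evecs @ (decay * spec)                  # back to vertices (V, C)
```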
arXiv Detail & Related papers (2020-12-01T23:24:22Z)
- MNEW: Multi-domain Neighborhood Embedding and Weighting for Sparse Point Clouds Segmentation [1.2380933178502298]
We propose MNEW, which combines multi-domain neighborhood embedding with attention weighting based on geometry distance, feature similarity, and neighborhood sparsity.
MNEW achieves the top performance for sparse point clouds, which is important to the application of LiDAR-based automated driving perception.
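As an illustration of how those three cues might be blended into neighbor weights (the formula below is our guess, not MNEW's):

```python
import torch

def neighborhood_weights(xyz_dist, feat_sim, sparsity, alpha=1.0):
    """Blend geometry distance, feature similarity, and neighborhood
    sparsity into normalized per-neighbor attention weights.

    xyz_dist: (N, k) Euclidean distance to each neighbor
    feat_sim: (N, k) cosine similarity of features
    sparsity: (N, 1) local sparsity estimate, e.g. mean neighbor distance
    """
    w = torch.exp(-alpha * xyz_dist ** 2) * feat_sim.clamp(min=0)
    w = w * sparsity                                   # up-weight sparse regions
    return w / (w.sum(dim=-1, keepdim=True) + 1e-8)    # normalize per point
```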
arXiv Detail & Related papers (2020-04-05T18:02:07Z)
- Real-Time High-Performance Semantic Image Segmentation of Urban Street Scenes [98.65457534223539]
We propose a real-time high-performance DCNN-based method for robust semantic segmentation of urban street scenes.
The proposed method achieves 73.6% and 68.0% mean Intersection over Union (mIoU) at inference speeds of 51.0 fps and 39.3 fps, respectively.
arXiv Detail & Related papers (2020-03-11T08:45:53Z)