Twin Deformable Point Convolutions for Point Cloud Semantic Segmentation in Remote Sensing Scenes
- URL: http://arxiv.org/abs/2405.19735v2
- Date: Sun, 4 Aug 2024 15:38:41 GMT
- Title: Twin Deformable Point Convolutions for Point Cloud Semantic Segmentation in Remote Sensing Scenes
- Authors: Yong-Qiang Mao, Hanbo Bi, Xuexue Li, Kaiqiang Chen, Zhirui Wang, Xian Sun, Kun Fu
- Abstract summary: We propose novel convolution operators, termed Twin Deformable point Convolutions (TDConvs).
These operators aim to achieve adaptive feature learning by learning deformable sampling points in the latitude-longitude plane and in the altitude direction.
Experiments on popular benchmarks show that our TDConvs achieve the best segmentation performance.
- Score: 12.506628755166814
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Thanks to the application of deep learning to point cloud processing in the remote sensing field, point cloud segmentation has become a research hotspot in recent years, with applications in real-world 3D applications, smart cities, and other fields. Although existing solutions have made unprecedented progress, they ignore an inherent characteristic of remote sensing point clouds: the points are strictly arranged according to latitude, longitude, and altitude, a property that greatly facilitates their segmentation. To exploit this property, we propose novel convolution operators, termed Twin Deformable point Convolutions (TDConvs), which aim to achieve adaptive feature learning by learning deformable sampling points in the latitude-longitude plane and in the altitude direction, respectively. First, to model the characteristics of the latitude-longitude plane, we propose a Cylinder-wise Deformable point Convolution (CyDConv) operator, which generates a two-dimensional cylinder map by constructing a cylinder-like grid in the latitude-longitude direction. Furthermore, to better integrate the features of the latitude-longitude plane with the spatial geometric features, we perform a multi-scale fusion of the extracted latitude-longitude features and the spatial geometric features, realized through the aggregation of neighboring point features at different scales. In addition, a Sphere-wise Deformable point Convolution (SpDConv) operator is introduced to adaptively offset the sampling points in three-dimensional space by constructing a sphere grid structure, with the aim of modeling the characteristics of the altitude direction. Experiments on popular benchmarks show that our TDConvs achieve the best segmentation performance, surpassing existing state-of-the-art methods.
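The paper's operators are not reproduced here, but the core mechanism the abstract describes, sampling a 2D latitude-longitude (cylinder) feature map at learned, per-point offset locations, can be sketched in a few lines of PyTorch. Everything below (the class name, layer sizes, number of sampling points, and the 0.1 offset scale) is an illustrative assumption, not the authors' code:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class CylinderDeformableSampling(nn.Module):
    """Illustrative sketch: sample a 2D cylinder (azimuth x height) feature
    map at learned, per-point offset locations, echoing the deformable
    sampling idea behind CyDConv. Not the authors' implementation."""

    def __init__(self, in_dim, map_channels, num_samples=4):
        super().__init__()
        self.num_samples = num_samples
        # Predict a 2D offset for each of the K sampling points.
        self.offset_mlp = nn.Sequential(
            nn.Linear(in_dim, 64), nn.ReLU(),
            nn.Linear(64, num_samples * 2),
        )
        self.out_proj = nn.Linear(num_samples * map_channels, map_channels)

    def forward(self, xyz, point_feats, cylinder_map, z_range):
        # xyz: (N, 3) point coordinates; point_feats: (N, C_in)
        # cylinder_map: (1, C_map, H, W) features on the azimuth/height grid
        N = xyz.shape[0]
        theta = torch.atan2(xyz[:, 1], xyz[:, 0]) / torch.pi      # in [-1, 1]
        h = 2.0 * (xyz[:, 2] - z_range[0]) / (z_range[1] - z_range[0]) - 1.0
        base = torch.stack([theta, h], dim=-1)                    # (N, 2)

        # Learned, bounded offsets around each point's base location.
        offsets = self.offset_mlp(point_feats).view(N, self.num_samples, 2)
        loc = base.unsqueeze(1) + 0.1 * torch.tanh(offsets)       # (N, K, 2)

        grid = loc.view(1, N, self.num_samples, 2)                # grid_sample layout
        sampled = F.grid_sample(cylinder_map, grid, align_corners=True)
        sampled = sampled.permute(0, 2, 3, 1).reshape(N, -1)      # (N, K*C_map)
        return self.out_proj(sampled)
```

Under the same reading of the abstract, SpDConv would follow the analogous pattern with three-dimensional offsets over a spherical grid rather than two-dimensional offsets over the cylinder map.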
Related papers
- Dynamic 3D Point Cloud Sequences as 2D Videos [81.46246338686478]
3D point cloud sequences serve as one of the most common and practical representation modalities of real-world environments.
We propose a novel generic representation called Structured Point Cloud Videos (SPCVs).
SPCVs re-organize a point cloud sequence as a 2D video with spatial smoothness and temporal consistency, where the pixel values correspond to the 3D coordinates of the points.
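As a data-layout illustration of that idea (the learned, structure-preserving re-ordering in the paper is replaced here by a naive angular sort; the function name and fixed grid size are hypothetical):

```python
import numpy as np

def points_to_xyz_image(points, H, W):
    """Naive illustration of the SPCV data layout: pack N = H*W points into
    an H x W 'image' whose 3 channels are the x, y, z coordinates. The paper
    learns a spatially smooth, temporally consistent ordering; here a coarse
    sort by elevation then azimuth is used as a stand-in."""
    assert points.shape == (H * W, 3)
    azim = np.arctan2(points[:, 1], points[:, 0])
    elev = np.arctan2(points[:, 2], np.linalg.norm(points[:, :2], axis=1))
    order = np.lexsort((azim, elev))          # elevation-major ordering
    return points[order].reshape(H, W, 3)     # pixel value = 3D coordinate
```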
arXiv Detail & Related papers (2024-03-02T08:18:57Z)
- PointOcc: Cylindrical Tri-Perspective View for Point-based 3D Semantic Occupancy Prediction [72.75478398447396]
We propose a cylindrical tri-perspective view to represent point clouds effectively and comprehensively.
Considering the distance distribution of LiDAR point clouds, we construct the tri-perspective view in the cylindrical coordinate system.
We employ spatial group pooling to maintain structural details during projection and adopt 2D backbones to efficiently process each TPV plane.
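A rough sketch of such a cylindrical tri-perspective projection follows; the binning scheme, resolutions, and max-pooling choice below are assumptions standing in for the paper's spatial group pooling:

```python
import torch

def cylindrical_tpv_projection(xyz, feats, R, H, Z):
    """Sketch: scatter point features onto the (rho, theta), (rho, z) and
    (theta, z) planes of a cylindrical coordinate system with max-pooling,
    loosely following the tri-perspective-view idea."""
    rho = torch.linalg.norm(xyz[:, :2], dim=1)
    theta = torch.atan2(xyz[:, 1], xyz[:, 0])
    z = xyz[:, 2]

    def to_bins(v, n):                      # normalize, then quantize to n bins
        v = (v - v.min()) / (v.max() - v.min() + 1e-8)
        return (v * (n - 1)).long()

    r_i, t_i, z_i = to_bins(rho, R), to_bins(theta, H), to_bins(z, Z)
    C = feats.shape[1]
    planes = [feats.new_zeros(R, H, C), feats.new_zeros(R, Z, C),
              feats.new_zeros(H, Z, C)]
    for (a, b), plane in zip([(r_i, t_i), (r_i, z_i), (t_i, z_i)], planes):
        # Max-pool the features of all points falling into the same cell.
        idx = (a * plane.shape[1] + b).unsqueeze(1).expand(-1, C)
        flat = plane.view(-1, C)            # shares storage with `plane`
        flat.scatter_reduce_(0, idx, feats, reduce="amax", include_self=True)
    return planes  # one 2D feature map per TPV plane, ready for a 2D backbone
```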
arXiv Detail & Related papers (2023-08-31T17:57:17Z)
- Flattening-Net: Deep Regular 2D Representation for 3D Point Cloud Analysis [66.49788145564004]
We present an unsupervised deep neural architecture called Flattening-Net to represent irregular 3D point clouds of arbitrary geometry and topology.
Our methods perform favorably against the current state-of-the-art competitors.
arXiv Detail & Related papers (2022-12-17T15:05:25Z)
- Dual Adaptive Transformations for Weakly Supervised Point Cloud Segmentation [78.6612285236938]
We propose a novel DAT (Dual Adaptive Transformations) model for weakly supervised point cloud segmentation.
We evaluate our proposed DAT model with two popular backbones on the large-scale S3DIS and ScanNet-V2 datasets.
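The summary does not spell out the transformations themselves; a generic transformation-consistency sketch for weakly supervised segmentation (fixed jitter and scaling as stand-ins for the paper's adaptive transformations; all names are hypothetical) might look like:

```python
import torch
import torch.nn.functional as F

def consistency_loss(model, points, labels, mask):
    """Generic sketch of transformation-consistency training for weakly
    supervised segmentation. `mask` marks the few labeled points; `model`
    maps (N, 3) points to (N, num_classes) logits."""
    def augment(p):
        scale = 1.0 + 0.1 * (torch.rand(1, device=p.device) - 0.5)
        return p * scale + 0.01 * torch.randn_like(p)

    logits_a = model(augment(points))
    logits_b = model(augment(points))
    # Supervised loss on the sparse labels only.
    sup = F.cross_entropy(logits_a[mask], labels[mask])
    # Predictions should agree across the two transformed views.
    cons = F.kl_div(F.log_softmax(logits_a, dim=1),
                    F.softmax(logits_b, dim=1), reduction="batchmean")
    return sup + 0.5 * cons
```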
arXiv Detail & Related papers (2022-07-19T05:43:14Z)
- DSPoint: Dual-scale Point Cloud Recognition with High-frequency Fusion [17.797795508707864]
We propose Dual-Scale Point Cloud Recognition with High-frequency Fusion (DSPoint).
We reverse the conventional design of applying convolution on voxels and attention to points.
Experiments and ablations on widely-adopted ModelNet40, ShapeNet, and S3DIS demonstrate the state-of-the-art performance of our DSPoint.
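Reversing the usual pairing means convolution runs on points while attention runs on voxels; a minimal two-branch sketch of that arrangement (layer sizes and the mean-summary fusion are assumptions, not the paper's design):

```python
import torch
import torch.nn as nn

class DualScaleBlock(nn.Module):
    """Sketch of the 'reversed' DSPoint pairing: a point-wise convolution
    branch on points and a self-attention branch on voxelized features,
    fused at the end."""

    def __init__(self, dim):
        super().__init__()
        self.point_conv = nn.Conv1d(dim, dim, kernel_size=1)   # conv on points
        self.voxel_attn = nn.MultiheadAttention(dim, num_heads=4,
                                                batch_first=True)
        self.fuse = nn.Linear(2 * dim, dim)

    def forward(self, point_feats, voxel_feats):
        # point_feats: (B, N, C); voxel_feats: (B, V, C) from some voxelizer
        p = self.point_conv(point_feats.transpose(1, 2)).transpose(1, 2)
        v, _ = self.voxel_attn(voxel_feats, voxel_feats, voxel_feats)
        # Broadcast a global voxel summary back to every point before fusion.
        g = v.mean(dim=1, keepdim=True).expand_as(p)
        return self.fuse(torch.cat([p, g], dim=-1))
```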
arXiv Detail & Related papers (2021-11-19T17:25:54Z)
- Point Cloud Upsampling via Disentangled Refinement [86.3641957163818]
Point clouds produced by 3D scanning are often sparse, non-uniform, and noisy.
Recent upsampling approaches aim to generate a dense point set, while achieving both distribution uniformity and proximity-to-surface.
We formulate two cascaded sub-networks, a dense generator and a spatial refiner.
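The generator/refiner split can be sketched as a two-stage cascade; the layer sizes, expansion mechanism, and residual scales below are illustrative assumptions:

```python
import torch
import torch.nn as nn

class DisentangledUpsampler(nn.Module):
    """Sketch of the two-stage cascade: a dense generator that expands each
    point r times, then a spatial refiner that nudges the coarse points
    toward the underlying surface."""

    def __init__(self, ratio=4):
        super().__init__()
        self.ratio = ratio
        self.generator = nn.Sequential(nn.Linear(3, 64), nn.ReLU(),
                                       nn.Linear(64, ratio * 3))
        self.refiner = nn.Sequential(nn.Linear(3, 64), nn.ReLU(),
                                     nn.Linear(64, 3))

    def forward(self, sparse):                     # sparse: (N, 3)
        N = sparse.shape[0]
        # Dense generator: r child points per input point (distribution).
        disp = self.generator(sparse).view(N * self.ratio, 3)
        coarse = sparse.repeat_interleave(self.ratio, dim=0) + 0.05 * disp
        # Spatial refiner: small residual correction (proximity-to-surface).
        return coarse + 0.01 * self.refiner(coarse)
```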
arXiv Detail & Related papers (2021-06-09T02:58:42Z)
- Semantic Segmentation for Real Point Cloud Scenes via Bilateral Augmentation and Adaptive Fusion [38.05362492645094]
Real point cloud scenes can intuitively capture complex surroundings in the real world, but, due to the raw nature of 3D data, they are very challenging for machine perception.
We concentrate on the essential visual task, semantic segmentation, for large-scale point cloud data collected in reality.
By comparing with state-of-the-art networks on three different benchmarks, we demonstrate the effectiveness of our network.
arXiv Detail & Related papers (2021-03-12T04:13:20Z)
- 3D Object Detection with Pointformer [29.935891419574602]
We propose Pointformer, a Transformer backbone designed for 3D point clouds to learn features effectively.
A Local Transformer module is employed to model interactions among points in a local region, which learns context-dependent region features at an object level.
A Global Transformer is designed to learn context-aware representations at the scene level.
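The local/global split can be sketched with standard attention primitives; the kNN grouping, head count, and residual wiring below are assumptions, not the Pointformer architecture itself:

```python
import torch
import torch.nn as nn

class LocalGlobalAttention(nn.Module):
    """Sketch of the local/global split: attention inside kNN neighborhoods
    (object-level context), then one attention pass over all points
    (scene-level context). Single-cloud, illustrative sizes."""

    def __init__(self, dim, k=16):
        super().__init__()
        self.k = k
        self.local_attn = nn.MultiheadAttention(dim, 4, batch_first=True)
        self.global_attn = nn.MultiheadAttention(dim, 4, batch_first=True)

    def forward(self, xyz, feats):                 # xyz: (N, 3), feats: (N, C)
        # Local: each point attends within its k nearest neighbors.
        knn = torch.cdist(xyz, xyz).topk(self.k, largest=False).indices
        groups = feats[knn]                        # (N, k, C)
        local, _ = self.local_attn(feats.unsqueeze(1), groups, groups)
        feats = feats + local.squeeze(1)
        # Global: one attention pass over the whole scene.
        g, _ = self.global_attn(feats.unsqueeze(0), feats.unsqueeze(0),
                                feats.unsqueeze(0))
        return feats + g.squeeze(0)
```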
arXiv Detail & Related papers (2020-12-21T15:12:54Z)
- Pseudo-LiDAR Point Cloud Interpolation Based on 3D Motion Representation and Spatial Supervision [68.35777836993212]
We propose a Pseudo-LiDAR point cloud network to generate temporally and spatially high-quality point cloud sequences.
By exploiting the scene flow between point clouds, the proposed network is able to learn a more accurate representation of the 3D spatial motion relationship.
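Once a scene flow between two frames is available, the interpolation step itself reduces to a fractional displacement; a minimal sketch (the learned flow estimator, the hard part of the paper, is assumed given):

```python
import torch

def interpolate_frame(points_t0, flow_t0_t1, alpha=0.5):
    """Displace each point of frame t0 by a fraction of its estimated scene
    flow toward frame t1 to synthesize an intermediate frame.
    points_t0, flow_t0_t1: (N, 3); alpha in [0, 1] picks the time instant."""
    return points_t0 + alpha * flow_t0_t1
```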
arXiv Detail & Related papers (2020-06-20T03:11:04Z)
- Intrinsic Point Cloud Interpolation via Dual Latent Space Navigation [38.61801196027949]
We present a learning-based method for interpolating and manipulating 3D shapes represented as point clouds.
Our approach is based on constructing a dual encoding space that enables shape synthesis and, at the same time, provides links to the intrinsic shape information.
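For orientation, the simplest latent-space interpolation baseline looks like the following; the paper's dual encoding with intrinsic-geometry links is replaced here by plain linear blending of a single latent, and `encoder`/`decoder` are assumed pretrained callables:

```python
import torch

def interpolate_shapes(encoder, decoder, cloud_a, cloud_b, steps=5):
    """Generic latent-space interpolation sketch: encode two point clouds,
    linearly blend their latent codes, and decode each blend back to a
    shape. A stand-in for the paper's dual-latent-space navigation."""
    with torch.no_grad():
        za, zb = encoder(cloud_a), encoder(cloud_b)
        return [decoder((1 - t) * za + t * zb)
                for t in torch.linspace(0, 1, steps)]
```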
arXiv Detail & Related papers (2020-04-03T16:28:55Z)
This list is automatically generated from the titles and abstracts of the papers on this site.