LPRnet: A self-supervised registration network for LiDAR and photogrammetric point clouds
- URL: http://arxiv.org/abs/2501.05669v1
- Date: Fri, 10 Jan 2025 02:36:37 GMT
- Title: LPRnet: A self-supervised registration network for LiDAR and photogrammetric point clouds
- Authors: Chen Wang, Yanfeng Gu, Xian Li
- Abstract summary: LiDAR and photogrammetry are active and passive remote sensing techniques for point cloud acquisition, respectively. Due to the fundamental differences in sensing mechanisms, spatial distributions and coordinate systems, their point clouds exhibit significant discrepancies in density, precision, noise, and overlap. This paper proposes a self-supervised registration network based on a masked autoencoder, focusing on heterogeneous LiDAR and photogrammetric point clouds.
- Score: 38.42527849407057
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: LiDAR and photogrammetry are active and passive remote sensing techniques for point cloud acquisition, respectively, offering complementary advantages but producing heterogeneous data. Due to the fundamental differences in sensing mechanisms, spatial distributions and coordinate systems, their point clouds exhibit significant discrepancies in density, precision, noise, and overlap. Coupled with the lack of ground truth for large-scale scenes, integrating the heterogeneous point clouds is a highly challenging task. This paper proposes a self-supervised registration network based on a masked autoencoder, focusing on heterogeneous LiDAR and photogrammetric point clouds. At its core, the method introduces a multi-scale masked training strategy to extract robust features from heterogeneous point clouds under self-supervision. To further enhance registration performance, a rotation-translation embedding module is designed to effectively capture the key features essential for accurate rigid transformations. Building upon the robust representations, a transformer-based architecture seamlessly integrates local and global features, fostering precise alignment across diverse point cloud datasets. The proposed method demonstrates strong feature extraction capabilities for both LiDAR and photogrammetric point clouds, addressing the challenges of acquiring ground truth at the scene level. Experiments conducted on two real-world datasets validate the effectiveness of the proposed method in solving heterogeneous point cloud registration problems.
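For readers unfamiliar with the term, the "rigid transformation" mentioned in the abstract is a rotation R and a translation t that map one point cloud onto the other. The sketch below is not LPRnet and is not taken from the paper: it is the classical closed-form SVD (Kabsch) solution, which assumes point correspondences are already known. The function name `estimate_rigid_transform` and the synthetic data are hypothetical and chosen only for illustration.

```python
import numpy as np


def estimate_rigid_transform(src, dst):
    """Least-squares R, t with R @ src[i] + t close to dst[i] (Kabsch/SVD).

    Assumes src and dst are (N, 3) arrays whose rows already correspond;
    a learned registration network has to find such correspondences itself.
    """
    src_centroid = src.mean(axis=0)
    dst_centroid = dst.mean(axis=0)
    # 3x3 cross-covariance of the centered point sets
    H = (src - src_centroid).T @ (dst - dst_centroid)
    U, _, Vt = np.linalg.svd(H)
    # Flip the last axis if needed so the result is a rotation, not a reflection
    d = np.sign(np.linalg.det(Vt.T @ U.T))
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = dst_centroid - R @ src_centroid
    return R, t


# Synthetic check: rotate 30 degrees about z, then shift (illustrative values).
rng = np.random.default_rng(0)
source = rng.normal(size=(100, 3))
angle = np.deg2rad(30.0)
R_true = np.array([
    [np.cos(angle), -np.sin(angle), 0.0],
    [np.sin(angle),  np.cos(angle), 0.0],
    [0.0,            0.0,           1.0],
])
t_true = np.array([0.5, -1.0, 2.0])
target = source @ R_true.T + t_true

R_est, t_est = estimate_rigid_transform(source, target)
```

On noise-free data with exact correspondences, `R_est` and `t_est` recover `R_true` and `t_true` up to floating-point error. Heterogeneous LiDAR and photogrammetric clouds violate both assumptions, which is the gap that learned feature extraction is meant to close.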
Related papers
- Range-Edit: Semantic Mask Guided Outdoor LiDAR Scene Editing [8.309247096255529]
Training autonomous driving and navigation systems requires large and diverse point cloud datasets. Current methods rely on simulating point cloud data within handcrafted 3D virtual environments. This research proposes a novel approach that addresses the problem discussed by editing real-world LiDAR scans.
arXiv Detail & Related papers (2025-11-21T14:16:27Z)
- Semantic Segmentation Algorithm Based on Light Field and LiDAR Fusion [23.0804908886806]
We propose the first multimodal semantic segmentation dataset integrating light field data and point cloud data. Our method outperforms image-only segmentation by 1.71 Mean Intersection over Union (mIoU) and point cloud-only segmentation by 2.38 mIoU, demonstrating its effectiveness.
arXiv Detail & Related papers (2025-10-08T06:15:06Z)
- Cross3DReg: Towards a Large-scale Real-world Cross-source Point Cloud Registration Benchmark [57.42211080221526]
Cross-source point cloud registration, which aims to align point cloud data from different sensors, is a fundamental task in 3D vision. The lack of publicly available large-scale real-world datasets for training the deep registration models, and the inherent differences in point clouds captured by multiple sensors pose challenges. We construct Cross3DReg, currently the largest real-world multi-modal cross-source point cloud registration dataset. A visual-geometric attention guided matching module is proposed to enhance the consistency of cross-source point cloud features.
arXiv Detail & Related papers (2025-09-08T09:01:13Z)
- Adaptive Point-Prompt Tuning: Fine-Tuning Heterogeneous Foundation Models for 3D Point Cloud Analysis [51.37795317716487]
We propose the Adaptive Point-Prompt Tuning (APPT) method, which fine-tunes pre-trained models with a modest number of parameters. We convert raw point clouds into point embeddings by aggregating local geometry to capture spatial features followed by linear layers. To calibrate self-attention across source domains of any modality to 3D, we introduce a prompt generator that shares weights with the point embedding module.
arXiv Detail & Related papers (2025-08-30T06:02:21Z)
- Single-Frame Point-Pixel Registration via Supervised Cross-Modal Feature Matching [7.5461100059974315]
We introduce a detector-free framework for direct point-pixel matching between LiDAR and camera views. Specifically, we project the LiDAR intensity map into a 2D view from the LiDAR perspective and feed it into an attention-based matching network. To further enhance matching reliability, we introduce a repeatability scoring mechanism that acts as a soft visibility prior.
arXiv Detail & Related papers (2025-06-28T06:57:13Z)
- PAPI-Reg: Patch-to-Pixel Solution for Efficient Cross-Modal Registration between LiDAR Point Cloud and Camera Image [10.906218491083576]
Cross-modal data fusion involves the precise alignment of data from different sensors.
We propose a framework that projects point clouds into several 2D representations for matching with camera images.
To tackle the challenges of cross modal differences and the limited overlap between LiDAR point clouds and images in the image matching task, we introduce a multi-scale feature extraction network.
arXiv Detail & Related papers (2025-03-19T15:04:01Z)
- Towards Fusing Point Cloud and Visual Representations for Imitation Learning [57.886331184389604]
We propose FPV-Net, a novel imitation learning method that effectively combines the strengths of both point cloud and RGB modalities.
Our method conditions the point-cloud encoder on global and local image tokens using adaptive layer norm conditioning.
arXiv Detail & Related papers (2025-02-17T20:46:54Z)
- Mitigating Prior Shape Bias in Point Clouds via Differentiable Center Learning [19.986150101882217]
We introduce a novel solution called the Differentiable Center Sampling Network (DCS-Net).
It tackles the information leakage problem by incorporating both global feature reconstruction and local feature reconstruction as non-trivial proxy tasks.
Experimental results demonstrate that our method enhances the expressive capacity of existing point cloud models.
arXiv Detail & Related papers (2024-02-03T08:58:23Z)
- Distribution-aware Interactive Attention Network and Large-scale Cloud Recognition Benchmark on FY-4A Satellite Image [24.09239785062109]
We develop a novel dataset for accurate cloud recognition.
We use domain adaptation methods to align 70,419 image-label pairs in terms of projection, temporal resolution, and spatial resolution.
We also introduce a Distribution-aware Interactive-Attention Network (DIAnet), which preserves pixel-level details through a high-resolution branch and a parallel cross-branch.
arXiv Detail & Related papers (2024-01-06T09:58:09Z)
- CLiSA: A Hierarchical Hybrid Transformer Model using Orthogonal Cross Attention for Satellite Image Cloud Segmentation [5.178465447325005]
Deep learning algorithms have emerged as a promising approach to solve image segmentation problems.
In this paper, we introduce a deep-learning model for effective cloud mask generation named CLiSA - Cloud segmentation via Lipschitz Stable Attention network.
We demonstrate both qualitative and quantitative outcomes for multiple satellite image datasets including Landsat-8, Sentinel-2, and Cartosat-2s.
arXiv Detail & Related papers (2023-11-29T09:31:31Z)
- Controllable Mesh Generation Through Sparse Latent Point Diffusion Models [105.83595545314334]
We design a novel sparse latent point diffusion model for mesh generation.
Our key insight is to regard point clouds as an intermediate representation of meshes, and model the distribution of point clouds instead.
Our proposed sparse latent point diffusion model achieves superior performance in terms of generation quality and controllability.
arXiv Detail & Related papers (2023-03-14T14:25:29Z)
- Data-driven Cloud Clustering via a Rotationally Invariant Autoencoder [10.660968055962325]
We describe an automated rotation-invariant cloud clustering (RICC) method.
It organizes cloud imagery within large datasets in an unsupervised fashion.
Results suggest that the resultant cloud clusters capture meaningful aspects of cloud physics.
arXiv Detail & Related papers (2021-03-08T16:45:14Z)
- SPU-Net: Self-Supervised Point Cloud Upsampling by Coarse-to-Fine Reconstruction with Self-Projection Optimization [52.20602782690776]
It is expensive and tedious to obtain large-scale paired sparse-dense point sets for training from real scanned sparse data.
We propose a self-supervised point cloud upsampling network, named SPU-Net, to capture the inherent upsampling patterns of points lying on the underlying object surface.
We conduct various experiments on both synthetic and real-scanned datasets, and the results demonstrate that we achieve comparable performance to the state-of-the-art supervised methods.
arXiv Detail & Related papers (2020-12-08T14:14:09Z)
- ePointDA: An End-to-End Simulation-to-Real Domain Adaptation Framework for LiDAR Point Cloud Segmentation [111.56730703473411]
Training deep neural networks (DNNs) on LiDAR data requires large-scale point-wise annotations.
Simulation-to-real domain adaptation (SRDA) trains a DNN using unlimited synthetic data with automatically generated labels.
ePointDA consists of three modules: self-supervised dropout noise rendering, statistics-invariant and spatially-adaptive feature alignment, and transferable segmentation learning.
arXiv Detail & Related papers (2020-09-07T23:46:08Z)
- Pseudo-LiDAR Point Cloud Interpolation Based on 3D Motion Representation and Spatial Supervision [68.35777836993212]
We propose a Pseudo-LiDAR point cloud network to generate temporally and spatially high-quality point cloud sequences.
By exploiting the scene flow between point clouds, the proposed network is able to learn a more accurate representation of the 3D spatial motion relationship.
arXiv Detail & Related papers (2020-06-20T03:11:04Z)
This list is automatically generated from the titles and abstracts of the papers in this site.