City-scale Incremental Neural Mapping with Three-layer Sampling and
Panoptic Representation
- URL: http://arxiv.org/abs/2209.14072v2
- Date: Wed, 12 Apr 2023 12:06:09 GMT
- Title: City-scale Incremental Neural Mapping with Three-layer Sampling and
Panoptic Representation
- Authors: Yongliang Shi, Runyi Yang, Pengfei Li, Zirui Wu, Hao Zhao, Guyue Zhou
- Abstract summary: We build a city-scale continual neural mapping system with a panoptic representation that consists of environment-level and instance-level modelling.
Given a stream of sparse LiDAR point clouds, it maintains a dynamic generative model that maps 3D coordinates to signed distance field (SDF) values.
To realize high-fidelity mapping of instances under incomplete observations, a category-specific prior is introduced to better model the geometric details.
- Score: 5.682979644056021
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Neural implicit representations have recently drawn considerable
attention from the robotics community, as they are expressive, continuous and
compact. However, city-scale continual implicit dense mapping based on sparse
LiDAR input is still an under-explored challenge. To this end, we build a
city-scale continual neural mapping system with a panoptic representation that
consists of environment-level and instance-level modelling. Given a stream of
sparse LiDAR point clouds, it maintains a dynamic generative model that maps
3D coordinates to signed distance field (SDF) values. To address the difficulty
of representing geometric information at different levels in city-scale space,
we propose a tailored three-layer sampling strategy that dynamically samples
the global, local and near-surface domains. Meanwhile, to realize
high-fidelity mapping of instances under incomplete observations, a
category-specific prior is introduced to better model the geometric details.
We evaluate on the public SemanticKITTI dataset and demonstrate the
significance of the newly proposed three-layer sampling strategy and panoptic
representation, using both quantitative and qualitative results. Code and
models will be made publicly available.
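Since the three-layer sampling strategy is the core mechanism of the paper, a brief illustration may help. Below is a minimal sketch of how global, local and near-surface samples with approximate SDF targets could be drawn from a single LiDAR scan. The function name, layer proportions, and the projective-distance targets are illustrative assumptions for this sketch, not the authors' released implementation.

```python
import numpy as np
from scipy.spatial import cKDTree

rng = np.random.default_rng(0)

def three_layer_samples(hits, origin, bounds,
                        n_global=2048, n_local=4096, n_surf=8192,
                        sigma=0.05):
    """Illustrative three-layer sampling for SDF supervision.

    hits   : (N, 3) LiDAR endpoints in the world frame
    origin : (3,)   sensor position for this scan
    bounds : (2, 3) axis-aligned min/max corners of the mapped volume
    Returns (samples, sdf_targets), stacked over the three layers.
    """
    rays = hits - origin
    depth = np.linalg.norm(rays, axis=-1, keepdims=True)
    dirs = rays / depth                              # unit ray directions

    # 1) Global layer: coarse uniform coverage of the whole volume; the
    #    unsigned distance to the nearest hit serves as a loose SDF target.
    g = rng.uniform(bounds[0], bounds[1], size=(n_global, 3))
    g_sdf = cKDTree(hits).query(g)[0]

    # 2) Local layer: free-space samples strictly between sensor and hit;
    #    the projective SDF is the remaining distance to the endpoint.
    i = rng.integers(0, len(hits), size=n_local)
    t = rng.uniform(0.1, 0.9, size=(n_local, 1))     # fraction along ray
    l = origin + t * rays[i]
    l_sdf = ((1.0 - t) * depth[i]).squeeze(-1)       # positive: free space

    # 3) Near-surface layer: jitter along the ray around each hit; the
    #    signed offset is the SDF target (negative behind the surface).
    j = rng.integers(0, len(hits), size=n_surf)
    eps = rng.normal(0.0, sigma, size=(n_surf, 1))
    s = hits[j] - eps * dirs[j]
    s_sdf = eps.squeeze(-1)

    samples = np.concatenate([g, l, s], axis=0)
    targets = np.concatenate([g_sdf, l_sdf, s_sdf], axis=0)
    return samples, targets

if __name__ == "__main__":
    hits = rng.normal(0.0, 10.0, size=(50_000, 3))   # fake scan endpoints
    xyz, sdf = three_layer_samples(hits, np.zeros(3),
                                   np.array([[-50.0] * 3, [50.0] * 3]))
    print(xyz.shape, sdf.shape)                      # (14336, 3) (14336,)
```

In a full system, these stacked samples would supervise the coordinate-to-SDF network at each incoming scan, with the layer sizes trading off global field consistency against near-surface detail.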
Related papers
- Implicit Gaussian Splatting with Efficient Multi-Level Tri-Plane Representation [45.582869951581785]
Implicit Gaussian Splatting (IGS) is an innovative hybrid model that integrates explicit point clouds with implicit feature embeddings.
We introduce a level-based progressive training scheme, which incorporates explicit spatial regularization.
Our algorithm can deliver high-quality rendering using only a few MBs, effectively balancing storage efficiency and rendering fidelity.
arXiv Detail & Related papers (2024-08-19T14:34:17Z)
- Boosting Cross-Domain Point Classification via Distilling Relational Priors from 2D Transformers [59.0181939916084]
Traditional 3D networks mainly focus on local geometric details and ignore the topological structure between local geometries.
We propose a novel Relational Priors Distillation (RPD) method to extract relational priors from transformers well-trained on massive images.
Experiments on the PointDA-10 and the Sim-to-Real datasets verify that the proposed method consistently achieves the state-of-the-art performance of UDA for point cloud classification.
arXiv Detail & Related papers (2024-07-26T06:29:09Z)
- Enhancing Generalizability of Representation Learning for Data-Efficient 3D Scene Understanding [50.448520056844885]
We propose a generative Bayesian network to produce diverse synthetic scenes with real-world patterns.
A series of experiments demonstrates our method's consistent superiority over existing state-of-the-art pre-training approaches.
arXiv Detail & Related papers (2024-06-17T07:43:53Z)
- Outdoor Scene Extrapolation with Hierarchical Generative Cellular Automata [70.9375320609781]
We aim to generate fine-grained 3D geometry from large-scale sparse LiDAR scans, abundantly captured by autonomous vehicles (AVs).
We propose hierarchical Generative Cellular Automata (hGCA), a spatially scalable 3D generative model that grows geometry with local kernels in a coarse-to-fine manner, equipped with a lightweight planner to induce global consistency.
arXiv Detail & Related papers (2024-06-12T14:56:56Z)
- Geometrically-driven Aggregation for Zero-shot 3D Point Cloud Understanding [11.416392706435415]
Zero-shot 3D point cloud understanding can be achieved via 2D Vision-Language Models (VLMs).
Existing strategies directly map VLM features from 2D pixels of rendered or captured views to 3D points, overlooking the inherent and expressive geometric structure of point clouds.
We introduce the first training-free aggregation technique that leverages the point cloud's 3D geometric structure to improve the quality of the transferred Vision-Language Models.
arXiv Detail & Related papers (2023-12-04T12:30:07Z)
- LISNeRF Mapping: LiDAR-based Implicit Mapping via Semantic Neural Fields for Large-Scale 3D Scenes [2.822816116516042]
Large-scale semantic mapping is crucial for outdoor autonomous agents to fulfill high-level tasks such as planning and navigation.
This paper proposes a novel method for large-scale 3D semantic reconstruction through implicit representations from posed LiDAR measurements alone.
arXiv Detail & Related papers (2023-11-04T03:55:38Z)
- Flattening-Net: Deep Regular 2D Representation for 3D Point Cloud Analysis [66.49788145564004]
We present an unsupervised deep neural architecture called Flattening-Net to represent irregular 3D point clouds of arbitrary geometry and topology.
Our method performs favorably against current state-of-the-art competitors.
arXiv Detail & Related papers (2022-12-17T15:05:25Z)
- Deep Implicit Surface Point Prediction Networks [49.286550880464866]
Deep neural representations of 3D shapes as implicit functions have been shown to produce high fidelity models.
This paper presents a novel approach that models such surfaces using a new class of implicit representations called the closest surface-point (CSP) representation.
arXiv Detail & Related papers (2021-06-10T14:31:54Z)
- Self-supervised Depth Estimation Leveraging Global Perception and Geometric Smoothness Using On-board Videos [0.5276232626689566]
We present DLNet for pixel-wise depth estimation, which simultaneously extracts global and local features.
A three-dimensional geometry smoothness loss is proposed to predict a geometrically natural depth map.
In experiments on the KITTI and Make3D benchmarks, the proposed DLNet achieves performance competitive with state-of-the-art methods.
arXiv Detail & Related papers (2021-06-07T10:53:27Z)
- S3Net: 3D LiDAR Sparse Semantic Segmentation Network [1.330528227599978]
S3Net is a novel convolutional neural network for LiDAR point cloud semantic segmentation.
It adopts an encoder-decoder backbone that consists of a Sparse Intra-channel Attention Module (SIntraAM) and a Sparse Inter-channel Attention Module (SInterAM).
arXiv Detail & Related papers (2021-03-15T22:15:24Z)
- Semantic Segmentation on Swiss3DCities: A Benchmark Study on Aerial Photogrammetric 3D Pointcloud Dataset [67.44497676652173]
We introduce a new outdoor urban 3D pointcloud dataset, covering a total area of 2.7 km², sampled from three Swiss cities.
The dataset is manually annotated for semantic segmentation with per-point labels, and is built using photogrammetry from images acquired by multirotors equipped with high-resolution cameras.
arXiv Detail & Related papers (2020-12-23T21:48:47Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.