Volumetric Semantically Consistent 3D Panoptic Mapping
- URL: http://arxiv.org/abs/2309.14737v3
- Date: Mon, 8 Jul 2024 08:29:08 GMT
- Title: Volumetric Semantically Consistent 3D Panoptic Mapping
- Authors: Yang Miao, Iro Armeni, Marc Pollefeys, Daniel Barath,
- Abstract summary: We introduce an online 2D-to-3D semantic instance mapping algorithm aimed at generating semantic 3D maps suitable for autonomous agents in unstructured environments.
It introduces novel ways of integrating semantic prediction confidence during mapping, producing semantic and instance-consistent 3D regions.
The proposed method achieves accuracy superior to the state of the art on public large-scale datasets, improving on a number of widely used metrics.
- Score: 77.13446499924977
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We introduce an online 2D-to-3D semantic instance mapping algorithm aimed at generating comprehensive, accurate, and efficient semantic 3D maps suitable for autonomous agents in unstructured environments. The proposed approach is based on a Voxel-TSDF representation used in recent algorithms. It introduces novel ways of integrating semantic prediction confidence during mapping, producing semantic and instance-consistent 3D regions. Further improvements are achieved by graph optimization-based semantic labeling and instance refinement. The proposed method achieves accuracy superior to the state of the art on public large-scale datasets, improving on a number of widely used metrics. We also highlight a downfall in the evaluation of recent studies: using the ground truth trajectory as input instead of a SLAM-estimated one substantially affects the accuracy, creating a large gap between the reported results and the actual performance on real-world data.
Related papers
- MVGaussian: High-Fidelity text-to-3D Content Generation with Multi-View Guidance and Surface Densification [13.872254142378772]
This paper introduces a unified framework for text-to-3D content generation.
Our approach utilizes multi-view guidance to iteratively form the structure of the 3D model.
We also introduce a novel densification algorithm that aligns gaussians close to the surface.
arXiv Detail & Related papers (2024-09-10T16:16:34Z) - CLIP-GS: CLIP-Informed Gaussian Splatting for Real-time and View-consistent 3D Semantic Understanding [32.76277160013881]
We present CLIP-GS, which integrates semantics from Contrastive Language-Image Pre-Training (CLIP) into Gaussian Splatting.
SAC exploits the inherent unified semantics within objects to learn compact yet effective semantic representations of 3D Gaussians.
We also introduce a 3D Coherent Self-training (3DCS) strategy, resorting to the multi-view consistency originated from the 3D model.
arXiv Detail & Related papers (2024-04-22T15:01:32Z) - RadOcc: Learning Cross-Modality Occupancy Knowledge through Rendering
Assisted Distillation [50.35403070279804]
3D occupancy prediction is an emerging task that aims to estimate the occupancy states and semantics of 3D scenes using multi-view images.
We propose RadOcc, a Rendering assisted distillation paradigm for 3D Occupancy prediction.
arXiv Detail & Related papers (2023-12-19T03:39:56Z) - ALSTER: A Local Spatio-Temporal Expert for Online 3D Semantic
Reconstruction [62.599588577671796]
We propose an online 3D semantic segmentation method that incrementally reconstructs a 3D semantic map from a stream of RGB-D frames.
Unlike offline methods, ours is directly applicable to scenarios with real-time constraints, such as robotics or mixed reality.
arXiv Detail & Related papers (2023-11-29T20:30:18Z) - Neural Semantic Surface Maps [52.61017226479506]
We present an automated technique for computing a map between two genus-zero shapes, which matches semantically corresponding regions to one another.
Our approach can generate semantic surface-to-surface maps, eliminating manual annotations or any 3D training data requirement.
arXiv Detail & Related papers (2023-09-09T16:21:56Z) - Convolutional Bayesian Kernel Inference for 3D Semantic Mapping [1.7615233156139762]
We introduce a Convolutional Bayesian Kernel Inference layer which learns to perform explicit Bayesian inference.
We learn semantic-geometric probability distributions for LiDAR sensor information and incorporate semantic predictions into a global map.
We evaluate our network against state-of-the-art semantic mapping algorithms on the KITTI data set, demonstrating improved latency with comparable semantic label inference results.
arXiv Detail & Related papers (2022-09-21T21:15:12Z) - 3D-PL: Domain Adaptive Depth Estimation with 3D-aware Pseudo-Labeling [37.315964084413174]
We develop a domain adaptation framework via generating reliable pseudo ground truths of depth from real data to provide direct supervisions.
Specifically, we propose two mechanisms for pseudo-labeling: 1) 2D-based pseudo-labels via measuring the consistency of depth predictions when images are with the same content but different styles; 2) 3D-aware pseudo-labels via a point cloud completion network that learns to complete the depth values in the 3D space.
arXiv Detail & Related papers (2022-09-19T17:54:17Z) - From 2D to 3D: Re-thinking Benchmarking of Monocular Depth Prediction [80.67873933010783]
We argue that MDP is currently witnessing benchmark over-fitting and relying on metrics that are only partially helpful to gauge the usefulness of the predictions for 3D applications.
This limits the design and development of novel methods that are truly aware of - and improving towards estimating - the 3D structure of the scene rather than optimizing 2D-based distances.
We propose a set of metrics well suited to evaluate the 3D geometry of MDP approaches and a novel indoor benchmark, RIO-D3D, crucial for the proposed evaluation methodology.
arXiv Detail & Related papers (2022-03-15T17:50:54Z) - Dynamic Semantic Occupancy Mapping using 3D Scene Flow and Closed-Form
Bayesian Inference [3.0389083199673337]
We leverage state-of-the-art semantic segmentation and 3D flow estimation using deep learning to provide measurements for map inference.
We develop a continuous (i.e., can be queried at arbitrary resolution) Bayesian model that propagates the scene with flow and infers a 3D semantic occupancy map with better performance than its static counterpart.
arXiv Detail & Related papers (2021-08-06T15:51:40Z) - SCFusion: Real-time Incremental Scene Reconstruction with Semantic
Completion [86.77318031029404]
We propose a framework that performs scene reconstruction and semantic scene completion jointly in an incremental and real-time manner.
Our framework relies on a novel neural architecture designed to process occupancy maps and leverages voxel states to accurately and efficiently fuse semantic completion with the 3D global model.
arXiv Detail & Related papers (2020-10-26T15:31:52Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.