Panoptic NeRF: 3D-to-2D Label Transfer for Panoptic Urban Scene
Segmentation
- URL: http://arxiv.org/abs/2203.15224v1
- Date: Tue, 29 Mar 2022 04:16:40 GMT
- Title: Panoptic NeRF: 3D-to-2D Label Transfer for Panoptic Urban Scene
Segmentation
- Authors: Xiao Fu, Shangzhan Zhang, Tianrun Chen, Yichong Lu, Lanyun Zhu,
Xiaowei Zhou, Andreas Geiger, Yiyi Liao
- Abstract summary: We present a novel 3D-to-2D label transfer method, Panoptic NeRF, which aims to obtain per-pixel 2D semantic and instance labels.
By inferring in 3D space and rendering to 2D labels, our 2D semantic and instance labels are multi-view consistent by design.
- Score: 48.677336052620895
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Large-scale training data with high-quality annotations is critical for
training semantic and instance segmentation models. Unfortunately, pixel-wise
annotation is labor-intensive and costly, raising the demand for more efficient
labeling strategies. In this work, we present a novel 3D-to-2D label transfer
method, Panoptic NeRF, which aims to obtain per-pixel 2D semantic and
instance labels from easy-to-obtain coarse 3D bounding primitives. Our method
utilizes NeRF as a differentiable tool to unify coarse 3D annotations and 2D
semantic cues transferred from existing datasets. We demonstrate that this
combination allows for improved geometry guided by semantic information,
enabling rendering of accurate semantic maps across multiple views.
Furthermore, this fusion process resolves label ambiguity of the coarse 3D
annotations and filters noise in the 2D predictions. By inferring in 3D space
and rendering to 2D labels, our 2D semantic and instance labels are multi-view
consistent by design. Experimental results show that Panoptic NeRF outperforms
existing semantic and instance label transfer methods in terms of accuracy and
multi-view consistency on challenging urban scenes of the KITTI-360 dataset.
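
The abstract describes rendering semantic maps from a 3D representation but gives no equations. As a rough illustration only, the sketch below composites per-sample class probabilities along a single camera ray using the standard NeRF volume-rendering quadrature. It is not the authors' implementation: the function name `render_semantics`, its parameters, and the toy ray are all assumptions made for this example.

```python
import numpy as np


def render_semantics(sigma, deltas, class_probs):
    """Composite per-sample class probabilities along one ray (hypothetical helper).

    sigma:       (N,) non-negative volume densities at the N ray samples
    deltas:      (N,) segment lengths between adjacent samples
    class_probs: (N, C) per-sample distributions over C semantic classes
    """
    # Opacity contributed by each ray segment.
    alpha = 1.0 - np.exp(-sigma * deltas)
    # Transmittance: fraction of light that reaches sample i unoccluded.
    trans = np.concatenate([[1.0], np.cumprod(1.0 - alpha)[:-1]])
    # Standard volume-rendering weights; they sum to at most 1.
    weights = trans * alpha
    # Weighted sum of the 3D class distributions gives the 2D pixel distribution.
    pixel_probs = weights @ class_probs
    return pixel_probs, weights


# Toy ray (assumed data): two empty samples, then a dense surface of class 1.
sigma = np.array([0.0, 0.0, 50.0, 50.0])
deltas = np.full(4, 0.5)
class_probs = np.array([[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]])

pixel_probs, weights = render_semantics(sigma, deltas, class_probs)
label = int(np.argmax(pixel_probs))
print(label, pixel_probs)
```

Because every view renders its labels from the same 3D densities and class distributions, the resulting 2D labels agree across views, which is the sense in which the abstract calls them multi-view consistent by design.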
Related papers
- LeAP: Consistent multi-domain 3D labeling using Foundation Models [0.7919810878571297]
This work introduces Label Any Pointcloud (LeAP), leveraging 2D VFMs to automatically label 3D data with any set of classes in any kind of application.
We show that our method can generate high-quality 3D semantic labels across diverse fields without any manual labeling.
arXiv Detail & Related papers (2025-02-06T09:24:47Z)
- Bootstraping Clustering of Gaussians for View-consistent 3D Scene Understanding [59.51535163599723]
FreeGS is an unsupervised semantic-embedded 3DGS framework that achieves view-consistent 3D scene understanding without the need for 2D labels.
We show that FreeGS performs comparably to state-of-the-art methods while avoiding the complex data preprocessing workload.
arXiv Detail & Related papers (2024-11-29T08:52:32Z)
- Weakly Supervised 3D Object Detection via Multi-Level Visual Guidance [72.6809373191638]
We propose a framework to study how to leverage constraints between 2D and 3D domains without requiring any 3D labels.
Specifically, we design a feature-level constraint to align LiDAR and image features based on object-aware regions.
Second, the output-level constraint is developed to enforce the overlap between 2D and projected 3D box estimations.
Third, the training-level constraint is utilized by producing accurate and consistent 3D pseudo-labels that align with the visual data.
arXiv Detail & Related papers (2023-12-12T18:57:25Z)
- DatasetNeRF: Efficient 3D-aware Data Factory with Generative Radiance Fields [68.94868475824575]
This paper introduces a novel approach capable of generating infinite, high-quality 3D-consistent 2D annotations alongside 3D point cloud segmentations.
We leverage the strong semantic prior within a 3D generative model to train a semantic decoder.
Once trained, the decoder efficiently generalizes across the latent space, enabling the generation of infinite data.
arXiv Detail & Related papers (2023-11-18T21:58:28Z)
- PanopticNeRF-360: Panoramic 3D-to-2D Label Transfer in Urban Scenes [54.49897326605168]
We present PanopticNeRF-360, a novel approach that combines coarse 3D annotations with noisy 2D semantic cues to generate high-quality panoptic labels.
Our experiments demonstrate PanopticNeRF-360's state-of-the-art performance over label transfer methods on the KITTI-360 dataset.
arXiv Detail & Related papers (2023-09-19T17:54:22Z)
- SSR-2D: Semantic 3D Scene Reconstruction from 2D Images [54.46126685716471]
In this work, we explore a central 3D scene modeling task, namely, semantic scene reconstruction without using any 3D annotations.
The key idea of our approach is to design a trainable model that employs both incomplete 3D reconstructions and their corresponding source RGB-D images.
Our method achieves state-of-the-art semantic scene completion performance on two large-scale benchmark datasets, MatterPort3D and ScanNet.
arXiv Detail & Related papers (2023-02-07T17:47:52Z)
- Learning 3D Semantic Segmentation with only 2D Image Supervision [18.785840615548473]
We train a 3D model from pseudo-labels derived from 2D semantic image segmentations using multiview fusion.
The proposed network architecture, 2D3DNet, achieves significantly better performance than baselines during experiments on a new urban dataset with lidar and images captured in 20 cities across 5 continents.
arXiv Detail & Related papers (2021-10-21T17:56:28Z)
- 3D Guided Weakly Supervised Semantic Segmentation [27.269847900950943]
We propose a weakly supervised 2D semantic segmentation model by incorporating sparse bounding box labels with available 3D information.
We manually labeled a subset of the 2D-3D Semantics (2D-3D-S) dataset with bounding boxes and introduce our 2D-3D inference module to generate accurate pixel-wise segment proposal masks.
arXiv Detail & Related papers (2020-12-01T03:34:15Z)
This list is automatically generated from the titles and abstracts of the papers on this site.