Aerial Lifting: Neural Urban Semantic and Building Instance Lifting from Aerial Imagery
- URL: http://arxiv.org/abs/2403.11812v1
- Date: Mon, 18 Mar 2024 14:15:39 GMT
- Title: Aerial Lifting: Neural Urban Semantic and Building Instance Lifting from Aerial Imagery
- Authors: Yuqi Zhang, Guanying Chen, Jiaxing Chen, Shuguang Cui,
- Abstract summary: We present a neural radiance field method for urban-scale semantic and building-level instance segmentation from aerial images.
objects in urban aerial images exhibit substantial variations in size, including buildings, cars, and roads.
We introduce a scale-adaptive semantic label fusion strategy that enhances the segmentation of objects of varying sizes.
We then introduce a novel cross-view instance label grouping strategy to mitigate the multi-view inconsistency problem in the 2D instance labels.
- Score: 51.73680703579997
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We present a neural radiance field method for urban-scale semantic and building-level instance segmentation from aerial images by lifting noisy 2D labels to 3D. This is a challenging problem due to two primary reasons. Firstly, objects in urban aerial images exhibit substantial variations in size, including buildings, cars, and roads, which pose a significant challenge for accurate 2D segmentation. Secondly, the 2D labels generated by existing segmentation methods suffer from the multi-view inconsistency problem, especially in the case of aerial images, where each image captures only a small portion of the entire scene. To overcome these limitations, we first introduce a scale-adaptive semantic label fusion strategy that enhances the segmentation of objects of varying sizes by combining labels predicted from different altitudes, harnessing the novel-view synthesis capabilities of NeRF. We then introduce a novel cross-view instance label grouping strategy based on the 3D scene representation to mitigate the multi-view inconsistency problem in the 2D instance labels. Furthermore, we exploit multi-view reconstructed depth priors to improve the geometric quality of the reconstructed radiance field, resulting in enhanced segmentation results. Experiments on multiple real-world urban-scale datasets demonstrate that our approach outperforms existing methods, highlighting its effectiveness.
Related papers
- Cross-modal and Cross-domain Knowledge Transfer for Label-free 3D
Segmentation [23.110443633049382]
We propose a novel approach for the challenging cross-modal and cross-domain adaptation task by fully exploring the relationship between images and point clouds.
Our method achieves state-of-the-art performance for 3D point cloud semantic segmentation on Semantic KITTI by using the knowledge of KITTI360 and GTA5.
arXiv Detail & Related papers (2023-09-19T14:29:57Z) - Vox-E: Text-guided Voxel Editing of 3D Objects [14.88446525549421]
Large scale text-guided diffusion models have garnered significant attention due to their ability to synthesize diverse images.
We present a technique that harnesses the power of latent diffusion models for editing existing 3D objects.
arXiv Detail & Related papers (2023-03-21T17:36:36Z) - Unsupervised Multi-View Object Segmentation Using Radiance Field
Propagation [55.9577535403381]
We present a novel approach to segmenting objects in 3D during reconstruction given only unlabeled multi-view images of a scene.
The core of our method is a novel propagation strategy for individual objects' radiance fields with a bidirectional photometric loss.
To the best of our knowledge, RFP is the first unsupervised approach for tackling 3D scene object segmentation for neural radiance field (NeRF)
arXiv Detail & Related papers (2022-10-02T11:14:23Z) - Single-view 3D Mesh Reconstruction for Seen and Unseen Categories [69.29406107513621]
Single-view 3D Mesh Reconstruction is a fundamental computer vision task that aims at recovering 3D shapes from single-view RGB images.
This paper tackles Single-view 3D Mesh Reconstruction, to study the model generalization on unseen categories.
We propose an end-to-end two-stage network, GenMesh, to break the category boundaries in reconstruction.
arXiv Detail & Related papers (2022-08-04T14:13:35Z) - Towards Model Generalization for Monocular 3D Object Detection [57.25828870799331]
We present an effective unified camera-generalized paradigm (CGP) for Mono3D object detection.
We also propose the 2D-3D geometry-consistent object scaling strategy (GCOS) to bridge the gap via an instance-level augment.
Our method called DGMono3D achieves remarkable performance on all evaluated datasets and surpasses the SoTA unsupervised domain adaptation scheme.
arXiv Detail & Related papers (2022-05-23T23:05:07Z) - Panoptic NeRF: 3D-to-2D Label Transfer for Panoptic Urban Scene
Segmentation [48.677336052620895]
We present a novel 3D-to-2D label transfer method, Panoptic NeRF, which aims for obtaining per-pixel 2D semantic and instance labels.
By inferring in 3D space and rendering to 2D labels, our 2D semantic and instance labels are multi-view consistent by design.
arXiv Detail & Related papers (2022-03-29T04:16:40Z) - AF$_2$: Adaptive Focus Framework for Aerial Imagery Segmentation [86.44683367028914]
Aerial imagery segmentation has some unique challenges, the most critical one among which lies in foreground-background imbalance.
We propose Adaptive Focus Framework (AF$), which adopts a hierarchical segmentation procedure and focuses on adaptively utilizing multi-scale representations.
AF$ has significantly improved the accuracy on three widely used aerial benchmarks, as fast as the mainstream method.
arXiv Detail & Related papers (2022-02-18T10:14:45Z) - Learning 3D Semantic Segmentation with only 2D Image Supervision [18.785840615548473]
We train a 3D model from pseudo-labels derived from 2D semantic image segmentations using multiview fusion.
The proposed network architecture, 2D3DNet, achieves significantly better performance than baselines during experiments on a new urban dataset with lidar and images captured in 20 cities across 5 continents.
arXiv Detail & Related papers (2021-10-21T17:56:28Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.