ImpliCity: City Modeling from Satellite Images with Deep Implicit
Occupancy Fields
- URL: http://arxiv.org/abs/2201.09968v1
- Date: Mon, 24 Jan 2022 21:40:16 GMT
- Title: ImpliCity: City Modeling from Satellite Images with Deep Implicit
Occupancy Fields
- Authors: Corinne Stucker, Bingxin Ke, Yuanwen Yue, Shengyu Huang, Iro Armeni,
Konrad Schindler
- Abstract summary: ImpliCity is a neural representation of the 3D scene as an implicit, continuous occupancy field, driven by learned embeddings of the point cloud and a stereo pair of ortho-photos.
With image resolution 0.5$\,$m, ImpliCity reaches a median height error of $\approx\,$0.7$\,$m and outperforms competing methods.
- Score: 20.00737387884824
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: High-resolution optical satellite sensors, in combination with dense stereo
algorithms, have made it possible to reconstruct 3D city models from space.
However, the resulting models are, in practice, rather noisy, and they tend to
miss small geometric features that are clearly visible in the images. We argue
that one reason for the limited DSM quality may be a too early, heuristic
reduction of the triangulated 3D point cloud to an explicit height field or
surface mesh. To make full use of the point cloud and the underlying images, we
introduce ImpliCity, a neural representation of the 3D scene as an implicit,
continuous occupancy field, driven by learned embeddings of the point cloud and
a stereo pair of ortho-photos. We show that this representation enables the
extraction of high-quality DSMs: with image resolution 0.5$\,$m, ImpliCity
reaches a median height error of $\approx\,$0.7$\,$m and outperforms competing
methods, especially w.r.t. building reconstruction, featuring intricate roof
details, smooth surfaces, and straight, regular outlines.
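As a rough illustration of how a DSM can be read out of such a continuous occupancy field, the sketch below queries a trained model along vertical lines and bisects for the 0.5 iso-level, yielding one height per ground-sample cell. The `occupancy` callable, the grid layout, and the bisection search are assumptions for illustration only, not the authors' exact extraction procedure.

```python
import numpy as np

def extract_dsm(occupancy, x_range, y_range, z_min, z_max, gsd=0.5, iters=20):
    """Read a height field (DSM) out of a continuous occupancy field.

    occupancy: callable mapping an (N, 3) array of xyz points to values in
               [0, 1] (hypothetical stand-in for a trained ImpliCity-style
               network). We assume the field is ~1 below the surface and ~0
               above it, so the surface is the 0.5 iso-level along each
               vertical line.
    """
    xs = np.arange(x_range[0], x_range[1], gsd)
    ys = np.arange(y_range[0], y_range[1], gsd)
    gx, gy = np.meshgrid(xs, ys)                      # horizontal query grid
    lo = np.full(gx.shape, float(z_min))              # assumed occupied
    hi = np.full(gx.shape, float(z_max))              # assumed free
    for _ in range(iters):                            # per-cell bisection in z
        mid = 0.5 * (lo + hi)
        pts = np.stack([gx.ravel(), gy.ravel(), mid.ravel()], axis=1)
        occ = occupancy(pts).reshape(gx.shape)
        inside = occ >= 0.5
        lo = np.where(inside, mid, lo)                # surface is higher up
        hi = np.where(inside, hi, mid)                # surface is lower down
    return 0.5 * (lo + hi)                            # (H, W) height field

# Toy check: a dummy "scene" whose surface sits at 10 m everywhere.
dummy = lambda p: (p[:, 2] <= 10.0).astype(float)
dsm = extract_dsm(dummy, (0, 50), (0, 50), z_min=0.0, z_max=100.0)
print(dsm.shape, float(dsm.mean()))                   # (100, 100) ~10.0
```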
Related papers
- DistillNeRF: Perceiving 3D Scenes from Single-Glance Images by Distilling Neural Fields and Foundation Model Features [65.8738034806085]
DistillNeRF is a self-supervised learning framework for understanding 3D environments in autonomous driving scenes.
Our method is a generalizable feedforward model that predicts a rich neural scene representation from sparse, single-frame multi-view camera inputs.
arXiv Detail & Related papers (2024-06-17T21:15:13Z)
- LAM3D: Large Image-Point-Cloud Alignment Model for 3D Reconstruction from Single Image [64.94932577552458]
Large Reconstruction Models have made significant strides in the realm of automated 3D content generation from single or multiple input images.
Despite their success, these models often produce 3D meshes with geometric inaccuracies, stemming from the inherent challenges of deducing 3D shapes solely from image data.
We introduce a novel framework, the Large Image and Point Cloud Alignment Model (LAM3D), which utilizes 3D point cloud data to enhance the fidelity of generated 3D meshes.
arXiv Detail & Related papers (2024-05-24T15:09:12Z)
- Enhanced 3D Urban Scene Reconstruction and Point Cloud Densification using Gaussian Splatting and Google Earth Imagery [19.67372661944804]
We construct a 3D Gaussian Splatting model of the Waterloo region centered on the University of Waterloo.
We achieve view-synthesis results that far exceed previous results based on neural radiance fields.
arXiv Detail & Related papers (2024-05-17T18:00:07Z)
- Denoising Diffusion via Image-Based Rendering [54.20828696348574]
We introduce the first diffusion model able to perform fast, detailed reconstruction and generation of real-world 3D scenes.
First, we introduce a new neural scene representation, IB-planes, that can efficiently and accurately represent large 3D scenes.
Second, we propose a denoising-diffusion framework to learn a prior over this novel 3D scene representation, using only 2D images.
arXiv Detail & Related papers (2024-02-05T19:00:45Z)
- Sat2Scene: 3D Urban Scene Generation from Satellite Images with Diffusion [77.34078223594686]
We propose a novel architecture for direct 3D scene generation by introducing diffusion models into 3D sparse representations and combining them with neural rendering techniques.
Specifically, our approach first generates texture colors at the point level for a given geometry using a 3D diffusion model, and then transforms them into a scene representation in a feed-forward manner.
Experiments on two city-scale datasets show that our model generates photo-realistic street-view image sequences and cross-view urban scenes from satellite imagery.
arXiv Detail & Related papers (2024-01-19T16:15:37Z)
- sat2pc: Estimating Point Cloud of Building Roofs from 2D Satellite Images [1.8884278918443564]
We propose sat2pc, a deep learning architecture that predicts the point cloud of a building roof from a single 2D satellite image.
Our results show that sat2pc was able to outperform existing baselines by at least 18.6%.
arXiv Detail & Related papers (2022-05-25T03:24:40Z)
- Neural 3D Scene Reconstruction with the Manhattan-world Assumption [58.90559966227361]
This paper addresses the challenge of reconstructing 3D indoor scenes from multi-view images.
Planar constraints can be conveniently integrated into recent reconstruction methods based on implicit neural representations.
The proposed method outperforms previous methods by a large margin on 3D reconstruction quality.
arXiv Detail & Related papers (2022-05-05T17:59:55Z)
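Such planar constraints are commonly expressed as a normal-alignment regularizer: surface normals predicted by the implicit model are pushed toward the vertical axis in floor regions and perpendicular to it in wall regions. A minimal sketch under that generic formulation (names, masks, and weighting are assumptions, not necessarily the paper's exact loss terms):

```python
import torch

def manhattan_normal_loss(normals, floor_mask, wall_mask):
    """Generic Manhattan-world regularizer: floor normals should align with
    the up axis, wall normals should be perpendicular to it.

    normals:    (N, 3) unit surface normals from the implicit model
    floor_mask: (N,) bool, samples labeled floor/ground by 2D segmentation
    wall_mask:  (N,) bool, samples labeled wall/facade
    """
    up = torch.tensor([0.0, 0.0, 1.0], device=normals.device)
    cos_up = normals @ up                              # cosine to the up axis
    zero = normals.new_zeros(())
    loss_floor = (1.0 - cos_up[floor_mask]).mean() if floor_mask.any() else zero
    loss_wall = cos_up[wall_mask].abs().mean() if wall_mask.any() else zero
    return loss_floor + loss_wall

# Usage with random normals and random semantic masks.
n = torch.nn.functional.normalize(torch.randn(1024, 3), dim=1)
floor = torch.rand(1024) > 0.7
wall = (~floor) & (torch.rand(1024) > 0.5)
print(float(manhattan_normal_loss(n, floor, wall)))
```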
- From Multi-View to Hollow-3D: Hallucinated Hollow-3D R-CNN for 3D Object Detection [101.20784125067559]
We propose a new architecture, namely Hallucinated Hollow-3D R-CNN, to address the problem of 3D object detection.
In our approach, we first extract multi-view features by sequentially projecting the point clouds into the perspective view and the bird's-eye view.
The 3D objects are detected via a box refinement module with a novel Hierarchical Voxel RoI Pooling operation.
arXiv Detail & Related papers (2021-07-30T02:00:06Z)
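The bird's-eye-view projection mentioned above boils down to scattering the point cloud into a 2D grid of cells; a minimal sketch with illustrative ranges, cell size, and per-cell features (point count and maximum height), not the paper's exact configuration:

```python
import numpy as np

def points_to_bev(points, x_range=(0.0, 70.0), y_range=(-40.0, 40.0), cell=0.1):
    """Project an (N, 3) point cloud into a (2, H, W) bird's-eye-view map
    holding, per cell, the point count and the maximum point height."""
    w = int((x_range[1] - x_range[0]) / cell)
    h = int((y_range[1] - y_range[0]) / cell)
    count = np.zeros((h, w), dtype=np.float32)
    max_z = np.full((h, w), -np.inf, dtype=np.float32)

    ix = ((points[:, 0] - x_range[0]) / cell).astype(int)
    iy = ((points[:, 1] - y_range[0]) / cell).astype(int)
    keep = (ix >= 0) & (ix < w) & (iy >= 0) & (iy < h)
    for x, y, z in zip(ix[keep], iy[keep], points[keep, 2]):
        count[y, x] += 1.0
        max_z[y, x] = max(max_z[y, x], z)

    max_z[count == 0] = 0.0                       # empty cells get height 0
    return np.stack([count, max_z], axis=0)

# Usage with a random LiDAR-like point cloud.
pts = np.random.rand(10000, 3) * [70.0, 80.0, 3.0] + [0.0, -40.0, 0.0]
bev = points_to_bev(pts)
print(bev.shape)                                  # (2, 800, 700)
```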
- ResDepth: A Deep Prior For 3D Reconstruction From High-resolution Satellite Images [28.975837416508142]
We introduce ResDepth, a convolutional neural network that learns such an expressive geometric prior from example data.
In a series of experiments, we find that the proposed method consistently improves stereo DSMs both quantitatively and qualitatively.
We show that the prior encoded in the network weights captures meaningful geometric characteristics of urban design.
arXiv Detail & Related papers (2021-06-15T12:51:28Z)
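The refinement idea behind ResDepth can be sketched as a residual prediction: a convolutional network takes the noisy stereo DSM together with the ortho-rectified image pair and outputs a per-pixel height correction. The shallow network below only illustrates that interface and is an assumption (the paper uses a U-Net); layer sizes and tile shapes are placeholders.

```python
import torch
import torch.nn as nn

class DSMRefiner(nn.Module):
    """Toy stand-in for a learned DSM prior: predicts a per-pixel height
    correction from the initial DSM and two ortho-rectified images."""
    def __init__(self, channels=32):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, channels, 3, padding=1), nn.ReLU(),
            nn.Conv2d(channels, channels, 3, padding=1), nn.ReLU(),
            nn.Conv2d(channels, 1, 3, padding=1),
        )

    def forward(self, dsm, img_a, img_b):
        x = torch.cat([dsm, img_a, img_b], dim=1)   # (B, 3, H, W)
        return dsm + self.net(x)                    # initial DSM + correction

# Usage: refine a 256 x 256 DSM tile with its two ortho-photos.
dsm = torch.randn(1, 1, 256, 256)
img_a, img_b = torch.rand(1, 1, 256, 256), torch.rand(1, 1, 256, 256)
refined = DSMRefiner()(dsm, img_a, img_b)
print(refined.shape)                                # torch.Size([1, 1, 256, 256])
```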
This list is automatically generated from the titles and abstracts of the papers on this site.