Learning to Generate 3D Representations of Building Roofs Using
Single-View Aerial Imagery
- URL: http://arxiv.org/abs/2303.11215v1
- Date: Mon, 20 Mar 2023 15:47:05 GMT
- Title: Learning to Generate 3D Representations of Building Roofs Using
Single-View Aerial Imagery
- Authors: Maxim Khomiakov, Alejandro Valverde Mahou, Alba Reinders Sánchez,
Jes Frellsen, Michael Riis Andersen
- Abstract summary: We present a novel pipeline for learning the conditional distribution of a building roof mesh given pixels from an aerial image.
Unlike alternative methods that require multiple images of the same object, our approach estimates 3D roof meshes from a single image at prediction time.
- Score: 68.3565370706598
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We present a novel pipeline for learning the conditional distribution of a
building roof mesh given pixels from an aerial image, under the assumption that
roof geometry follows a set of regular patterns. Unlike alternative methods
that require multiple images of the same object, our approach estimates 3D
roof meshes from a single image at prediction time. The approach employs
PolyGen, a deep generative transformer architecture for 3D meshes. We apply
this model in a new domain and investigate its sensitivity to image
resolution. We propose a novel metric to evaluate the performance of
the inferred meshes, and our results show that the model is robust even at
lower resolutions, while qualitatively producing realistic representations for
out-of-distribution samples.
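To make the pipeline concrete, below is a minimal sketch of PolyGen-style autoregressive vertex decoding conditioned on image features. The transformer is stubbed out with random logits; the token vocabulary, shapes, and function names are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

NUM_BINS = 256         # PolyGen quantises vertex coordinates into discrete bins
STOP_TOKEN = NUM_BINS  # an extra token that terminates the vertex sequence

def next_token_logits(prefix, image_feat):
    """Stand-in for a conditional transformer: p(token | prefix, image)."""
    rng = np.random.default_rng(len(prefix))
    return rng.normal(size=NUM_BINS + 1)

def sample_vertices(image_feat, max_tokens=300):
    tokens = []
    while len(tokens) < max_tokens:
        logits = next_token_logits(tokens, image_feat)
        probs = np.exp(logits - logits.max())
        probs /= probs.sum()
        tok = int(np.random.choice(NUM_BINS + 1, p=probs))
        if tok == STOP_TOKEN:
            break
        tokens.append(tok)
    # Tokens are consumed three at a time as quantised (z, y, x) coordinates.
    n = (len(tokens) // 3) * 3
    coords = np.array(tokens[:n], dtype=np.float32).reshape(-1, 3)
    return coords / (NUM_BINS - 1)  # back to the unit cube

verts = sample_vertices(image_feat=None)
print(verts.shape)
```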
Related papers
- DreamPolish: Domain Score Distillation With Progressive Geometry Generation [66.94803919328815]
We introduce DreamPolish, a text-to-3D generation model that excels in producing refined geometry and high-quality textures.
In the geometry construction phase, our approach leverages multiple neural representations to enhance the stability of the synthesis process.
In the texture generation phase, we introduce a novel score distillation objective, namely domain score distillation (DSD), to guide neural representations toward a domain with photorealistic render quality.
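The DSD objective itself is not spelled out in this summary; the sketch below shows plain score distillation sampling (SDS), the family of objectives DSD builds on, with the diffusion model stubbed out. All names and the weighting choice are assumptions.

```python
import torch

def predict_noise(noisy_img, t, text_emb):
    """Stand-in for a pretrained text-conditioned diffusion model."""
    return torch.randn_like(noisy_img)

def sds_grad(rendered, t, text_emb, alphas_cumprod):
    a_t = alphas_cumprod[t]
    eps = torch.randn_like(rendered)
    noisy = a_t.sqrt() * rendered + (1 - a_t).sqrt() * eps
    eps_hat = predict_noise(noisy, t, text_emb)
    w = 1 - a_t  # a common weighting choice
    return w * (eps_hat - eps)  # pushed back through the renderer

rendered = torch.rand(1, 3, 64, 64, requires_grad=True)
alphas = torch.linspace(0.9999, 0.01, 1000)
g = sds_grad(rendered, t=500, text_emb=None, alphas_cumprod=alphas)
rendered.backward(g)  # apply the score-distillation gradient
print(rendered.grad.shape)
```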
arXiv Detail & Related papers (2024-11-03T15:15:01Z)
- GTR: Improving Large 3D Reconstruction Models through Geometry and Texture Refinement [51.97726804507328]
We propose a novel approach for 3D mesh reconstruction from multi-view images.
Our method takes inspiration from large reconstruction models that use a transformer-based triplane generator and a Neural Radiance Field (NeRF) model trained on multi-view images.
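A minimal sketch of the triplane feature lookup such transformer-based generators produce, which a NeRF-style decoder then consumes. Plane resolution, channel count, and summing (rather than concatenating) plane features are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def sample_triplane(planes, pts):
    """planes: (3, C, H, W) features for the xy, xz, yz planes.
    pts: (N, 3) points in [-1, 1]^3. Returns (N, C) features."""
    feats = 0.0
    for i, (a, b) in enumerate([(0, 1), (0, 2), (1, 2)]):
        grid = pts[:, [a, b]].view(1, -1, 1, 2)            # (1, N, 1, 2)
        f = F.grid_sample(planes[i:i + 1], grid, align_corners=True)
        feats = feats + f.view(planes.shape[1], -1).t()    # (N, C)
    return feats

planes = torch.randn(3, 32, 64, 64)
pts = torch.rand(1000, 3) * 2 - 1
print(sample_triplane(planes, pts).shape)  # torch.Size([1000, 32])
```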
arXiv Detail & Related papers (2024-06-09T05:19:24Z)
- Shape, Pose, and Appearance from a Single Image via Bootstrapped Radiance Field Inversion [54.151979979158085]
We introduce a principled end-to-end reconstruction framework for natural images, where accurate ground-truth poses are not available.
We leverage an unconditional 3D-aware generator, to which we apply a hybrid inversion scheme where a model produces a first guess of the solution.
Our framework can de-render an image in as few as 10 steps, enabling its use in practical scenarios.
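A minimal sketch of the hybrid inversion idea described above: an encoder provides a first guess of the latent, which a few gradient steps then refine against the input image. Generator and encoder are tiny stand-ins; the 10-step budget mirrors the number quoted in the summary.

```python
import torch

G = torch.nn.Sequential(torch.nn.Linear(64, 3 * 32 * 32))  # stand-in generator
E = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(3 * 32 * 32, 64))

def invert(image, steps=10, lr=0.1):
    z = E(image).detach().requires_grad_(True)  # first guess from the encoder
    opt = torch.optim.Adam([z], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        recon = G(z).view_as(image)
        loss = torch.nn.functional.mse_loss(recon, image)
        loss.backward()
        opt.step()
    return z

img = torch.rand(1, 3, 32, 32)
print(invert(img).shape)  # torch.Size([1, 64])
```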
arXiv Detail & Related papers (2022-11-21T17:42:42Z)
- Flow-based GAN for 3D Point Cloud Generation from a Single Image [16.04710129379503]
We introduce a hybrid explicit-implicit generative modeling scheme, which inherits the flow-based explicit generative models for sampling point clouds with arbitrary resolutions.
We evaluate on the large-scale synthetic dataset ShapeNet, with the experimental results demonstrating the superior performance of the proposed method.
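A toy illustration of why a flow-based point decoder supports arbitrary resolutions: each point is an independent draw from a base Gaussian pushed through an invertible map, so any number of points can be sampled from the same shape. The affine map below is a stand-in for a learned conditional flow.

```python
import numpy as np

def inverse_flow(z, shape_code):
    # Stand-in for a learned invertible transform f^{-1}(z; shape_code):
    # here, a per-axis affine map derived from the shape code.
    scale = 0.5 + 0.1 * shape_code[:3]
    shift = 0.05 * shape_code[3:6]
    return z * scale + shift

def sample_point_cloud(shape_code, num_points):
    z = np.random.default_rng(0).normal(size=(num_points, 3))
    return inverse_flow(z, shape_code)

code = np.ones(6)
print(sample_point_cloud(code, 2048).shape)    # (2048, 3)
print(sample_point_cloud(code, 100000).shape)  # same model, denser cloud
```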
arXiv Detail & Related papers (2022-10-08T17:58:20Z)
- Vision Transformer for NeRF-Based View Synthesis from a Single Input Image [49.956005709863355]
We propose to leverage both the global and local features to form an expressive 3D representation.
To synthesize a novel view, we train a multilayer perceptron (MLP) network conditioned on the learned 3D representation to perform volume rendering.
Our method can render novel views from only a single input image and generalize across multiple object categories using a single model.
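A minimal sketch of the volume-rendering step described above: an MLP (stubbed here) returns density and colour per sample along a ray, and alpha compositing integrates them into a pixel. The conditioning on the learned 3D representation is folded into the stub.

```python
import torch

def mlp(pts, representation):
    """Stand-in for the conditioned MLP: returns (sigma, rgb) per point."""
    sigma = torch.relu(pts.norm(dim=-1, keepdim=True) - 0.5)
    rgb = torch.sigmoid(pts)
    return sigma, rgb

def render_ray(origin, direction, representation, n_samples=64, near=0.0, far=2.0):
    t = torch.linspace(near, far, n_samples)
    pts = origin + t[:, None] * direction       # (n_samples, 3)
    sigma, rgb = mlp(pts, representation)
    delta = t[1] - t[0]
    alpha = 1 - torch.exp(-sigma.squeeze(-1) * delta)
    trans = torch.cumprod(torch.cat([torch.ones(1), 1 - alpha[:-1]]), dim=0)
    weights = trans * alpha                     # contribution of each sample
    return (weights[:, None] * rgb).sum(dim=0)  # composited pixel colour

pixel = render_ray(torch.zeros(3), torch.tensor([0.0, 0.0, 1.0]), None)
print(pixel)
```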
arXiv Detail & Related papers (2022-07-12T17:52:04Z)
- Leveraging Monocular Disparity Estimation for Single-View Reconstruction [8.583436410810203]
We leverage advances in monocular depth estimation to obtain disparity maps.
We transform 2D normalized disparity maps into 3D point clouds by solving an optimization on the relevant camera parameters.
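A minimal sketch of lifting a normalised disparity map to a point cloud with a pinhole model. Focal length, baseline, and principal point are illustrative assumptions; the paper optimises such camera parameters rather than fixing them.

```python
import numpy as np

def disparity_to_points(disparity, focal=500.0, baseline=0.1, eps=1e-6):
    h, w = disparity.shape
    depth = focal * baseline / (disparity + eps)  # classic stereo relation
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    cx, cy = w / 2, h / 2
    x = (u - cx) * depth / focal                  # back-project to camera frame
    y = (v - cy) * depth / focal
    return np.stack([x, y, depth], axis=-1).reshape(-1, 3)

disp = np.random.rand(64, 64)  # normalised disparity in (0, 1]
print(disparity_to_points(disp).shape)  # (4096, 3)
```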
arXiv Detail & Related papers (2022-07-01T03:05:40Z)
- Pixel2Mesh++: 3D Mesh Generation and Refinement from Multi-View Images [82.32776379815712]
We study the problem of shape generation in 3D mesh representation from a small number of color images with or without camera poses.
We further improve the shape quality by leveraging cross-view information with a graph convolution network.
Our model is robust to the quality of the initial mesh and the error of camera pose, and can be combined with a differentiable renderer for test-time optimization.
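A minimal sketch of one graph-convolution step over mesh vertex features, the mechanism used above to fuse cross-view information. Random matrices stand in for learned weights; the layer form is a common GCN variant, not necessarily the paper's exact one.

```python
import numpy as np

def graph_conv(feats, neighbors, W_self, W_nbr):
    """One layer: combine each vertex's features with its neighbours' mean."""
    out = feats @ W_self
    for v, nbrs in enumerate(neighbors):
        if nbrs:
            out[v] += np.mean(feats[nbrs], axis=0) @ W_nbr
    return np.maximum(out, 0.0)  # ReLU

# A toy tetrahedron: 4 vertices, each adjacent to the other three.
neighbors = [[1, 2, 3], [0, 2, 3], [0, 1, 3], [0, 1, 2]]
feats = np.random.default_rng(0).random((4, 16))  # e.g. pooled multi-view features
rng = np.random.default_rng(1)
W_self, W_nbr = rng.normal(size=(16, 16)), rng.normal(size=(16, 16))
print(graph_conv(feats, neighbors, W_self, W_nbr).shape)  # (4, 16)
```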
arXiv Detail & Related papers (2022-04-21T03:42:31Z)
- Sim2Air - Synthetic aerial dataset for UAV monitoring [2.1638817206926855]
We propose to accentuate shape-based object representation by applying texture randomization.
A diverse dataset, photorealistic in all parameters, is created in the 3D modelling software Blender.
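A minimal sketch of texture randomisation via Blender's Python API: each mesh object gets a material with randomised base colour and roughness. The node and input names follow Blender's Principled BSDF; object selection and the randomisation ranges are illustrative assumptions, and the script only runs inside Blender's bundled interpreter.

```python
import bpy  # available inside Blender's Python environment
import random

def randomize_textures(seed=0):
    random.seed(seed)
    for obj in bpy.data.objects:
        if obj.type != 'MESH':
            continue
        mat = bpy.data.materials.new(name=f"rand_{obj.name}")
        mat.use_nodes = True
        bsdf = mat.node_tree.nodes["Principled BSDF"]
        bsdf.inputs["Base Color"].default_value = (
            random.random(), random.random(), random.random(), 1.0)
        bsdf.inputs["Roughness"].default_value = random.uniform(0.2, 0.9)
        obj.data.materials.clear()
        obj.data.materials.append(mat)

randomize_textures()
```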
arXiv Detail & Related papers (2021-10-11T10:36:33Z)
- Using Adaptive Gradient for Texture Learning in Single-View 3D Reconstruction [0.0]
Learning-based approaches for 3D model reconstruction have attracted attention owing to their modern applications.
We present a novel sampling algorithm that optimizes the gradient of predicted coordinates based on the variance of the sampling image.
We also adopt the Frechet Inception Distance (FID) to form a loss function for learning, which helps bridge the gap between rendered images and input images.
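A minimal sketch of the Frechet Inception Distance used above as a loss between rendered and input images: FID = ||mu_a - mu_b||^2 + Tr(C_a + C_b - 2(C_a C_b)^{1/2}). The features here are random stand-ins; in practice they come from a pretrained Inception network.

```python
import numpy as np
from scipy.linalg import sqrtm

def fid(feats_a, feats_b):
    mu_a, mu_b = feats_a.mean(0), feats_b.mean(0)
    cov_a = np.cov(feats_a, rowvar=False)
    cov_b = np.cov(feats_b, rowvar=False)
    covmean = sqrtm(cov_a @ cov_b)
    if np.iscomplexobj(covmean):  # numerical noise can yield tiny imaginary parts
        covmean = covmean.real
    diff = mu_a - mu_b
    return diff @ diff + np.trace(cov_a + cov_b - 2 * covmean)

rng = np.random.default_rng(0)
rendered = rng.normal(size=(256, 64))  # features of rendered images
real = rng.normal(size=(256, 64))      # features of input images
print(fid(rendered, real))
```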
arXiv Detail & Related papers (2021-04-29T07:52:54Z)
- An Effective Loss Function for Generating 3D Models from Single 2D Image without Rendering [0.0]
Differentiable rendering is a very successful technique for single-view 3D reconstruction.
Current methods use pixel-wise losses between a rendered image of the reconstructed 3D object and ground-truth images from given matched viewpoints to optimise the parameters of the 3D shape.
We propose a novel effective loss function that evaluates how well the projections of reconstructed 3D point clouds cover the ground truth object's silhouette.
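A minimal sketch of a render-free silhouette loss in the spirit described above: project the reconstructed point cloud with a camera matrix and score how well the projected points cover the ground-truth silhouette. The projection model and the coverage measure are illustrative assumptions, not the paper's exact loss.

```python
import numpy as np

def silhouette_coverage_loss(points, P, mask):
    """points: (N, 3); P: (3, 4) camera projection; mask: (H, W) binary."""
    h, w = mask.shape
    homog = np.concatenate([points, np.ones((len(points), 1))], axis=1)
    proj = homog @ P.T
    uv = (proj[:, :2] / proj[:, 2:3]).round().astype(int)
    inside = (uv[:, 0] >= 0) & (uv[:, 0] < w) & (uv[:, 1] >= 0) & (uv[:, 1] < h)
    uv = uv[inside]
    hits = np.zeros_like(mask, dtype=bool)
    hits[uv[:, 1], uv[:, 0]] = True
    covered = (hits & (mask > 0)).sum() / max((mask > 0).sum(), 1)
    return 1.0 - covered  # low when projections cover the silhouette well

mask = np.zeros((64, 64)); mask[16:48, 16:48] = 1
P = np.array([[32.0, 0, 0, 16],
              [0, 32.0, 0, 16],
              [0, 0, 0, 1.0]])  # toy orthographic-style camera
pts = np.random.default_rng(0).random((2048, 3))
print(silhouette_coverage_loss(pts, P, mask))
```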
arXiv Detail & Related papers (2021-03-05T00:02:18Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the information and is not responsible for any consequences arising from its use.