Deep Learning-based Scalable Image-to-3D Facade Parser for Generating Thermal 3D Building Models
- URL: http://arxiv.org/abs/2508.04406v1
- Date: Wed, 06 Aug 2025 12:48:53 GMT
- Title: Deep Learning-based Scalable Image-to-3D Facade Parser for Generating Thermal 3D Building Models
- Authors: Yinan Yu, Alex Gonzalez-Caceres, Samuel Scheidegger, Sanjay Somanath, Alexander Hollberg,
- Abstract summary: Early-phase renovation planning requires simulations based on thermal 3D models at Level of Detail (LoD) 3. This paper presents a pipeline that generates LoD3 thermal models by extracting geometries from images using both computer vision and deep learning. Tested on typical Swedish residential buildings, SI3FP achieved approximately 5% error in window-to-wall ratio estimates.
- Score: 40.44787916333075
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Renovating existing buildings is essential for reducing climate impact. Early-phase renovation planning requires simulations based on thermal 3D models at Level of Detail (LoD) 3, which include features like windows. However, scalable and accurate identification of such features remains a challenge. This paper presents the Scalable Image-to-3D Facade Parser (SI3FP), a pipeline that generates LoD3 thermal models by extracting geometries from images using both computer vision and deep learning. Unlike existing methods relying on segmentation and projection, SI3FP directly models geometric primitives in the orthographic image plane, providing a unified interface while reducing perspective distortions. SI3FP supports both sparse (e.g., Google Street View) and dense (e.g., hand-held camera) data sources. Tested on typical Swedish residential buildings, SI3FP achieved approximately 5% error in window-to-wall ratio estimates, demonstrating sufficient accuracy for early-stage renovation analysis. The pipeline facilitates large-scale energy renovation planning and has broader applications in urban development and planning.
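The headline metric above, the window-to-wall ratio (WWR), is simple to compute once windows have been modelled as rectangles in the orthographic facade plane, which is how the abstract describes SI3FP's geometric primitives. The sketch below is a minimal illustration under that assumption; the `Rect` helper and the example window grid are hypothetical, not SI3FP's actual data structures or code.

```python
from dataclasses import dataclass

@dataclass
class Rect:
    """Axis-aligned rectangle in the orthographic facade plane (metres).
    Hypothetical primitive; SI3FP's internal representation may differ."""
    x: float       # left edge
    y: float       # bottom edge
    width: float
    height: float

    @property
    def area(self) -> float:
        return self.width * self.height

def window_to_wall_ratio(facade: Rect, windows: list[Rect]) -> float:
    """WWR = total glazed area / gross facade (wall) area.

    Assumes the window rectangles are already rectified into the same
    orthographic plane as the facade outline and do not overlap each other.
    """
    window_area = sum(w.area for w in windows)
    return window_area / facade.area

if __name__ == "__main__":
    facade = Rect(0.0, 0.0, 12.0, 9.0)               # 12 m wide, 9 m tall wall
    windows = [Rect(1.0 + 3.0 * i, 1.5 + 3.0 * j, 1.2, 1.4)
               for i in range(4) for j in range(3)]  # 4 x 3 grid of windows
    print(f"WWR = {window_to_wall_ratio(facade, windows):.2%}")
```

The sketch uses the gross wall area (openings included) as the denominator; the abstract does not state which WWR convention the reported ~5% error refers to.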
Related papers
- E3D-Bench: A Benchmark for End-to-End 3D Geometric Foundation Models [78.1674905950243]
We present the first comprehensive benchmark for 3D geometric foundation models (GFMs). GFMs directly predict dense 3D representations in a single feed-forward pass, eliminating the need for slow or unavailable precomputed camera parameters. We evaluate 16 state-of-the-art GFMs, revealing their strengths and limitations across tasks and domains. All code, evaluation scripts, and processed data will be publicly released to accelerate research in 3D spatial intelligence.
arXiv Detail & Related papers (2025-06-02T17:53:09Z)
- Texture2LoD3: Enabling LoD3 Building Reconstruction With Panoramic Images [0.0]
In Texture2LoD3, we introduce a novel method leveraging the ubiquity of 3D building model priors and panoramic street-level images. We experimentally demonstrate that our method improves facade segmentation accuracy by 11%. We believe that Texture2LoD3 can scale the adoption of LoD3 models, opening applications in estimating building solar potential or enhancing autonomous driving simulations.
arXiv Detail & Related papers (2025-04-07T16:40:16Z)
- Unposed Sparse Views Room Layout Reconstruction in the Age of Pretrain Model [15.892685514932323]
We introduce Plane-DUSt3R, a novel method for multi-view room layout estimation. Plane-DUSt3R incorporates the DUSt3R framework and fine-tunes on a room layout dataset (Structure3D) with a modified objective to estimate structural planes. By generating uniform and parsimonious results, Plane-DUSt3R enables room layout estimation with only a single post-processing step and 2D detection results (see the generic plane-fitting sketch after this entry).
arXiv Detail & Related papers (2025-02-24T02:14:19Z)
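Plane-DUSt3R's modified training objective and post-processing are not given in the summary above; the sketch below only illustrates the generic geometric step of summarising a set of predicted 3D points as a structural plane, via an ordinary total-least-squares fit. The function name and the synthetic wall points are assumptions for illustration.

```python
import numpy as np

def fit_plane(points: np.ndarray):
    """Fit a plane n . x = d to an (N, 3) array of 3D points.

    Returns (normal, d) with `normal` a unit vector. This is a generic
    total-least-squares fit via SVD, used here only to show how predicted
    wall or ceiling points can be summarised as a plane.
    """
    centroid = points.mean(axis=0)
    # Right singular vector with the smallest singular value = direction of least variance.
    _, _, vt = np.linalg.svd(points - centroid, full_matrices=False)
    normal = vt[-1]
    d = float(normal @ centroid)
    return normal, d

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Synthetic points on the wall x = 2 with a little noise.
    pts = np.column_stack([
        np.full(500, 2.0) + 0.01 * rng.standard_normal(500),  # x
        rng.uniform(0, 5, 500),                                # y
        rng.uniform(0, 3, 500),                                # z
    ])
    n, d = fit_plane(pts)
    print("normal ~", np.round(n, 3), " offset ~", round(d, 3))
```

One would typically fit one plane per predicted layout component and intersect neighbouring planes to recover corner geometry; that is beyond this sketch.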
- GeoLRM: Geometry-Aware Large Reconstruction Model for High-Quality 3D Gaussian Generation [65.33726478659304]
We introduce the Geometry-Aware Large Reconstruction Model (GeoLRM), an approach that can predict high-quality assets with 512k Gaussians from 21 input images in only 11 GB of GPU memory.
Previous works neglect the inherent sparsity of 3D structure and do not utilize explicit geometric relationships between 3D and 2D images.
GeoLRM tackles these issues by incorporating a novel 3D-aware transformer structure that directly processes 3D points and uses deformable cross-attention mechanisms (see the simplified 3D-to-2D sampling sketch after this entry).
arXiv Detail & Related papers (2024-06-21T17:49:31Z)
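GeoLRM's deformable cross-attention is not reproduced here. The sketch below only shows the explicit 3D-to-2D geometric link the summary refers to: projecting 3D query points into a posed source image and bilinearly sampling 2D features at the projected locations. Tensor shapes, names, and the pinhole convention are assumptions for illustration.

```python
import torch
import torch.nn.functional as F

def sample_image_features(points_w, feats, K, T_cw):
    """Project 3D world points into one camera and sample its feature map.

    points_w: (N, 3) world-space points
    feats:    (C, H, W) 2D feature map of that camera
    K:        (3, 3) pinhole intrinsics (in feature-map pixel units)
    T_cw:     (4, 4) world-to-camera extrinsics
    Returns (N, C) per-point features (zeros for points projecting off-image).
    """
    C, H, W = feats.shape
    ones = torch.ones(points_w.shape[0], 1, dtype=points_w.dtype, device=points_w.device)
    p_cam = (T_cw @ torch.cat([points_w, ones], dim=1).T).T[:, :3]   # (N, 3) camera-space
    z = p_cam[:, 2:3].clamp(min=1e-6)   # avoid division by ~0; points behind the camera
                                        # should be masked out in real use
    uv = (K @ (p_cam / z).T).T[:, :2]                                # (N, 2) pixel coords

    # Normalise to [-1, 1] for grid_sample (x = width axis, y = height axis).
    grid = torch.stack([uv[:, 0] / (W - 1) * 2 - 1,
                        uv[:, 1] / (H - 1) * 2 - 1], dim=-1)          # (N, 2)
    sampled = F.grid_sample(feats[None], grid[None, :, None, :],
                            mode="bilinear", align_corners=True,
                            padding_mode="zeros")                     # (1, C, N, 1)
    return sampled[0, :, :, 0].T                                      # (N, C)
```

In a deformable-attention layer these lookups would additionally use learned sampling offsets and attention weights; the fixed bilinear lookup here is only the geometric core.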
- GEOcc: Geometrically Enhanced 3D Occupancy Network with Implicit-Explicit Depth Fusion and Contextual Self-Supervision [49.839374549646884]
This paper presents GEOcc, a Geometric-Enhanced Occupancy network tailored for vision-only surround-view perception. Our approach achieves state-of-the-art performance on the Occ3D-nuScenes dataset with the lowest required input image resolution and the lightest image backbone.
arXiv Detail & Related papers (2024-05-17T07:31:20Z)
- StructuredMesh: 3D Structured Optimization of Façade Components on Photogrammetric Mesh Models using Binary Integer Programming [17.985961236568663]
We present StructuredMesh, a novel approach for reconstructing façade structures conforming to the regularity of buildings within photogrammetric mesh models.
Our method involves capturing multi-view color and depth images of the building model using a virtual camera.
We then utilize the depth image to remap these boxes into 3D space, generating an initial façade layout (see the back-projection sketch after this entry).
arXiv Detail & Related papers (2023-06-07T06:40:54Z)
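StructuredMesh's own remapping and the subsequent binary-integer-programming regularisation are not shown here. The sketch below (referenced from the entry above) only illustrates the generic lifting step: back-projecting a detected 2D box into camera space using the rendered depth image and pinhole intrinsics. The box format and the median-depth heuristic are assumptions for illustration.

```python
import numpy as np

def box_to_3d_corners(box, depth, K):
    """Lift a 2D box (u_min, v_min, u_max, v_max) into camera space.

    depth: (H, W) depth image from the virtual camera (metres)
    K:     (3, 3) pinhole intrinsics
    Uses the median depth inside the box, i.e. it assumes the box lies on a
    roughly planar, fronto-parallel facade patch.
    """
    u0, v0, u1, v1 = [int(round(c)) for c in box]
    z = float(np.median(depth[v0:v1, u0:u1]))
    fx, fy = K[0, 0], K[1, 1]
    cx, cy = K[0, 2], K[1, 2]
    corners = []
    for u, v in [(u0, v0), (u1, v0), (u1, v1), (u0, v1)]:
        x = (u - cx) / fx * z
        y = (v - cy) / fy * z
        corners.append((x, y, z))
    return np.array(corners)   # (4, 3) box corners in camera space
```

StructuredMesh then optimises the resulting layout with binary integer programming to enforce facade regularity, which this sketch does not attempt.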
- Learning to Generate 3D Representations of Building Roofs Using Single-View Aerial Imagery [68.3565370706598]
We present a novel pipeline for learning the conditional distribution of a building roof mesh given pixels from an aerial image.
Unlike alternative methods that require multiple images of the same object, our approach enables estimating 3D roof meshes using only a single image for predictions.
arXiv Detail & Related papers (2023-03-20T15:47:05Z)
- Elevation Estimation-Driven Building 3D Reconstruction from Single-View Remote Sensing Imagery [20.001807614214922]
Building 3D reconstruction from remote sensing images has a wide range of applications in smart cities, photogrammetry and other fields.
We propose an efficient DSM estimation-driven reconstruction framework (Building3D) to reconstruct 3D building models from the input single-view remote sensing image.
Our Building3D is rooted in the SFFDE network for building elevation prediction, synchronized with a building extraction network for building masks, and then sequentially performs point cloud reconstruction and surface reconstruction (or CityGML model reconstruction). A minimal DSM-to-point-cloud sketch follows this entry.
arXiv Detail & Related papers (2023-01-11T17:20:30Z)
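Building3D's SFFDE elevation network and its surface/CityGML reconstruction stages are not shown here. The sketch below (referenced from the entry above) is only an assumed illustration of the intermediate "point cloud reconstruction" step: combining a predicted elevation raster (DSM) with a building mask, given a known ground sampling distance.

```python
import numpy as np

def dsm_to_point_cloud(dsm, mask, gsd=0.5):
    """Turn a per-pixel elevation raster into a building point cloud.

    dsm:  (H, W) predicted elevations in metres
    mask: (H, W) boolean building-footprint mask
    gsd:  ground sampling distance in metres per pixel (assumed known)
    Returns an (N, 3) array of (x, y, z) points for the masked pixels.
    """
    rows, cols = np.nonzero(mask)
    x = cols * gsd
    y = rows * gsd
    z = dsm[rows, cols]
    return np.column_stack([x, y, z])
```

Surface or CityGML reconstruction from these points is a separate step and is not sketched here.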
- High-fidelity 3D GAN Inversion by Pseudo-multi-view Optimization [51.878078860524795]
We present a high-fidelity 3D generative adversarial network (GAN) inversion framework that can synthesize photo-realistic novel views.
Our approach enables high-fidelity 3D rendering from a single image, which is promising for various applications of AI-generated 3D content.
arXiv Detail & Related papers (2022-11-28T18:59:52Z)
- Translational Symmetry-Aware Facade Parsing for 3D Building Reconstruction [11.263458202880038]
In this paper, we present a novel translational symmetry-based approach to improving deep neural network-based facade parsing.
We propose a novel scheme that fuses anchor-free detection into a single-stage network, enabling efficient training and better convergence.
We employ an off-the-shelf rendering engine such as Blender to reconstruct realistic, high-quality 3D models using procedural modeling (see the toy Blender sketch after this entry).
arXiv Detail & Related papers (2021-06-02T03:10:51Z)
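The summary above mentions reconstructing the final 3D models procedurally in an off-the-shelf engine such as Blender. The toy sketch below (referenced from that entry) shows what such procedural instancing can look like using Blender's Python API; it must be run inside Blender, and all dimensions, names, and the hard-coded window grid are made up for illustration rather than taken from the paper.

```python
import bpy  # available only inside Blender's bundled Python interpreter

def add_box(name, location, dims):
    """Add an axis-aligned box with the given (x, y, z) dimensions at `location`."""
    bpy.ops.mesh.primitive_cube_add(size=1.0, location=location)  # unit cube
    obj = bpy.context.active_object
    obj.name = name
    obj.scale = dims  # scale the unit cube to the target dimensions
    return obj

# A 12 m x 0.3 m x 9 m wall with a translationally repeated 4 x 3 window grid.
add_box("wall", location=(6.0, 0.0, 4.5), dims=(12.0, 0.3, 9.0))
for i in range(4):
    for j in range(3):
        add_box(f"window_{i}_{j}",
                location=(1.6 + 3.0 * i, 0.0, 2.2 + 2.5 * j),
                dims=(1.2, 0.4, 1.4))
```

In the paper's pipeline the window grid would come from the parsed facade rather than a hard-coded loop.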
This list is automatically generated from the titles and abstracts of the papers on this site.