3D detection of roof sections from a single satellite image and
application to LOD2-building reconstruction
- URL: http://arxiv.org/abs/2307.05409v1
- Date: Tue, 11 Jul 2023 16:23:19 GMT
- Title: 3D detection of roof sections from a single satellite image and
application to LOD2-building reconstruction
- Authors: Johann Lussange, Mulin Yu, Yuliya Tarabalka, Florent Lafarge
- Abstract summary: We propose a method for urban 3D reconstruction named KIBS (\textit{Keypoints Inference By Segmentation}), which comprises two novel features.
We demonstrate the potential of the KIBS method by reconstructing different urban areas in a few minutes, with a Jaccard index for the 2D segmentation of individual roof sections of $88.55\%$ and $75.21\%$ on our two data sets resp., and a mean height error over such correctly segmented pixels for the 3D reconstruction of $1.60$ m and $2.06$ m on our two data sets resp., hence within the LOD2 precision range.
- Score: 12.693545159861857
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Reconstructing urban areas in 3D from satellite raster images has been a
long-standing and challenging goal of both academic and industrial research.
The rare methods that achieve this today at Level of Detail (LOD) 2 rely on
procedural approaches based on geometry, and need stereo images and/or LIDAR
data as input. We here propose a method for urban 3D reconstruction named
KIBS (\textit{Keypoints Inference By Segmentation}), which comprises two novel
features: i) a fully deep learning approach for the 3D detection of roof
sections, and ii) only one single (non-orthogonal) satellite raster image as
model input. This is achieved in two steps: i) a Mask R-CNN model performs a
2D segmentation of the buildings' roof sections; then, after these segmented
pixels are blended into the RGB satellite raster image, ii) another identical
Mask R-CNN model infers the heights-to-ground of the roof sections' corners
via panoptic segmentation, yielding a full 3D reconstruction of the buildings
and the city. We demonstrate the potential of the KIBS method by
reconstructing different urban areas in a few minutes, with a Jaccard index
for the 2D segmentation of individual roof sections of $88.55\%$ and $75.21\%$
on our two data sets respectively, and a mean height error over the correctly
segmented pixels for the 3D reconstruction of $1.60$ m and $2.06$ m on our two
data sets respectively, hence within the LOD2 precision range.
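As a rough illustration of the two-step pipeline described above, the sketch below uses torchvision's off-the-shelf Mask R-CNN as a stand-in for both stages. The alpha blending, the corner-height stage, and the `jaccard` helper are simplifying assumptions for illustration, not the authors' implementation.

```python
# Hypothetical sketch of a KIBS-style two-step pipeline (not the authors' code).
import torch
import torchvision

# Step 1: 2D instance segmentation of roof sections in the satellite image.
seg_model = torchvision.models.detection.maskrcnn_resnet50_fpn(num_classes=2)
seg_model.eval()

image = torch.rand(3, 512, 512)  # one single (non-orthogonal) satellite tile
with torch.no_grad():
    out = seg_model([image])[0]  # dict with boxes, labels, scores, masks

# Blend the segmented roof-section pixels back into the RGB raster, as the
# abstract describes, so that the second stage sees the segmentation context.
keep = out["scores"] > 0.5
masks = (out["masks"][keep, 0] > 0.5).float()        # (N, H, W) binary masks
overlay = masks.sum(0).clamp(max=1.0)                # union of all sections
blended = 0.7 * image + 0.3 * overlay.unsqueeze(0)   # crude alpha blend

# Step 2: an identically structured model trained to infer the roof corners'
# heights-to-ground from the blended image. How heights are decoded from the
# panoptic output is an assumption here; this call is only a placeholder.
height_model = torchvision.models.detection.maskrcnn_resnet50_fpn(num_classes=2)
height_model.eval()
with torch.no_grad():
    corner_out = height_model([blended])[0]

# Jaccard index (IoU) of predicted vs. ground-truth masks, the 2D metric
# reported in the abstract (88.55% / 75.21%).
def jaccard(pred: torch.Tensor, gt: torch.Tensor) -> float:
    inter = (pred.bool() & gt.bool()).sum().item()
    union = (pred.bool() | gt.bool()).sum().item()
    return inter / union if union else 1.0
```

In practice both models would be trained on annotated roof-section data; the sketch only shows how the two inference passes chain together.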
Related papers
- SSR-2D: Semantic 3D Scene Reconstruction from 2D Images [54.46126685716471]
In this work, we explore a central 3D scene modeling task, namely, semantic scene reconstruction without using any 3D annotations.
The key idea of our approach is to design a trainable model that employs both incomplete 3D reconstructions and their corresponding source RGB-D images.
Our method achieves the state-of-the-art performance of semantic scene completion on two large-scale benchmark datasets MatterPort3D and ScanNet.
arXiv Detail & Related papers (2023-02-07T17:47:52Z) - SketchSampler: Sketch-based 3D Reconstruction via View-dependent Depth
Sampling [75.957103837167]
Reconstructing a 3D shape based on a single sketch image is challenging due to the large domain gap between a sparse, irregular sketch and a regular, dense 3D shape.
Existing works try to employ a global feature extracted from the sketch to directly predict the 3D coordinates, but they usually lose fine details and are thus not faithful to the input sketch.
arXiv Detail & Related papers (2022-08-14T16:37:51Z) - sat2pc: Estimating Point Cloud of Building Roofs from 2D Satellite
Images [1.8884278918443564]
We propose sat2pc, a deep learning architecture that predicts the point cloud of a building roof from a single 2D satellite image.
Our results show that sat2pc was able to outperform existing baselines by at least 18.6%.
arXiv Detail & Related papers (2022-05-25T03:24:40Z) - ImpliCity: City Modeling from Satellite Images with Deep Implicit
Occupancy Fields [20.00737387884824]
ImpliCity is a neural representation of the 3D scene as an implicit, continuous occupancy field, driven by learned embeddings of the point cloud and a stereo pair of ortho-photos.
With an image resolution of 0.5 m, ImpliCity reaches a median height error of approximately 0.7 m and outperforms competing methods.
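The notion of an implicit, continuous occupancy field can be illustrated with a small coordinate MLP; this is a generic sketch under assumed dimensions, not ImpliCity's actual architecture, which also conditions on point-cloud and ortho-photo embeddings.

```python
# Generic occupancy-field sketch (not ImpliCity's architecture): an MLP maps
# a continuous 3D query point, conditioned on a learned scene embedding, to
# the probability that the point is occupied by the built surface.
import torch
import torch.nn as nn

class OccupancyField(nn.Module):
    def __init__(self, embed_dim: int = 64, hidden: int = 128):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(3 + embed_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, xyz: torch.Tensor, z: torch.Tensor) -> torch.Tensor:
        # xyz: (N, 3) query coordinates; z: (N, embed_dim) scene embedding
        return torch.sigmoid(self.mlp(torch.cat([xyz, z], dim=-1)))

field = OccupancyField()
pts = torch.rand(1024, 3)        # arbitrary continuous query locations
emb = torch.zeros(1024, 64)      # placeholder for the learned embedding
occ = field(pts, emb)            # (1024, 1) occupancy probabilities
# The city surface is then extracted as the 0.5 level set of this field.
```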
arXiv Detail & Related papers (2022-01-24T21:40:16Z) - 3D Instance Segmentation of MVS Buildings [5.2517244720510305]
We present a novel framework for instance segmentation of 3D buildings from Multi-view Stereo (MVS) urban scenes.
The emphasis of this work lies in detecting and segmenting 3D building instances even if they are attached and embedded in a large and imprecise 3D surface model.
arXiv Detail & Related papers (2021-12-18T11:12:38Z) - Cylindrical and Asymmetrical 3D Convolution Networks for LiDAR-based
Perception [122.53774221136193]
State-of-the-art methods for driving-scene LiDAR-based perception often project the point clouds to 2D space and then process them via 2D convolution.
A natural remedy is to utilize the 3D voxelization and 3D convolution network.
We propose a new framework for outdoor LiDAR segmentation, where cylindrical partition and asymmetrical 3D convolution networks are designed to explore the 3D geometric pattern.
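A minimal sketch of the cylindrical-partition idea follows; the grid shape and coordinate ranges are illustrative assumptions, and the asymmetrical 3D convolution network itself is omitted.

```python
# Cylindrical partition sketch: bin LiDAR points in (radius, angle, height)
# rather than a Cartesian grid, so voxel density follows the sweep pattern.
import numpy as np

def cylindrical_voxel_indices(points, grid=(480, 360, 32),
                              r_max=50.0, z_min=-4.0, z_max=2.0):
    """points: (N, 3) x/y/z array -> (N, 3) integer voxel indices."""
    x, y, z = points[:, 0], points[:, 1], points[:, 2]
    rho = np.sqrt(x**2 + y**2)          # radial distance from the sensor
    phi = np.arctan2(y, x)              # azimuth angle in [-pi, pi]
    r_idx = np.clip(rho / r_max * grid[0], 0, grid[0] - 1).astype(int)
    a_idx = np.clip((phi + np.pi) / (2 * np.pi) * grid[1],
                    0, grid[1] - 1).astype(int)
    z_idx = np.clip((z - z_min) / (z_max - z_min) * grid[2],
                    0, grid[2] - 1).astype(int)
    return np.stack([r_idx, a_idx, z_idx], axis=1)

pts = np.random.randn(1000, 3) * np.array([20.0, 20.0, 1.0])
vox = cylindrical_voxel_indices(pts)  # point features would then be scattered
                                      # into this grid and fed to 3D convs
```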
arXiv Detail & Related papers (2021-09-12T06:25:11Z) - Soft Expectation and Deep Maximization for Image Feature Detection [68.8204255655161]
We propose SEDM, an iterative semi-supervised learning process that flips the question and first looks for repeatable 3D points, then trains a detector to localize them in image space.
Our results show that this new model trained using SEDM is able to better localize the underlying 3D points in a scene.
arXiv Detail & Related papers (2021-04-21T00:35:32Z) - Machine-learned 3D Building Vectorization from Satellite Imagery [7.887221474814986]
We propose a machine learning based approach for automatic 3D building reconstruction and vectorization.
Taking a single-channel photogrammetric digital surface model (DSM) and panchromatic (PAN) image as input, we first filter out non-building objects and refine the building shapes.
The refined DSM and the input PAN image are then used through a semantic segmentation network to detect edges and corners of building roofs.
arXiv Detail & Related papers (2021-04-13T19:57:30Z) - Learning Joint 2D-3D Representations for Depth Completion [90.62843376586216]
We design a simple yet effective neural network block that learns to extract joint 2D and 3D features.
Specifically, the block consists of two domain-specific sub-networks that apply 2D convolution on image pixels and continuous convolution on 3D points.
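A simplified stand-in for such a block is sketched below: the 2D branch is a plain convolution, and the continuous convolution on 3D points is approximated by an MLP over k-nearest-neighbor offsets; the shapes, the fusion, and the kNN approximation are all assumptions, not the paper's design.

```python
# Simplified joint 2D-3D block (a stand-in, not the paper's implementation).
import torch
import torch.nn as nn

class Joint2D3DBlock(nn.Module):
    def __init__(self, c2d: int = 32, c3d: int = 32, k: int = 8):
        super().__init__()
        self.k = k
        self.conv2d = nn.Conv2d(c2d, c2d, 3, padding=1)   # image-domain branch
        self.point_mlp = nn.Sequential(                   # point-domain branch
            nn.Linear(c3d + 3, c3d), nn.ReLU(), nn.Linear(c3d, c3d))
        self.fuse = nn.Linear(c2d + c3d, c2d)

    def forward(self, img_feat, xyz, pt_feat, uv):
        # img_feat: (1, c2d, H, W); xyz: (N, 3) points; pt_feat: (N, c3d)
        # uv: (N, 2) integer pixel coordinates of the projected 3D points
        f2d = self.conv2d(img_feat)
        nn_idx = torch.cdist(xyz, xyz).topk(self.k, largest=False).indices
        offsets = xyz[nn_idx] - xyz[:, None, :]           # (N, k, 3) geometry
        neigh = torch.cat([pt_feat[nn_idx], offsets], dim=-1)
        f3d = self.point_mlp(neigh).mean(dim=1)           # aggregate neighbors
        f2d_at_pts = f2d[0, :, uv[:, 1], uv[:, 0]].t()    # sample 2D features
        return self.fuse(torch.cat([f2d_at_pts, f3d], dim=-1))
```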
arXiv Detail & Related papers (2020-12-22T22:58:29Z) - 2D-3D Geometric Fusion Network using Multi-Neighbourhood Graph
Convolution for RGB-D Indoor Scene Classification [0.8629912408966145]
This paper presents a 2D-3D Fusion stage that combines 3D Geometric Features with 2D Texture Features.
Experimental results, using the NYU-Depth-V2 and SUN RGB-D datasets, show that the proposed method outperforms the current state-of-the-art on the RGB-D indoor scene classification task.
arXiv Detail & Related papers (2020-09-23T13:58:12Z) - Lightweight Multi-View 3D Pose Estimation through Camera-Disentangled
Representation [57.11299763566534]
We present a solution to recover 3D pose from multi-view images captured with spatially calibrated cameras.
We exploit 3D geometry to fuse input images into a unified latent representation of pose, which is disentangled from camera view-points.
Our architecture then conditions the learned representation on camera projection operators to produce accurate per-view 2D detections.
arXiv Detail & Related papers (2020-04-05T12:52:29Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.