Related papers: Gaussian Building Mesh (GBM): Extract a Building's 3D Mesh with Google Earth and Gaussian Splatting

URL: http://arxiv.org/abs/2501.00625v2
Date: Tue, 07 Jan 2025 16:49:29 GMT
Title: Gaussian Building Mesh (GBM): Extract a Building's 3D Mesh with Google Earth and Gaussian Splatting
Authors: Kyle Gao, Liangzhi Li, Hongjie He, Dening Lu, Linlin Xu, Jonathan Li
Abstract summary: Recently released open-source pre-trained foundational image segmentation and object detection models (SAM2+GroundingDINO) allow for geometrically consistent segmentation of objects of interest in multi-view 2D images. We created a pipeline to extract the 3D mesh of any building based on its name, address, or geographic coordinates.
Score: 19.410739991928704
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Recently released open-source pre-trained foundational image segmentation and object detection models (SAM2+GroundingDINO) allow for geometrically consistent segmentation of objects of interest in multi-view 2D images. Users can use text-based or click-based prompts to segment objects of interest without requiring labeled training datasets. Gaussian Splatting allows for the learning of the 3D representation of a scene's geometry and radiance based on 2D images. Combining Google Earth Studio, SAM2+GroundingDINO, 2D Gaussian Splatting, and our improvements in mask refinement based on morphological operations and contour simplification, we created a pipeline to extract the 3D mesh of any building based on its name, address, or geographic coordinates.
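
The abstract names mask refinement through morphological operations and contour simplification but does not say which operations or parameters are used. The following is a minimal sketch of one common combination, assuming a 3x3 morphological closing (dilation followed by erosion) to fill small holes in a binary mask, and Douglas-Peucker simplification for a contour polyline. All function names here are hypothetical, and the pure-Python implementation is for illustration only; it is not the authors' code.

```python
from math import hypot

OFFSETS = [(dy, dx) for dy in (-1, 0, 1) for dx in (-1, 0, 1)]

def _neighbours(mask, y, x):
    """Yield the in-bounds values of the 3x3 neighbourhood around (y, x)."""
    h, w = len(mask), len(mask[0])
    for dy, dx in OFFSETS:
        if 0 <= y + dy < h and 0 <= x + dx < w:
            yield mask[y + dy][x + dx]

def dilate(mask):
    """3x3 binary dilation: a pixel is set if any neighbour is set."""
    return [[int(any(_neighbours(mask, y, x))) for x in range(len(mask[0]))]
            for y in range(len(mask))]

def erode(mask):
    """3x3 binary erosion: a pixel stays set only if all in-bounds neighbours are set."""
    return [[int(all(_neighbours(mask, y, x))) for x in range(len(mask[0]))]
            for y in range(len(mask))]

def close_mask(mask):
    """Morphological closing (assumed refinement step): fills holes smaller than the kernel."""
    return erode(dilate(mask))

def point_segment_distance(p, a, b):
    """Euclidean distance from point p to the segment a-b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        return hypot(px - ax, py - ay)
    t = ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))
    return hypot(px - (ax + t * dx), py - (ay + t * dy))

def simplify_contour(points, eps):
    """Douglas-Peucker simplification of an open polyline with tolerance eps."""
    if len(points) < 3:
        return list(points)
    a, b = points[0], points[-1]
    # Find the interior point farthest from the chord a-b.
    idx, dmax = max(
        ((i, point_segment_distance(points[i], a, b))
         for i in range(1, len(points) - 1)),
        key=lambda item: item[1],
    )
    if dmax <= eps:
        return [a, b]
    left = simplify_contour(points[:idx + 1], eps)
    right = simplify_contour(points[idx:], eps)
    return left[:-1] + right  # drop the duplicated split point
```

For example, applying close_mask to a 3x3 ring of foreground pixels centred in a 7x7 grid fills the single-pixel hole in the middle, and simplify_contour with a tolerance of 0.5 drops near-collinear points along a straight wall edge while keeping the corner. In practice an image library such as OpenCV would typically be used for both steps.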

Related papers

PanoSLAM: Panoptic 3D Scene Reconstruction via Gaussian SLAM [105.01907579424362]
PanoSLAM is the first SLAM system to integrate geometric reconstruction, 3D semantic segmentation, and 3D instance segmentation within a unified framework. For the first time, it achieves panoptic 3D reconstruction of open-world environments directly from the RGB-D video.
arXiv Detail & Related papers (2024-12-31T08:58:10Z)
SAGD: Boundary-Enhanced Segment Anything in 3D Gaussian via Gaussian Decomposition [66.80822249039235]
3D Gaussian Splatting has emerged as an alternative 3D representation for novel view synthesis. We propose SAGD, a conceptually simple yet effective boundary-enhanced segmentation pipeline for 3D-GS. Our approach achieves high-quality 3D segmentation without rough boundary issues, which can be easily applied to other scene editing tasks.
arXiv Detail & Related papers (2024-01-31T14:19:03Z)
Gaussian Grouping: Segment and Edit Anything in 3D Scenes [65.49196142146292]
We propose Gaussian Grouping, which extends Gaussian Splatting to jointly reconstruct and segment anything in open-world 3D scenes. Compared to the implicit NeRF representation, we show that the grouped 3D Gaussians can reconstruct, segment and edit anything in 3D with high visual quality, fine granularity and efficiency.
arXiv Detail & Related papers (2023-12-01T17:09:31Z)
SSR-2D: Semantic 3D Scene Reconstruction from 2D Images [54.46126685716471]
In this work, we explore a central 3D scene modeling task, namely, semantic scene reconstruction without using any 3D annotations. The key idea of our approach is to design a trainable model that employs both incomplete 3D reconstructions and their corresponding source RGB-D images. Our method achieves state-of-the-art performance in semantic scene completion on two large-scale benchmark datasets, MatterPort3D and ScanNet.
arXiv Detail & Related papers (2023-02-07T17:47:52Z)
Pixel-Aligned Non-parametric Hand Mesh Reconstruction [16.62199923065314]
Non-parametric mesh reconstruction has recently shown significant progress in 3D hand and body applications. In this paper, we seek to establish and exploit a direct mapping between 2D image pixels and 3D mesh vertices with a simple and compact architecture. We propose an end-to-end pipeline for hand mesh recovery tasks which consists of three phases.
arXiv Detail & Related papers (2022-10-17T15:53:18Z)
SketchSampler: Sketch-based 3D Reconstruction via View-dependent Depth Sampling [75.957103837167]
Reconstructing a 3D shape based on a single sketch image is challenging due to the large domain gap between a sparse, irregular sketch and a regular, dense 3D shape. Existing works try to employ the global feature extracted from sketch to directly predict the 3D coordinates, but they usually suffer from losing fine details that are not faithful to the input sketch.
arXiv Detail & Related papers (2022-08-14T16:37:51Z)
TerrainMesh: Metric-Semantic Terrain Reconstruction from Aerial Images Using Joint 2D-3D Learning [20.81202315793742]
This paper develops a joint 2D-3D learning approach to reconstruct a local metric-semantic mesh at each camera maintained by a visual odometry algorithm. The mesh can be assembled into a global environment model to capture the terrain topology and semantics during online operation.
arXiv Detail & Related papers (2022-04-23T05:18:39Z)
GaussiGAN: Controllable Image Synthesis with 3D Gaussians from Unposed Silhouettes [48.642181362172906]
We present an algorithm that learns a coarse 3D representation of objects from unposed multi-view 2D mask supervision. In contrast to existing voxel-based methods for unposed object reconstruction, our approach learns to represent the generated shape and pose. We show results on synthetic datasets with realistic lighting, and demonstrate object insertion with interactive posing.
arXiv Detail & Related papers (2021-06-24T17:47:58Z)
Machine-learned 3D Building Vectorization from Satellite Imagery [7.887221474814986]
We propose a machine-learning-based approach for automatic 3D building reconstruction and vectorization. Taking a single-channel photogrammetric digital surface model (DSM) and panchromatic (PAN) image as input, we first filter out non-building objects and refine the building shapes. The refined DSM and the input PAN image are then fed to a semantic segmentation network to detect edges and corners of building roofs.
arXiv Detail & Related papers (2021-04-13T19:57:30Z)
Learning Geometry-Disentangled Representation for Complementary Understanding of 3D Object Point Cloud [50.56461318879761]
We propose the Geometry-Disentangled Attention Network (GDANet) for 3D point cloud understanding. GDANet disentangles point clouds into the contour and flat parts of 3D objects, respectively denoted by sharp and gentle variation components. Experiments on 3D object classification and segmentation benchmarks demonstrate that GDANet achieves state-of-the-art results with fewer parameters.
arXiv Detail & Related papers (2020-12-20T13:35:00Z)
Improved Modeling of 3D Shapes with Multi-view Depth Maps [48.8309897766904]
We present a general-purpose framework for modeling 3D shapes using CNNs. Using just a single depth image of the object, we can output a dense multi-view depth map representation of 3D objects.
arXiv Detail & Related papers (2020-09-07T17:58:27Z)

This list is automatically generated from the titles and abstracts of the papers on this site.