Cumulative Assessment for Urban 3D Modeling
- URL: http://arxiv.org/abs/2107.04622v1
- Date: Fri, 9 Jul 2021 18:29:50 GMT
- Title: Cumulative Assessment for Urban 3D Modeling
- Authors: Shea Hagstrom, Hee Won Pak, Stephanie Ku, Sean Wang, Gregory Hager,
Myron Brown
- Abstract summary: Urban 3D modeling from satellite images requires accurate semantic segmentation to delineate urban features, multiple view stereo for 3D reconstruction of surface heights, and 3D model fitting to produce compact models with accurate surface slopes.
We present a cumulative assessment metric that succinctly captures error contributions from each of these components.
- Score: 0.8155575318208631
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Urban 3D modeling from satellite images requires accurate semantic
segmentation to delineate urban features, multiple view stereo for 3D
reconstruction of surface heights, and 3D model fitting to produce compact
models with accurate surface slopes. In this work, we present a cumulative
assessment metric that succinctly captures error contributions from each of
these components. We demonstrate our approach by providing challenging public
datasets and extending two open source projects to provide an end-to-end 3D
modeling baseline solution to stimulate further research and evaluation with a
public leaderboard.
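The abstract does not spell out how the cumulative metric is computed, but its structure follows from the pipeline it describes: a pixel only earns credit for an accurate height if it was segmented correctly, and only earns credit for an accurate slope if its height was also accurate. The sketch below is a minimal illustration of that nested-threshold idea, not the paper's exact definition; the function name cumulative_iou and the 1 m height and 5-degree slope thresholds are assumptions for illustration.

```python
import numpy as np

def cumulative_iou(pred_mask, true_mask, pred_z, true_z,
                   pred_slope, true_slope,
                   z_thresh=1.0, slope_thresh=5.0):
    """Nested-threshold IOU sketch: each stage keeps only the true
    positives that also pass the next, stricter geometric test."""
    # Stage 1 (segmentation): pixel labeled as the feature in both maps.
    tp1 = pred_mask & true_mask
    # Stage 2 (stereo): additionally, surface height within z_thresh meters.
    tp2 = tp1 & (np.abs(pred_z - true_z) <= z_thresh)
    # Stage 3 (model fitting): additionally, slope within slope_thresh degrees.
    tp3 = tp2 & (np.abs(pred_slope - true_slope) <= slope_thresh)
    union = max(int((pred_mask | true_mask).sum()), 1)  # guard against empty union
    return tp1.sum() / union, tp2.sum() / union, tp3.sum() / union

# Toy example with synthetic rasters (for illustration only).
rng = np.random.default_rng(0)
shape = (64, 64)
true_mask = rng.random(shape) > 0.5
pred_mask = true_mask ^ (rng.random(shape) > 0.9)      # ~10% label noise
true_z = rng.random(shape) * 20.0
pred_z = true_z + rng.normal(0.0, 0.8, shape)          # height noise in meters
true_slope = rng.random(shape) * 45.0
pred_slope = true_slope + rng.normal(0.0, 3.0, shape)  # slope noise in degrees
print(cumulative_iou(pred_mask, true_mask, pred_z, true_z, pred_slope, true_slope))
```

Each successive ratio can only decrease, so the gaps between them attribute error to the segmentation, stereo, and model-fitting stages respectively.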
Related papers
- MoGe: Unlocking Accurate Monocular Geometry Estimation for Open-Domain Images with Optimal Training Supervision [23.12838070960566]
We present MoGe, a powerful model for recovering 3D geometry from monocular open-domain images.
Given a single image, our model directly predicts a 3D point map of the captured scene with an affine-invariant representation.
We propose a set of novel global and local geometry supervisions that empower the model to learn high-quality geometry.
arXiv Detail & Related papers (2024-10-24T19:29:02Z)
- Neural Localizer Fields for Continuous 3D Human Pose and Shape Estimation [32.30055363306321]
We propose a paradigm for seamlessly unifying different human pose and shape-related tasks and datasets.
Our formulation is centered on the ability - both at training and test time - to query any arbitrary point of the human volume.
We can naturally exploit differently annotated data sources including mesh, 2D/3D skeleton and dense pose, without having to convert between them.
arXiv Detail & Related papers (2024-07-10T10:44:18Z)
- GEOcc: Geometrically Enhanced 3D Occupancy Network with Implicit-Explicit Depth Fusion and Contextual Self-Supervision [49.839374549646884]
This paper presents GEOcc, a Geometric-Enhanced Occupancy network tailored for vision-only surround-view perception.
Our approach achieves state-of-the-art performance on the Occ3D-nuScenes dataset while requiring the lowest input image resolution and the most lightweight image backbone.
arXiv Detail & Related papers (2024-05-17T07:31:20Z)
- ComboVerse: Compositional 3D Assets Creation Using Spatially-Aware Diffusion Guidance [76.7746870349809]
We present ComboVerse, a 3D generation framework that produces high-quality 3D assets with complex compositions by learning to combine multiple models.
Our proposed framework emphasizes spatial alignment of objects, compared with standard score distillation sampling.
arXiv Detail & Related papers (2024-03-19T03:39:43Z)
- 3D Face Reconstruction Using A Spectral-Based Graph Convolution Encoder [3.749406324648861]
We propose an innovative approach that integrates existing 2D features with 3D features to guide the model learning process.
Our model is trained using 2D-3D data pairs from a combination of datasets and achieves state-of-the-art performance on the NoW benchmark.
arXiv Detail & Related papers (2024-03-08T11:09:46Z)
- Structured 3D Features for Reconstructing Controllable Avatars [43.36074729431982]
We introduce Structured 3D Features, a model based on a novel implicit 3D representation that pools pixel-aligned image features onto dense 3D points sampled from a parametric, statistical human mesh surface.
We show that our S3F model surpasses the previous state-of-the-art on various tasks, including monocular 3D reconstruction, as well as albedo and shading estimation.
arXiv Detail & Related papers (2022-12-13T18:57:33Z)
- Deep Generative Models on 3D Representations: A Survey [81.73385191402419]
Generative models aim to learn the distribution of observed data by generating new instances.
Recently, researchers have started to shift their focus from 2D to 3D space, where representing 3D data poses significantly greater challenges.
arXiv Detail & Related papers (2022-10-27T17:59:50Z)
- UltraPose: Synthesizing Dense Pose with 1 Billion Points by Human-body Decoupling 3D Model [58.70130563417079]
We introduce a new 3D human-body model with a series of decoupled parameters that can freely control the generation of the body.
Compared to the existing manually annotated DensePose-COCO dataset, the synthetic UltraPose has ultra dense image-to-surface correspondences without annotation cost and error.
arXiv Detail & Related papers (2021-10-28T16:24:55Z)
- Cylindrical and Asymmetrical 3D Convolution Networks for LiDAR-based Perception [122.53774221136193]
State-of-the-art methods for driving-scene LiDAR-based perception often project the point clouds to 2D space and then process them via 2D convolution.
A natural remedy is to use 3D voxelization and 3D convolution networks.
We propose a new framework for outdoor LiDAR segmentation, in which cylindrical partition and asymmetrical 3D convolution networks are designed to explore the 3D geometric pattern.
arXiv Detail & Related papers (2021-09-12T06:25:11Z)
- Graph Stacked Hourglass Networks for 3D Human Pose Estimation [1.0660480034605242]
We propose a novel graph convolutional network architecture, Graph Stacked Hourglass Networks, for 2D-to-3D human pose estimation tasks.
The proposed architecture consists of repeated encoder-decoder stages, in which graph-structured features are processed across three different scales of human skeletal representations.
arXiv Detail & Related papers (2021-03-30T14:25:43Z)
- Monocular Real-time Hand Shape and Motion Capture using Multi-modal Data [77.34069717612493]
We present a novel method for monocular hand shape and pose estimation at an unprecedented runtime of 100 fps.
This is enabled by a new learning-based architecture designed to make use of all available sources of hand training data.
It features a 3D hand joint detection module and an inverse kinematics module which not only regresses 3D joint positions but also maps them to joint rotations in a single feed-forward pass.
arXiv Detail & Related papers (2020-03-21T03:51:54Z)