UrbanScene3D: A Large Scale Urban Scene Dataset and Simulator
- URL: http://arxiv.org/abs/2107.04286v1
- Date: Fri, 9 Jul 2021 07:56:46 GMT
- Title: UrbanScene3D: A Large Scale Urban Scene Dataset and Simulator
- Authors: Yilin Liu and Fuyou Xue and Hui Huang
- Abstract summary: We present a large scale urban scene dataset associated with a handy simulator based on Unreal Engine 4 and AirSim.
Unlike previous works that are based purely on 2D information or man-made 3D CAD models, UrbanScene3D contains both compact man-made models and detailed real-world models reconstructed from aerial images.
- Score: 13.510431691480727
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The ability to perceive environments in different ways is essential to
robotic research, and it involves the analysis of both 2D and 3D data sources. We
present a large-scale urban scene dataset paired with a handy simulator based on
Unreal Engine 4 and AirSim, which consists of both man-made and real-world
reconstruction scenes at different scales, referred to as UrbanScene3D. Unlike
previous works that are based purely on 2D information or man-made 3D CAD models,
UrbanScene3D contains both compact man-made models and detailed real-world models
reconstructed from aerial images. Each building has been manually extracted from
the entire scene model and assigned a unique label, forming an instance
segmentation map. The provided 3D ground-truth textured models with instance
segmentation labels in UrbanScene3D allow users to derive whatever data they
need: instance segmentation maps, depth maps at arbitrary resolutions, 3D point
clouds/meshes of both visible and occluded regions, etc. In addition, with the
help of AirSim, users can also simulate robots (cars and drones) to test a
variety of autonomous tasks in the proposed city environments. Please refer to
our paper and website (https://vcc.tech/UrbanScene3D/) for further details and
applications.
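As a concrete sketch of the simulation workflow the abstract describes, the snippet below uses the public AirSim Python client to fly a drone through a running scene and request RGB, instance segmentation, and float depth images in a single call. This is a minimal sketch, assuming an UrbanScene3D environment is already open in Unreal Engine 4 with the AirSim plugin; the camera name "0" and the waypoint are placeholder values, and older AirSim releases spell the depth type DepthPlanner rather than DepthPlanar.

```python
import airsim  # pip install airsim; requires a running Unreal + AirSim instance
import numpy as np

# Connect to the simulator (assumes an UrbanScene3D environment is running).
client = airsim.MultirotorClient()
client.confirmConnection()
client.enableApiControl(True)
client.armDisarm(True)

# Fly to an arbitrary viewpoint; AirSim uses NED coordinates, so up is negative z.
client.takeoffAsync().join()
client.moveToPositionAsync(0, 0, -40, 5).join()  # placeholder waypoint, 5 m/s

# Request RGB, instance segmentation, and float depth from camera "0" in one call.
responses = client.simGetImages([
    airsim.ImageRequest("0", airsim.ImageType.Scene, False, True),
    airsim.ImageRequest("0", airsim.ImageType.Segmentation, False, True),
    airsim.ImageRequest("0", airsim.ImageType.DepthPlanar, True, False),
])

# Compressed requests return PNG bytes; the depth request returns raw floats.
airsim.write_file("scene.png", responses[0].image_data_uint8)
airsim.write_file("segmentation.png", responses[1].image_data_uint8)
depth = airsim.list_to_2d_float_array(
    responses[2].image_data_float, responses[2].width, responses[2].height)
np.save("depth.npy", depth)
```

The segmentation colors encode per-object IDs; if the per-building instance labels need a different ID scheme, AirSim's simSetSegmentationObjectID call can remap mesh names to IDs.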
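Likewise, because the dataset provides ground-truth geometry, a depth map rendered at any resolution can be back-projected into a 3D point cloud with standard pinhole-camera math. The helper below is a generic NumPy sketch rather than part of the dataset's tooling; the 90-degree field of view matches AirSim's default camera and is otherwise an assumption.

```python
import numpy as np

def depth_to_point_cloud(depth: np.ndarray, fov_deg: float = 90.0) -> np.ndarray:
    """Back-project a planar depth map (HxW, meters) into camera-space points (Nx3)."""
    h, w = depth.shape
    # Focal length in pixels, derived from the horizontal field of view.
    f = w / (2.0 * np.tan(np.radians(fov_deg) / 2.0))
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    x = (u - w / 2.0) * depth / f  # right
    y = (v - h / 2.0) * depth / f  # down
    return np.stack([x, y, depth], axis=-1).reshape(-1, 3)

# Example: points = depth_to_point_cloud(np.load("depth.npy"))
```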
Related papers
- SUGAR: Pre-training 3D Visual Representations for Robotics [85.55534363501131]
We introduce a novel 3D pre-training framework for robotics named SUGAR.
SUGAR captures semantic, geometric and affordance properties of objects through 3D point clouds.
We show that SUGAR's 3D representation outperforms state-of-the-art 2D and 3D representations.
arXiv Detail & Related papers (2024-04-01T21:23:03Z)
- Urban Scene Diffusion through Semantic Occupancy Map [49.20779809250597]
UrbanDiffusion is a 3D diffusion model conditioned on a Bird's-Eye View (BEV) map.
Our model learns the data distribution of scene-level structures within a latent space.
After training on real-world driving datasets, our model can generate a wide range of diverse urban scenes.
arXiv Detail & Related papers (2024-03-18T11:54:35Z)
- BerfScene: Bev-conditioned Equivariant Radiance Fields for Infinite 3D Scene Generation [96.58789785954409]
We propose a practical and efficient 3D representation that incorporates an equivariant radiance field with the guidance of a bird's-eye view map.
We produce large-scale, even infinite-scale, 3D scenes via synthesizing local scenes and then stitching them with smooth consistency.
arXiv Detail & Related papers (2023-12-04T18:56:10Z)
- Model2Scene: Learning 3D Scene Representation via Contrastive Language-CAD Models Pre-training [105.3421541518582]
Current successful methods for 3D scene perception rely on large-scale annotated point clouds.
We propose Model2Scene, a novel paradigm that learns free 3D scene representation from Computer-Aided Design (CAD) models and languages.
Model2Scene yields impressive label-free 3D salient object detection, with an average mAP of 46.08% and 55.49% on the ScanNet and S3DIS datasets, respectively.
arXiv Detail & Related papers (2023-09-29T03:51:26Z)
- Visual Localization using Imperfect 3D Models from the Internet [54.731309449883284]
This paper studies how imperfections in 3D models affect localization accuracy.
We show that 3D models from the Internet show promise as an easy-to-obtain scene representation.
arXiv Detail & Related papers (2023-04-12T16:15:05Z)
- 3inGAN: Learning a 3D Generative Model from Images of a Self-similar Scene [34.2144933185175]
3inGAN is an unconditional 3D generative model trained from 2D images of a single self-similar 3D scene.
We show results on semi-stochastic scenes of varying scale and complexity, obtained from real and synthetic sources.
arXiv Detail & Related papers (2022-11-27T18:03:21Z)
- Gait Recognition in the Wild with Dense 3D Representations and A Benchmark [86.68648536257588]
Existing studies for gait recognition are dominated by 2D representations like the silhouette or skeleton of the human body in constrained scenes.
This paper aims to explore dense 3D representations for gait recognition in the wild.
We build the first large-scale 3D representation-based gait recognition dataset, named Gait3D.
arXiv Detail & Related papers (2022-04-06T03:54:06Z)
- Tracking Emerges by Looking Around Static Scenes, with Neural 3D Mapping [23.456046776979903]
We propose to leverage multiview data of static points in arbitrary scenes (static or dynamic) to learn a neural 3D mapping module.
The neural 3D mapper consumes RGB-D data as input, and produces a 3D voxel grid of deep features as output.
We show that our unsupervised 3D object trackers outperform prior unsupervised 2D and 2.5D trackers, and approach the accuracy of supervised trackers.
arXiv Detail & Related papers (2020-08-04T02:59:23Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.