InfiniCity: Infinite-Scale City Synthesis
- URL: http://arxiv.org/abs/2301.09637v2
- Date: Tue, 15 Aug 2023 01:05:21 GMT
- Title: InfiniCity: Infinite-Scale City Synthesis
- Authors: Chieh Hubert Lin, Hsin-Ying Lee, Willi Menapace, Menglei Chai,
Aliaksandr Siarohin, Ming-Hsuan Yang and Sergey Tulyakov
- Abstract summary: We propose a novel framework, InfiniCity, which constructs and renders an unconstrainedly large and 3D-grounded environment from random noise.
An infinite-pixel image synthesis module generates arbitrary-scale 2D maps from the bird's-eye view.
An octree-based voxel completion module lifts the generated 2D map to 3D octrees.
A voxel-based neural rendering module texturizes the voxels and renders 2D images.
- Score: 101.87428043837242
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Toward infinite-scale 3D city synthesis, we propose a novel framework,
InfiniCity, which constructs and renders an unconstrainedly large and
3D-grounded environment from random noise. InfiniCity decomposes the seemingly
impractical task into three feasible modules, taking advantage of both 2D and
3D data. First, an infinite-pixel image synthesis module generates
arbitrary-scale 2D maps from the bird's-eye view. Next, an octree-based voxel
completion module lifts the generated 2D map to 3D octrees. Finally, a
voxel-based neural rendering module texturizes the voxels and renders 2D
images. InfiniCity can thus synthesize arbitrary-scale, traversable 3D city
environments and allow flexible, interactive editing by users. We
quantitatively and qualitatively demonstrate the efficacy of the proposed
framework. Project page: https://hubert0527.github.io/infinicity/
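
To make the three-module decomposition concrete, here is a minimal, hypothetical Python sketch of the pipeline. Every name (synthesize_bev_map, complete_voxels, render_top_down) and every rule in it is an illustrative assumption; in the paper each stage is a learned generative model, not a hand-written heuristic.

```python
# Hypothetical sketch of a three-stage BEV-map -> voxels -> image pipeline.
# Each function stands in for a learned model from the paper.
import numpy as np


def synthesize_bev_map(rng: np.random.Generator, height: int, width: int) -> np.ndarray:
    """Stage 1 stand-in: the infinite-pixel module grows an arbitrary-scale
    semantic bird's-eye-view map from noise; here we merely sample labels."""
    num_classes = 8  # e.g. empty, road, building, vegetation, ... (assumed)
    return rng.integers(0, num_classes, size=(height, width))


def complete_voxels(bev_map: np.ndarray, max_z: int = 16) -> np.ndarray:
    """Stage 2 stand-in: the octree-based module lifts the 2D map to a 3D
    volume; here a toy rule extrudes each cell to a class-dependent height."""
    h, w = bev_map.shape
    voxels = np.zeros((h, w, max_z), dtype=np.int64)  # 0 marks empty space
    column_heights = 1 + 2 * bev_map  # toy rule: taller columns for larger ids
    for z in range(max_z):
        voxels[:, :, z] = np.where(z < column_heights, bev_map + 1, 0)
    return voxels


def render_top_down(voxels: np.ndarray) -> np.ndarray:
    """Stage 3 stand-in: the neural renderer texturizes voxels into 2D images;
    here we just count occupied voxels per column as a crude depth image."""
    return (voxels > 0).sum(axis=2)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    bev = synthesize_bev_map(rng, 64, 64)   # arbitrary-scale in principle
    vox = complete_voxels(bev)              # 2D map lifted to 3D voxels
    img = render_top_down(vox)              # voxels rendered back to 2D
    print(bev.shape, vox.shape, img.shape)  # (64, 64) (64, 64, 16) (64, 64)
```

One design observation that does follow from the abstract: layout, geometry, and appearance are committed in separate stages, so a user can in principle intervene at any of the three levels, which is consistent with the flexible, interactive editing the authors describe.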
Related papers
- OSN: Infinite Representations of Dynamic 3D Scenes from Monocular Videos [7.616167860385134]
It has long been challenging to recover the underlying dynamic 3D scene representations from a monocular RGB video.
We introduce a new framework, called OSN, to learn all plausible 3D scene configurations that match the input video.
Our method demonstrates a clear advantage in learning fine-grained 3D scene geometry.
arXiv Detail & Related papers (2024-07-08T05:03:46Z)
- GaussianCity: Generative Gaussian Splatting for Unbounded 3D City Generation [44.203932215464214]
3D Gaussian Splatting (3D-GS) has emerged as a highly efficient alternative for object-level 3D generation.
However, adapting 3D-GS from finite-scale 3D objects and humans to infinite-scale 3D cities is non-trivial.
We propose a generative Gaussian Splatting framework dedicated to efficiently synthesizing 3D cities with a single feed-forward pass.
arXiv Detail & Related papers (2024-06-10T17:59:55Z)
- BerfScene: Bev-conditioned Equivariant Radiance Fields for Infinite 3D Scene Generation [96.58789785954409]
We propose a practical and efficient 3D representation that incorporates an equivariant radiance field with the guidance of a bird's-eye view map.
We produce large-scale, even infinite-scale, 3D scenes by synthesizing local scenes and then stitching them together with smooth consistency.
arXiv Detail & Related papers (2023-12-04T18:56:10Z)
- MatrixCity: A Large-scale City Dataset for City-scale Neural Rendering and Beyond [69.37319723095746]
We build a large-scale, comprehensive, and high-quality synthetic dataset for city-scale neural rendering research.
We develop a pipeline to easily collect aerial and street city views, accompanied by ground-truth camera poses and a range of additional data modalities.
The resulting pilot dataset, MatrixCity, contains 67k aerial images and 452k street images from two city maps of total size 28 km².
arXiv Detail & Related papers (2023-09-28T16:06:02Z)
- Infinite Photorealistic Worlds using Procedural Generation [135.10236145573043]
Infinigen is a procedural generator of photorealistic 3D scenes of the natural world.
Every asset, from shape to texture, is generated from scratch via randomized mathematical rules.
arXiv Detail & Related papers (2023-06-15T17:46:16Z)
- VoxFormer: Sparse Voxel Transformer for Camera-based 3D Semantic Scene Completion [129.5975573092919]
VoxFormer is a Transformer-based semantic scene completion framework.
It can output complete 3D semantics from only 2D images.
Our framework outperforms the state of the art with a relative improvement of 20.0% in geometry and 18.1% in semantics.
arXiv Detail & Related papers (2023-02-23T18:59:36Z)
- 3D-aware Image Synthesis via Learning Structural and Textural Representations [39.681030539374994]
We propose VolumeGAN for high-fidelity 3D-aware image synthesis, which explicitly learns a structural representation and a textural representation.
Our approach achieves substantially higher image quality and better 3D control than previous methods.
arXiv Detail & Related papers (2021-12-20T18:59:40Z)
- Novel-View Human Action Synthesis [39.72702883597454]
We present a novel 3D reasoning scheme to synthesize the target viewpoint.
We first estimate the 3D mesh of the target body and transfer the rough textures from the 2D images to the mesh.
We produce a semi-dense textured mesh by propagating the transferred textures both locally, within geodesic neighborhoods, and globally.
arXiv Detail & Related papers (2020-07-06T15:11:51Z)
- 3D Human Mesh Regression with Dense Correspondence [95.92326689172877]
Estimating the 3D mesh of the human body from a single 2D image is an important task with many applications, such as augmented reality and human-robot interaction.
Prior works reconstructed the 3D mesh from a global image feature extracted by a convolutional neural network (CNN), in which the dense correspondences between the mesh surface and the image pixels are missing.
This paper proposes a model-free 3D human mesh estimation framework, named DecoMR, which explicitly establishes the dense correspondence between the mesh and the local image features in the UV space.
arXiv Detail & Related papers (2020-06-10T08:50:53Z)
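
A structural idea shared by several entries above, notably InfiniCity's octree-based voxel completion and VoxFormer's sparse voxels, is that city-scale volumes are mostly uniform (empty air, solid interiors) and therefore compress well. Below is a minimal sketch of an octree over a binary occupancy grid; OctreeNode and build_octree are illustrative assumptions, not code from any of these papers, and a cubic power-of-two grid is assumed.

```python
# Minimal octree over a cubic, power-of-two binary occupancy volume.
import numpy as np
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class OctreeNode:
    origin: tuple                                   # (x, y, z) cube corner
    size: int                                       # cube edge length
    occupied: bool                                  # any voxel inside occupied?
    children: Optional[List["OctreeNode"]] = None   # None for leaf nodes


def build_octree(vol: np.ndarray, origin=(0, 0, 0), size=None) -> OctreeNode:
    """Recursively subdivide: uniform cubes become leaves, mixed cubes
    split into eight half-size octants."""
    if size is None:
        size = vol.shape[0]
    x, y, z = origin
    block = vol[x:x + size, y:y + size, z:z + size]
    if size == 1 or block.all() or not block.any():
        return OctreeNode(origin, size, bool(block.any()))  # uniform: leaf
    half = size // 2
    children = [
        build_octree(vol, (x + dx, y + dy, z + dz), half)
        for dx in (0, half) for dy in (0, half) for dz in (0, half)
    ]
    return OctreeNode(origin, size, True, children)


if __name__ == "__main__":
    vol = np.zeros((8, 8, 8), dtype=bool)  # toy 8x8x8 occupancy grid
    vol[:4, :4, :2] = True                 # one solid slab in a corner
    root = build_octree(vol)
    print(root.size, len(root.children))   # 8 8: the root splits into octants
```

Because uniform regions collapse into single leaves, memory scales with the scene's surface complexity rather than with the full volume, which is what makes voxel representations viable at city scale.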
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences arising from its use.