Persistent Nature: A Generative Model of Unbounded 3D Worlds
- URL: http://arxiv.org/abs/2303.13515v1
- Date: Thu, 23 Mar 2023 17:59:40 GMT
- Title: Persistent Nature: A Generative Model of Unbounded 3D Worlds
- Authors: Lucy Chai, Richard Tucker, Zhengqi Li, Phillip Isola, Noah Snavely
- Abstract summary: We present an extendable, planar scene layout grid that can be rendered from arbitrary camera poses via a 3D decoder and volume rendering.
Based on this representation, we learn a generative world model solely from single-view internet photos.
Our approach enables scene extrapolation beyond the fixed bounds of current 3D generative models, while also supporting a persistent, camera-independent world representation.
- Score: 74.51149070418002
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Despite increasingly realistic image quality, recent 3D image generative
models often operate on 3D volumes of fixed extent with limited camera motions.
We investigate the task of unconditionally synthesizing unbounded nature
scenes, enabling arbitrarily large camera motion while maintaining a persistent
3D world model. Our scene representation consists of an extendable, planar
scene layout grid, which can be rendered from arbitrary camera poses via a 3D
decoder and volume rendering, and a panoramic skydome. Based on this
representation, we learn a generative world model solely from single-view
internet photos. Our method enables simulating long flights through 3D
landscapes, while maintaining global scene consistency--for instance, returning
to the starting point yields the same view of the scene. Our approach enables
scene extrapolation beyond the fixed bounds of current 3D generative models,
while also supporting a persistent, camera-independent world representation
that stands in contrast to auto-regressive 3D prediction models. Our project
page: https://chail.github.io/persistent-nature/.
Related papers
- Generating 3D-Consistent Videos from Unposed Internet Photos [68.944029293283]
We train a scalable, 3D-aware video model without any 3D annotations such as camera parameters.
Our results suggest that we can scale up scene-level 3D learning using only 2D data such as videos and multiview internet photos.
arXiv Detail & Related papers (2024-11-20T18:58:31Z) - CAT3D: Create Anything in 3D with Multi-View Diffusion Models [87.80820708758317]
We present CAT3D, a method for creating anything in 3D by simulating this real-world capture process with a multi-view diffusion model.
CAT3D can create entire 3D scenes in as little as one minute, and outperforms existing methods for single image and few-view 3D scene creation.
arXiv Detail & Related papers (2024-05-16T17:59:05Z) - 3D-SceneDreamer: Text-Driven 3D-Consistent Scene Generation [51.64796781728106]
We propose a generative refinement network to synthesize new contents with higher quality by exploiting the natural image prior to 2D diffusion model and the global 3D information of the current scene.
Our approach supports wide variety of scene generation and arbitrary camera trajectories with improved visual quality and 3D consistency.
arXiv Detail & Related papers (2024-03-14T14:31:22Z) - Denoising Diffusion via Image-Based Rendering [54.20828696348574]
We introduce the first diffusion model able to perform fast, detailed reconstruction and generation of real-world 3D scenes.
First, we introduce a new neural scene representation, IB-planes, that can efficiently and accurately represent large 3D scenes.
Second, we propose a denoising-diffusion framework to learn a prior over this novel 3D scene representation, using only 2D images.
arXiv Detail & Related papers (2024-02-05T19:00:45Z) - 3inGAN: Learning a 3D Generative Model from Images of a Self-similar
Scene [34.2144933185175]
3inGAN is an unconditional 3D generative model trained from 2D images of a single self-similar 3D scene.
We show results on semi-stochastic scenes of varying scale and complexity, obtained from real and synthetic sources.
arXiv Detail & Related papers (2022-11-27T18:03:21Z) - CAMPARI: Camera-Aware Decomposed Generative Neural Radiance Fields [67.76151996543588]
We learn a 3D- and camera-aware generative model which faithfully recovers not only the image but also the camera data distribution.
At test time, our model generates images with explicit control over the camera as well as the shape and appearance of the scene.
arXiv Detail & Related papers (2021-03-31T17:59:24Z) - Unsupervised object-centric video generation and decomposition in 3D [36.08064849807464]
We propose to model a video as the view seen while moving through a scene with multiple 3D objects and a 3D background.
Our model is trained from monocular videos without any supervision, yet learns to generate coherent 3D scenes containing several moving objects.
arXiv Detail & Related papers (2020-07-07T18:01:29Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.