Outdoor Scene Extrapolation with Hierarchical Generative Cellular Automata
- URL: http://arxiv.org/abs/2406.08292v1
- Date: Wed, 12 Jun 2024 14:56:56 GMT
- Title: Outdoor Scene Extrapolation with Hierarchical Generative Cellular Automata
- Authors: Dongsu Zhang, Francis Williams, Zan Gojcic, Karsten Kreis, Sanja Fidler, Young Min Kim, Amlan Kar,
- Abstract summary: We aim to generate fine-grained 3D geometry from large-scale sparse LiDAR scans, abundantly captured by autonomous vehicles (AV)
We propose hierarchical Generative Cellular Automata (hGCA), a spatially scalable 3D generative model, which grows geometry with local kernels following, in a coarse-to-fine manner, equipped with a light-weight planner to induce global consistency.
- Score: 70.9375320609781
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We aim to generate fine-grained 3D geometry from large-scale sparse LiDAR scans, abundantly captured by autonomous vehicles (AV). Contrary to prior work on AV scene completion, we aim to extrapolate fine geometry from unlabeled and beyond spatial limits of LiDAR scans, taking a step towards generating realistic, high-resolution simulation-ready 3D street environments. We propose hierarchical Generative Cellular Automata (hGCA), a spatially scalable conditional 3D generative model, which grows geometry recursively with local kernels following, in a coarse-to-fine manner, equipped with a light-weight planner to induce global consistency. Experiments on synthetic scenes show that hGCA generates plausible scene geometry with higher fidelity and completeness compared to state-of-the-art baselines. Our model generalizes strongly from sim-to-real, qualitatively outperforming baselines on the Waymo-open dataset. We also show anecdotal evidence of the ability to create novel objects from real-world geometric cues even when trained on limited synthetic content. More results and details can be found on https://research.nvidia.com/labs/toronto-ai/hGCA/.
Related papers
- ArchComplete: Autoregressive 3D Architectural Design Generation with Hierarchical Diffusion-Based Upsampling [0.0]
ArchComplete is a two-stage voxel-based 3D generative pipeline consisting of a vector-quantised model.
Key to our pipeline is (i) learning a contextually rich codebook of local patch embeddings, optimised alongside a 2.5D perceptual loss.
ArchComplete autoregressively generates models at the resolution of $643$ and progressively refines them up to $5123$, with voxel sizes as small as $ approx 9textcm$.
arXiv Detail & Related papers (2024-12-23T20:13:27Z) - Gaussian Object Carver: Object-Compositional Gaussian Splatting with surfaces completion [16.379647695019308]
3D scene reconstruction is a foundational problem in computer vision.
We introduce the Gaussian Object Carver (GOC), a novel, efficient, and scalable framework for object-compositional 3D scene reconstruction.
GOC leverage 3D Gaussian Splatting (GS), enriched with monocular geometry priors and multi-view geometry regularization, to achieve high-quality and flexible reconstruction.
arXiv Detail & Related papers (2024-12-03T01:34:39Z) - DreamPolish: Domain Score Distillation With Progressive Geometry Generation [66.94803919328815]
We introduce DreamPolish, a text-to-3D generation model that excels in producing refined geometry and high-quality textures.
In the geometry construction phase, our approach leverages multiple neural representations to enhance the stability of the synthesis process.
In the texture generation phase, we introduce a novel score distillation objective, namely domain score distillation (DSD), to guide neural representations toward such a domain.
arXiv Detail & Related papers (2024-11-03T15:15:01Z) - GaussianDreamerPro: Text to Manipulable 3D Gaussians with Highly Enhanced Quality [99.63429416013713]
3D-GS has achieved great success in reconstructing and rendering real-world scenes.
To transfer the high rendering quality to generation tasks, a series of research works attempt to generate 3D-Gaussian assets from text.
We propose a novel framework named GaussianDreamerPro to enhance the generation quality.
arXiv Detail & Related papers (2024-06-26T16:12:09Z) - GeoLRM: Geometry-Aware Large Reconstruction Model for High-Quality 3D Gaussian Generation [65.33726478659304]
We introduce the Geometry-Aware Large Reconstruction Model (GeoLRM), an approach which can predict high-quality assets with 512k Gaussians and 21 input images in only 11 GB GPU memory.
Previous works neglect the inherent sparsity of 3D structure and do not utilize explicit geometric relationships between 3D and 2D images.
GeoLRM tackles these issues by incorporating a novel 3D-aware transformer structure that directly processes 3D points and uses deformable cross-attention mechanisms.
arXiv Detail & Related papers (2024-06-21T17:49:31Z) - GeoGen: Geometry-Aware Generative Modeling via Signed Distance Functions [22.077366472693395]
We introduce a new generative approach for synthesizing 3D geometry and images from single-view collections.
By employing volumetric rendering using neural radiance fields, they inherit a key limitation: the generated geometry is noisy and unconstrained.
We propose GeoGen, a new SDF-based 3D generative model trained in an end-to-end manner.
arXiv Detail & Related papers (2024-06-06T17:00:10Z) - Pushing Auto-regressive Models for 3D Shape Generation at Capacity and Scalability [118.26563926533517]
Auto-regressive models have achieved impressive results in 2D image generation by modeling joint distributions in grid space.
We extend auto-regressive models to 3D domains, and seek a stronger ability of 3D shape generation by improving auto-regressive models at capacity and scalability simultaneously.
arXiv Detail & Related papers (2024-02-19T15:33:09Z) - Self-supervised Learning for Enhancing Geometrical Modeling in 3D-Aware
Generative Adversarial Network [42.16520614686877]
3D-GANs exhibit artifacts in their 3D geometrical modeling, such as mesh imperfections and holes.
These shortcomings are primarily attributed to the limited availability of annotated 3D data.
We present a Self-Supervised Learning technique tailored as an auxiliary loss for any 3D-GAN.
arXiv Detail & Related papers (2023-12-19T04:55:33Z) - Learning Versatile 3D Shape Generation with Improved AR Models [91.87115744375052]
Auto-regressive (AR) models have achieved impressive results in 2D image generation by modeling joint distributions in the grid space.
We propose the Improved Auto-regressive Model (ImAM) for 3D shape generation, which applies discrete representation learning based on a latent vector instead of volumetric grids.
arXiv Detail & Related papers (2023-03-26T12:03:18Z) - GLASS: Geometric Latent Augmentation for Shape Spaces [28.533018136138825]
We use geometrically motivated energies to augment and thus boost a sparse collection of example (training) models.
We analyze the Hessian of the as-rigid-as-possible (ARAP) energy to sample from and project to the underlying (local) shape space.
We present multiple examples of interesting and meaningful shape variations even when starting from as few as 3-10 training shapes.
arXiv Detail & Related papers (2021-08-06T17:56:23Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.