NeoWorld: Neural Simulation of Explorable Virtual Worlds via Progressive 3D Unfolding
- URL: http://arxiv.org/abs/2509.24441v1
- Date: Mon, 29 Sep 2025 08:24:28 GMT
- Title: NeoWorld: Neural Simulation of Explorable Virtual Worlds via Progressive 3D Unfolding
- Authors: Yanpeng Zhao, Shanyan Guan, Yunbo Wang, Yanhao Ge, Wei Li, Xiaokang Yang
- Abstract summary: We introduce NeoWorld, a framework for generating interactive 3D virtual worlds from a single input image. Inspired by the on-demand worldbuilding concept in the science fiction novel Simulacron-3 (1964), our system constructs expansive environments.
- Score: 46.79724166827757
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We introduce NeoWorld, a deep learning framework for generating interactive 3D virtual worlds from a single input image. Inspired by the on-demand worldbuilding concept in the science fiction novel Simulacron-3 (1964), our system constructs expansive environments where only the regions actively explored by the user are rendered with high visual realism through object-centric 3D representations. Unlike previous approaches that rely on global world generation or 2D hallucination, NeoWorld models key foreground objects in full 3D, while synthesizing backgrounds and non-interacted regions in 2D to ensure efficiency. This hybrid scene structure, implemented with cutting-edge representation learning and object-to-3D techniques, enables flexible viewpoint manipulation and physically plausible scene animation, allowing users to control object appearance and dynamics using natural language commands. As users interact with the environment, the virtual world progressively unfolds with increasing 3D detail, delivering a dynamic, immersive, and visually coherent exploration experience. NeoWorld significantly outperforms existing 2D and depth-layered 2.5D methods on the WorldScore benchmark.
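The abstract's core mechanism, rendering most of the scene as cheap 2D proxies and lifting an object to a full 3D representation only when the user interacts with it, can be illustrated with a minimal sketch. All names here (`SceneObject`, `World`, `promote_to_3d`) are hypothetical placeholders, not the paper's API; the real system performs learned object-to-3D reconstruction rather than a flag flip.

```python
from dataclasses import dataclass, field

@dataclass
class SceneObject:
    """An object that starts as a cheap 2D proxy and is promoted to 3D on demand."""
    name: str
    representation: str = "2d"  # 2D billboard until the user interacts with it

    def promote_to_3d(self):
        # Stand-in for an object-to-3D lifting step (e.g. image-to-mesh reconstruction).
        self.representation = "3d"

@dataclass
class World:
    """Progressively unfolding scene: only explored objects carry full 3D detail."""
    objects: dict = field(default_factory=dict)

    def add(self, name: str):
        self.objects[name] = SceneObject(name)

    def interact(self, name: str) -> SceneObject:
        obj = self.objects[name]
        if obj.representation == "2d":
            obj.promote_to_3d()  # the world "unfolds" only where the user explores
        return obj

world = World()
world.add("statue")
world.add("tree")
world.interact("statue")
print({o.name: o.representation for o in world.objects.values()})
# {'statue': '3d', 'tree': '2d'}
```

The point of the design is the asymmetry: interaction cost is paid per object, so unexplored regions stay at 2D cost indefinitely.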
Related papers
- Beyond Pixel Histories: World Models with Persistent 3D State [50.4601060508243]
PERSIST is a new world-model paradigm that simulates the evolution of a latent 3D scene. We show substantial improvements in spatial memory, 3D consistency, and long-horizon stability over existing methods.
arXiv Detail & Related papers (2026-03-03T19:58:31Z) - SceMoS: Scene-Aware 3D Human Motion Synthesis by Planning with Geometry-Grounded Tokens [89.05195827071582]
SceMoS is a scene-aware motion synthesis framework. It disentangles global planning from local execution using lightweight 2D cues. SceMoS achieves state-of-the-art motion realism and contact accuracy on the TRUMANS benchmark.
arXiv Detail & Related papers (2026-02-24T02:09:12Z) - WorldGen: From Text to Traversable and Interactive 3D Worlds [87.95088818329403]
We introduce WorldGen, a system that enables the automatic creation of large-scale, interactive 3D worlds directly from text prompts. Our approach transforms natural language descriptions into fully textured environments that can be immediately explored or edited within standard game engines. This work represents a step towards accessible, generative world-building at scale, advancing the frontier of 3D generative AI for applications in gaming, simulation, and immersive social environments.
arXiv Detail & Related papers (2025-11-20T22:13:18Z) - WorldGrow: Generating Infinite 3D World [75.81531067447203]
We tackle the challenge of generating the infinitely extendable 3D world -- large, continuous environments with coherent geometry and realistic appearance. We propose WorldGrow, a hierarchical framework for unbounded 3D scene synthesis. Our method features three core components: (1) a data curation pipeline that extracts high-quality scene blocks for training, making the 3D structured latent representations suitable for scene generation; (2) a 3D block inpainting mechanism that enables context-aware scene extension; and (3) a coarse-to-fine generation strategy that ensures both global layout plausibility and local geometric/textural fidelity.
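The block-wise extension idea in component (2) can be sketched as an outward-growing grid where each new block is synthesized conditioned on its already-generated neighbors. This is a hedged illustration only: `inpaint_block` stands in for WorldGrow's learned 3D inpainting model over structured latents, and the grid layout is an assumption for the sketch.

```python
def neighbors(cell):
    """4-connected neighbors of a 2D grid cell."""
    x, z = cell
    return [(x + 1, z), (x - 1, z), (x, z + 1), (x, z - 1)]

def inpaint_block(cell, context_blocks):
    # Placeholder for a learned 3D block-inpainting model that synthesizes a
    # new scene block conditioned on its already-generated neighbors.
    return f"block{cell}|ctx={len(context_blocks)}"

def grow_world(steps):
    """Expand a scene outward from a seed block, one frontier ring per step."""
    world = {(0, 0): "seed"}
    frontier = [(0, 0)]
    for _ in range(steps):
        new_frontier = []
        for cell in frontier:
            for nb in neighbors(cell):
                if nb not in world:
                    # Condition on whichever neighboring blocks already exist.
                    ctx = [world[c] for c in neighbors(nb) if c in world]
                    world[nb] = inpaint_block(nb, ctx)
                    new_frontier.append(nb)
        frontier = new_frontier
    return world

w = grow_world(1)
print(len(w))  # seed plus its 4 neighbors = 5
```

Because each block only ever conditions on local context, the loop can in principle run indefinitely, which is what makes the world "infinitely extendable".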
arXiv Detail & Related papers (2025-10-24T17:39:52Z) - Terra: Explorable Native 3D World Model with Point Latents [74.90179419859415]
We present Terra, a native 3D world model that represents and generates explorable environments in an intrinsic 3D latent space. Specifically, we propose a novel point-to-Gaussian variational autoencoder (P2G-VAE) that encodes 3D inputs into a latent point representation. We then introduce a sparse point flow matching network (SPFlow) for generating the latent point representation, which simultaneously denoises the positions and features of the point latents.
arXiv Detail & Related papers (2025-10-16T17:59:56Z) - LatticeWorld: A Multimodal Large Language Model-Empowered Framework for Interactive Complex World Generation [35.4193352348583]
We propose a simple yet effective 3D world generation framework that streamlines the industrial production pipeline of 3D environments. LatticeWorld creates large-scale 3D interactive worlds with dynamic agents, featuring competitive multi-agent interaction. LatticeWorld achieves over a $90\times$ increase in industrial production efficiency.
arXiv Detail & Related papers (2025-09-05T17:22:33Z) - HunyuanWorld 1.0: Generating Immersive, Explorable, and Interactive 3D Worlds from Words or Pixels [30.986527559921335]
HunyuanWorld 1.0 is a novel framework that combines the best of both worlds for generating immersive, explorable, and interactive 3D scenes from text and image conditions. Our approach features three key advantages: 1) 360° immersive experiences via panoramic world proxies; 2) mesh export capabilities for seamless compatibility with existing computer graphics pipelines; 3) disentangled object representations for augmented interactivity.
arXiv Detail & Related papers (2025-07-29T13:43:35Z) - SynCity: Training-Free Generation of 3D Worlds [107.69875149880679]
We propose SynCity, a training- and optimization-free approach to generating 3D worlds from textual descriptions. We show how 3D and 2D generators can be combined to generate ever-expanding scenes.
arXiv Detail & Related papers (2025-03-20T17:59:40Z) - Dynamic Scene Understanding through Object-Centric Voxelization and Neural Rendering [57.895846642868904]
We present a 3D generative model named DynaVol-S for dynamic scenes that enables object-centric learning. Voxelization infers per-object occupancy probabilities at individual spatial locations. Our approach integrates 2D semantic features to create 3D semantic grids, representing the scene through multiple disentangled voxel grids.
arXiv Detail & Related papers (2024-07-30T15:33:58Z) - WonderWorld: Interactive 3D Scene Generation from a Single Image [38.83667648993784]
We present WonderWorld, a novel framework for interactive 3D scene generation. WonderWorld generates connected and diverse 3D scenes in less than 10 seconds on a single A6000 GPU.
arXiv Detail & Related papers (2024-06-13T17:59:10Z) - DreamScene360: Unconstrained Text-to-3D Scene Generation with Panoramic Gaussian Splatting [56.101576795566324]
We present a text-to-3D 360° scene generation pipeline.
Our approach utilizes the generative power of a 2D diffusion model and prompt self-refinement.
Our method offers a globally consistent 3D scene within a 360° perspective.
arXiv Detail & Related papers (2024-04-10T10:46:59Z) - Self-supervised novel 2D view synthesis of large-scale scenes with efficient multi-scale voxel carving [77.07589573960436]
We introduce an efficient multi-scale voxel carving method to generate novel views of real scenes.
Our final high-resolution output is efficiently self-trained on data automatically generated by the voxel carving module.
We demonstrate the effectiveness of our method on highly complex and large-scale scenes in real environments.
arXiv Detail & Related papers (2023-06-26T13:57:05Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the accuracy of the listed information and is not responsible for any consequences of its use.