Unrolling Virtual Worlds for Immersive Experiences
- URL: http://arxiv.org/abs/2311.17924v1
- Date: Tue, 14 Nov 2023 13:16:34 GMT
- Title: Unrolling Virtual Worlds for Immersive Experiences
- Authors: Alexey Tikhonov and Anton Repushko
- Abstract summary: This research pioneers a method for generating immersive worlds, drawing inspiration from elements of vintage adventure games like Myst.
We explore the intricate conversion of 2D panoramas into 3D scenes using equirectangular projections.
- Score: 13.615681132633561
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This research pioneers a method for generating immersive worlds, drawing
inspiration from elements of vintage adventure games like Myst and employing
modern text-to-image models. We explore the intricate conversion of 2D
panoramas into 3D scenes using equirectangular projections, addressing the
distortions in perception that occur as observers navigate within the
encompassing sphere. Our approach employs a technique similar to "inpainting"
to rectify distorted projections, enabling the smooth construction of locally
coherent worlds. This provides extensive insight into the interrelation of
technology, perception, and experiential reality within human-computer
interaction.
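The equirectangular mapping mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. The function names, the axis convention (y up, z forward at the image centre), and the pixel-to-angle convention are all assumptions chosen for illustration. The sketch only shows how a panorama pixel corresponds to a viewing direction on the enclosing sphere, and back.

```python
import math

# Hypothetical helpers (not from the paper). Assumed convention:
# u in [0, width) spans longitude -pi..pi, v in [0, height) spans
# latitude +pi/2 (top) .. -pi/2 (bottom); y is up, z is forward.

def pixel_to_direction(u, v, width, height):
    """Map an equirectangular pixel coordinate to a unit direction (x, y, z)."""
    lon = (u / width) * 2.0 * math.pi - math.pi
    lat = math.pi / 2.0 - (v / height) * math.pi
    x = math.cos(lat) * math.sin(lon)
    y = math.sin(lat)
    z = math.cos(lat) * math.cos(lon)
    return (x, y, z)

def direction_to_pixel(direction, width, height):
    """Map a non-zero direction vector back to equirectangular pixel coordinates."""
    x, y, z = direction
    norm = math.sqrt(x * x + y * y + z * z)
    lon = math.atan2(x, z)
    lat = math.asin(y / norm)
    u = (lon + math.pi) / (2.0 * math.pi) * width
    v = (math.pi / 2.0 - lat) / math.pi * height
    return (u, v)
```

Under these assumptions, the centre pixel of the panorama maps to the forward direction, and a direction converted to a pixel and back returns to the same point. The distortion discussed in the abstract arises because this mapping is only exact for an observer at the centre of the sphere.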
Related papers
- WorldGen: From Text to Traversable and Interactive 3D Worlds [87.95088818329403]
We introduce WorldGen, a system that enables the automatic creation of large-scale, interactive 3D worlds directly from text prompts.
Our approach transforms natural language descriptions into fully textured environments that can be immediately explored or edited within standard game engines.
This work represents a step towards accessible, generative world-building at scale, advancing the frontier of 3D generative AI for applications in gaming, simulation, and immersive social environments.
arXiv Detail & Related papers (2025-11-20T22:13:18Z)
- OmniX: From Unified Panoramic Generation and Perception to Graphics-Ready 3D Scenes [57.790894531046796]
Panorama-based 2D lifting has emerged as a promising technique to produce immersive, realistic, and diverse 3D environments.
In this work, we advance this technique to generate graphics-ready 3D scenes suitable for physically based rendering (PBR), relighting, and simulation.
Our key insight is to repurpose 2D generative models for panoramic perception of geometry, textures, and PBR materials.
Based on a lightweight and efficient cross-modal adapter structure, OmniX reuses 2D generative priors for a broad range of panoramic vision tasks.
arXiv Detail & Related papers (2025-10-30T17:59:51Z)
- Terra: Explorable Native 3D World Model with Point Latents [74.90179419859415]
We present Terra, a native 3D world model that represents and generates explorable environments in an intrinsic 3D latent space.
Specifically, we propose a novel point-to-Gaussian variational autoencoder (P2G-VAE) that encodes 3D inputs into a latent point representation.
We then introduce a sparse point flow matching network (SPFlow) for generating the latent point representation, which simultaneously denoises the positions and features of the point latents.
arXiv Detail & Related papers (2025-10-16T17:59:56Z)
- EvoWorld: Evolving Panoramic World Generation with Explicit 3D Memory [40.346684158976494]
EvoWorld bridges panoramic video generation with evolving 3D memory to enable spatially consistent long-horizon exploration.
Unlike prior state-of-the-arts that synthesize videos only, our key insight lies in exploiting this evolving 3D reconstruction as explicit spatial guidance.
To evaluate long-range exploration capabilities, we introduce the first comprehensive benchmark spanning synthetic outdoor environments, Habitat indoor scenes, and challenging real-world scenarios.
arXiv Detail & Related papers (2025-10-01T17:59:38Z)
- PanoWorld-X: Generating Explorable Panoramic Worlds via Sphere-Aware Video Diffusion [87.13016347332943]
PanoWorld-X is a novel framework for high-fidelity and controllable panoramic video generation with diverse camera trajectories.
Our experiments demonstrate superior performance in various aspects, including motion range, control precision, and visual quality.
arXiv Detail & Related papers (2025-09-29T16:22:00Z)
- NeoWorld: Neural Simulation of Explorable Virtual Worlds via Progressive 3D Unfolding [46.79724166827757]
We introduce NeoWorld, a framework for generating interactive 3D virtual worlds from a single input image.
Inspired by the on-demand worldbuilding concept in the science fiction novel Simulacron-3 (1964), our system constructs expansive environments.
arXiv Detail & Related papers (2025-09-29T08:24:28Z)
- HunyuanWorld 1.0: Generating Immersive, Explorable, and Interactive 3D Worlds from Words or Pixels [30.986527559921335]
HunyuanWorld 1.0 is a novel framework that combines the best of both worlds for generating immersive, explorable, and interactive 3D scenes from text and image conditions.
Our approach features three key advantages: 1) 360° immersive experiences via panoramic world proxies; 2) mesh export capabilities for seamless compatibility with existing computer graphics pipelines; 3) disentangled object representations for augmented interactivity.
arXiv Detail & Related papers (2025-07-29T13:43:35Z)
- WorldExplorer: Towards Generating Fully Navigable 3D Scenes [49.21733308718443]
WorldExplorer builds fully navigable 3D scenes with consistent visual quality across a wide range of viewpoints.
We generate multiple videos along short, pre-defined trajectories, that explore the scene in depth.
Our novel scene memory conditions each video on the most relevant prior views, while a collision-detection mechanism prevents degenerate results.
arXiv Detail & Related papers (2025-06-02T15:41:31Z)
- GenSpace: Benchmarking Spatially-Aware Image Generation [76.98817635685278]
Humans intuitively compose and arrange scenes in the 3D space for photography.
Can advanced AI image generators plan scenes with similar 3D spatial awareness when creating images from text or image prompts?
We present GenSpace, a novel benchmark and evaluation pipeline to assess the spatial awareness of current image generation models.
arXiv Detail & Related papers (2025-05-30T17:59:26Z)
- In-Place Panoptic Radiance Field Segmentation with Perceptual Prior for 3D Scene Understanding [1.8130068086063336]
This paper introduces a novel perceptual-prior-guided 3D scene representation and panoptic understanding method.
It reformulates panoptic understanding within neural radiance fields as a linear assignment problem involving 2D semantics and instance recognition.
Experiments and ablation studies under challenging conditions, including synthetic and real-world scenes, demonstrate the proposed method's effectiveness.
arXiv Detail & Related papers (2024-10-06T15:49:58Z)
- DreamScene360: Unconstrained Text-to-3D Scene Generation with Panoramic Gaussian Splatting [56.101576795566324]
We present a text-to-3D 360° scene generation pipeline.
Our approach utilizes the generative power of a 2D diffusion model and prompt self-refinement.
Our method offers a globally consistent 3D scene within a 360° perspective.
arXiv Detail & Related papers (2024-04-10T10:46:59Z)
- Recent Trends in 3D Reconstruction of General Non-Rigid Scenes [104.07781871008186]
Reconstructing models of the real world, including 3D geometry, appearance, and motion of real scenes, is essential for computer graphics and computer vision.
It enables the synthesizing of photorealistic novel views, useful for the movie industry and AR/VR applications.
This state-of-the-art report (STAR) offers the reader a comprehensive summary of state-of-the-art techniques with monocular and multi-view inputs.
arXiv Detail & Related papers (2024-03-22T09:46:11Z)
- OmniSCV: An Omnidirectional Synthetic Image Generator for Computer Vision [5.2178708158547025]
We present a tool for generating datasets of omnidirectional images with semantic and depth information.
These images are synthesized from a set of captures that are acquired in a realistic virtual environment for Unreal Engine 4.
We include in our tool photorealistic non-central-projection systems as non-central panoramas and non-central catadioptric systems.
arXiv Detail & Related papers (2024-01-30T14:40:19Z)
- Sat2Scene: 3D Urban Scene Generation from Satellite Images with Diffusion [77.34078223594686]
We propose a novel architecture for direct 3D scene generation by introducing diffusion models into 3D sparse representations and combining them with neural rendering techniques.
Specifically, our approach generates texture colors at the point level for a given geometry using a 3D diffusion model first, which is then transformed into a scene representation in a feed-forward manner.
Experiments in two city-scale datasets show that our model demonstrates proficiency in generating photo-realistic street-view image sequences and cross-view urban scenes from satellite imagery.
arXiv Detail & Related papers (2024-01-19T16:15:37Z)
- PanoContext-Former: Panoramic Total Scene Understanding with a Transformer [37.51637352106841]
Panoramic image enables deeper understanding and more holistic perception of 360° surrounding environment.
In this paper, we propose a novel method using depth prior for holistic indoor scene understanding.
In addition, we introduce a real-world dataset for scene understanding, including photo-realistic panoramas, high-fidelity depth images, accurately annotated room layouts, and oriented object bounding boxes and shapes.
arXiv Detail & Related papers (2023-05-21T16:20:57Z)
- Object Scene Representation Transformer [56.40544849442227]
We introduce Object Scene Representation Transformer (OSRT), a 3D-centric model in which individual object representations naturally emerge through novel view synthesis.
OSRT scales to significantly more complex scenes with larger diversity of objects and backgrounds than existing methods.
It is multiple orders of magnitude faster at compositional rendering thanks to its light field parametrization and the novel Slot Mixer decoder.
arXiv Detail & Related papers (2022-06-14T15:40:47Z)
- Learning Indoor Inverse Rendering with 3D Spatially-Varying Lighting [149.1673041605155]
We address the problem of jointly estimating albedo, normals, depth and 3D spatially-varying lighting from a single image.
Most existing methods formulate the task as image-to-image translation, ignoring the 3D properties of the scene.
We propose a unified, learning-based inverse framework that formulates 3D spatially-varying lighting.
arXiv Detail & Related papers (2021-09-13T15:29:03Z)
- GaussiGAN: Controllable Image Synthesis with 3D Gaussians from Unposed Silhouettes [48.642181362172906]
We present an algorithm that learns a coarse 3D representation of objects from unposed multi-view 2D mask supervision.
In contrast to existing voxel-based methods for unposed object reconstruction, our approach learns to represent the generated shape and pose.
We show results on synthetic datasets with realistic lighting, and demonstrate object insertion with interactive posing.
arXiv Detail & Related papers (2021-06-24T17:47:58Z)
- SAILenv: Learning in Virtual Visual Environments Made Simple [16.979621213790015]
We present a novel platform that allows researchers to experiment visual recognition in virtual 3D scenes.
A few lines of code are needed to interface every algorithm with the virtual world, and non-3D-graphics experts can easily customize the 3D environment itself.
Our framework yields pixel-level semantic and instance labeling, depth, and, to the best of our knowledge, it is the only one that provides motion-related information directly inherited from the 3D engine.
arXiv Detail & Related papers (2020-07-16T09:50:23Z)
This list is automatically generated from the titles and abstracts of the papers in this site.