OmniX: From Unified Panoramic Generation and Perception to Graphics-Ready 3D Scenes
- URL: http://arxiv.org/abs/2510.26800v1
- Date: Thu, 30 Oct 2025 17:59:51 GMT
- Title: OmniX: From Unified Panoramic Generation and Perception to Graphics-Ready 3D Scenes
- Authors: Yukun Huang, Jiwen Yu, Yanning Zhou, Jianan Wang, Xintao Wang, Pengfei Wan, Xihui Liu
- Abstract summary: Panorama-based 2D lifting has emerged as a promising technique to produce immersive, realistic, and diverse 3D environments. In this work, we advance this technique to generate graphics-ready 3D scenes suitable for physically based rendering (PBR), relighting, and simulation. Our key insight is to repurpose 2D generative models for panoramic perception of geometry, textures, and PBR materials. Based on a lightweight and efficient cross-modal adapter structure, OmniX reuses 2D generative priors for a broad range of panoramic vision tasks.
- Score: 57.790894531046796
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: There are two prevalent ways of constructing 3D scenes: procedural generation and 2D lifting. Among them, panorama-based 2D lifting has emerged as a promising technique, leveraging powerful 2D generative priors to produce immersive, realistic, and diverse 3D environments. In this work, we advance this technique to generate graphics-ready 3D scenes suitable for physically based rendering (PBR), relighting, and simulation. Our key insight is to repurpose 2D generative models for panoramic perception of geometry, textures, and PBR materials. Unlike existing 2D lifting approaches that emphasize appearance generation and ignore the perception of intrinsic properties, we present OmniX, a versatile and unified framework. Based on a lightweight and efficient cross-modal adapter structure, OmniX reuses 2D generative priors for a broad range of panoramic vision tasks, including panoramic perception, generation, and completion. Furthermore, we construct a large-scale synthetic panorama dataset containing high-quality multimodal panoramas from diverse indoor and outdoor scenes. Extensive experiments demonstrate the effectiveness of our model in panoramic visual perception and graphics-ready 3D scene generation, opening new possibilities for immersive and physically realistic virtual world generation.
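A core operation shared by these panorama-based 2D-lifting pipelines is turning a predicted equirectangular depth panorama into 3D geometry. The sketch below is a minimal illustration of that step, not code from the paper; the function names (`panorama_rays`, `lift_depth_panorama`) are hypothetical. Each pixel is mapped to a unit ray direction on the sphere, then scaled by its depth value:

```python
import numpy as np

def panorama_rays(height, width):
    """Per-pixel unit ray directions for an equirectangular panorama.

    Columns map to longitude in [-pi, pi), rows to latitude in
    [pi/2, -pi/2] (top row points up). Returns an (H, W, 3) array.
    """
    u = (np.arange(width) + 0.5) / width        # normalized column in [0, 1)
    v = (np.arange(height) + 0.5) / height      # normalized row in [0, 1)
    lon = (u - 0.5) * 2.0 * np.pi               # longitude
    lat = (0.5 - v) * np.pi                     # latitude
    lon, lat = np.meshgrid(lon, lat)            # both (H, W)
    x = np.cos(lat) * np.sin(lon)
    y = np.sin(lat)
    z = np.cos(lat) * np.cos(lon)
    return np.stack([x, y, z], axis=-1)

def lift_depth_panorama(depth):
    """Lift an (H, W) panoramic depth map to an (H*W, 3) point cloud."""
    rays = panorama_rays(*depth.shape)
    return (rays * depth[..., None]).reshape(-1, 3)
```

Once lifted, such a point cloud (or a mesh derived from it) is the geometric scaffold onto which generated textures and PBR materials are attached.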
Related papers
- WorldGrow: Generating Infinite 3D World [75.81531067447203]
We tackle the challenge of generating the infinitely extendable 3D world -- large, continuous environments with coherent geometry and realistic appearance. We propose WorldGrow, a hierarchical framework for unbounded 3D scene synthesis. Our method features three core components: (1) a data curation pipeline that extracts high-quality scene blocks for training, making the 3D structured latent representations suitable for scene generation; (2) a 3D block inpainting mechanism that enables context-aware scene extension; and (3) a coarse-to-fine generation strategy that ensures both global layout plausibility and local geometric/textural fidelity.
arXiv Detail & Related papers (2025-10-24T17:39:52Z)
- PanoWorld-X: Generating Explorable Panoramic Worlds via Sphere-Aware Video Diffusion [87.13016347332943]
PanoWorld-X is a novel framework for high-fidelity and controllable panoramic video generation with diverse camera trajectories. Our experiments demonstrate superior performance in various aspects, including motion range, control precision, and visual quality.
arXiv Detail & Related papers (2025-09-29T16:22:00Z)
- TiP4GEN: Text to Immersive Panorama 4D Scene Generation [82.8444414014506]
TiP4GEN is a text-to-dynamic panorama scene generation framework. It enables fine-grained content control and synthesizes motion-rich, geometry-consistent panoramic 4D scenes. TiP4GEN integrates panorama video generation and dynamic scene reconstruction to create 360-degree immersive virtual environments.
arXiv Detail & Related papers (2025-08-17T16:02:24Z)
- Matrix-3D: Omnidirectional Explorable 3D World Generation [20.568791715708134]
We propose Matrix-3D, a framework that utilizes panoramic representations for wide-coverage omnidirectional 3D world generation. We first train a trajectory-guided panoramic video diffusion model that employs scene mesh renders as conditioning. To lift the panoramic scene video to a 3D world, we propose two separate methods: (1) a feed-forward large panorama reconstruction model for rapid 3D scene reconstruction and (2) an optimization-based pipeline for accurate and detailed 3D scene reconstruction.
arXiv Detail & Related papers (2025-08-11T15:29:57Z)
- Top2Pano: Learning to Generate Indoor Panoramas from Top-Down View [1.182769785560032]
Top2Pano is an end-to-end model for generating realistic indoor panoramas from top-down views. Our method estimates volumetric occupancy to infer 3D structures, then uses volumetric rendering to generate coarse color and depth panoramas.
arXiv Detail & Related papers (2025-07-28T22:32:41Z)
- DreamCube: 3D Panorama Generation via Multi-plane Synchronization [17.690754213112108]
3D panorama synthesis is a promising yet challenging task that demands high-quality and diverse visual appearance and geometry of the generated omnidirectional content. Existing methods leverage rich image priors from pre-trained 2D foundation models to circumvent the scarcity of 3D panoramic data. In this work, we demonstrate that by applying multi-plane synchronization to the operators from 2D foundation models, their capabilities can be seamlessly extended to the omnidirectional domain.
arXiv Detail & Related papers (2025-06-20T17:55:06Z)
- LayerPano3D: Layered 3D Panorama for Hyper-Immersive Scene Generation [105.52153675890408]
3D immersive scene generation is a challenging yet critical task in computer vision and graphics. LayerPano3D is a novel framework for full-view, explorable panoramic 3D scene generation from a single text prompt.
arXiv Detail & Related papers (2024-08-23T17:50:23Z)
- Pano2Room: Novel View Synthesis from a Single Indoor Panorama [20.262621556667852]
Pano2Room is designed to automatically reconstruct high-quality 3D indoor scenes from a single panoramic image.
The key idea is to initially construct a preliminary mesh from the input panorama, and iteratively refine this mesh using a panoramic RGBD inpainter.
The refined mesh is converted into a 3D Gaussian Splatting field and trained with the collected pseudo novel views.
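The three-stage pipeline described above (preliminary mesh from the panorama, iterative refinement via a panoramic RGBD inpainter, then 3D-GS training on the collected pseudo views) can be sketched as a generic control flow. This is a hypothetical outline, not Pano2Room's actual code; every callable (`build_mesh`, `render_rgbd`, `inpaint_rgbd`, `fuse_into_mesh`) is a placeholder the reader would supply:

```python
def iterative_pano_refinement(panorama, build_mesh, sample_views,
                              render_rgbd, inpaint_rgbd, fuse_into_mesh,
                              n_views=10):
    """Generic Pano2Room-style loop: iteratively refine a mesh by
    inpainting rendered novel views, collecting pseudo views that can
    later supervise a 3D Gaussian Splatting field.

    All callables are caller-supplied placeholders (hypothetical API).
    """
    mesh = build_mesh(panorama)          # preliminary mesh from the input panorama
    pseudo_views = []
    for pose in sample_views(n_views):
        rgbd, hole_mask = render_rgbd(mesh, pose)  # disocclusions appear as holes
        rgbd = inpaint_rgbd(rgbd, hole_mask)       # RGBD inpainter fills them
        mesh = fuse_into_mesh(mesh, rgbd, pose)    # back-project filled content
        pseudo_views.append((pose, rgbd))          # keep for 3D-GS supervision
    return mesh, pseudo_views
```

The returned pseudo views play the role of the "collected pseudo novel views" in the summary: they become the training images for the 3D Gaussian Splatting field.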
arXiv Detail & Related papers (2024-08-21T08:19:12Z)
- HoloDreamer: Holistic 3D Panoramic World Generation from Text Descriptions [31.342899807980654]
3D scene generation is in high demand across various domains, including virtual reality, gaming, and the film industry.
We introduce HoloDreamer, a framework that first generates high-definition panorama as a holistic initialization of the full 3D scene.
We then leverage 3D Gaussian Splatting (3D-GS) to quickly reconstruct the 3D scene, thereby facilitating the creation of view-consistent and fully enclosed 3D scenes.
arXiv Detail & Related papers (2024-07-21T14:52:51Z)
- 3D-SceneDreamer: Text-Driven 3D-Consistent Scene Generation [51.64796781728106]
We propose a generative refinement network to synthesize new content with higher quality by exploiting the natural image prior of the 2D diffusion model and the global 3D information of the current scene.
Our approach supports a wide variety of scene generation tasks and arbitrary camera trajectories with improved visual quality and 3D consistency.
arXiv Detail & Related papers (2024-03-14T14:31:22Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.