The Matrix: Infinite-Horizon World Generation with Real-Time Moving Control
- URL: http://arxiv.org/abs/2412.03568v1
- Date: Wed, 04 Dec 2024 18:59:05 GMT
- Title: The Matrix: Infinite-Horizon World Generation with Real-Time Moving Control
- Authors: Ruili Feng, Han Zhang, Zhantao Yang, Jie Xiao, Zhilei Shu, Zhiheng Liu, Andy Zheng, Yukun Huang, Yu Liu, Hongyang Zhang,
- Abstract summary: We present The Matrix, the first foundational realistic world simulator capable of generating continuous 720p real-scene video streams.
The Matrix allows users to traverse diverse terrains in continuous, uncut hour-long sequences.
The Matrix can simulate a BMW X3 driving through an office setting -- an environment present in neither gaming data nor real-world sources.
- Score: 16.075784652681172
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We present The Matrix, the first foundational realistic world simulator capable of generating continuous 720p high-fidelity real-scene video streams with real-time, responsive control in both first- and third-person perspectives, enabling immersive exploration of richly dynamic environments. Trained on limited supervised data from AAA games like Forza Horizon 5 and Cyberpunk 2077, complemented by large-scale unsupervised footage from real-world settings like Tokyo streets, The Matrix allows users to traverse diverse terrains -- deserts, grasslands, water bodies, and urban landscapes -- in continuous, uncut hour-long sequences. Operating at 16 FPS, the system supports real-time interactivity and demonstrates zero-shot generalization, translating virtual game environments to real-world contexts where collecting continuous movement data is often infeasible. For example, The Matrix can simulate a BMW X3 driving through an office setting -- an environment present in neither gaming data nor real-world sources. This approach showcases the potential of AAA game data to advance robust world models, bridging the gap between simulations and real-world applications in scenarios with limited data.
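The control loop the abstract describes -- autoregressive frame generation conditioned on a sliding window of recent frames plus the user's real-time control input -- can be sketched as follows. This is a minimal, hypothetical sketch: `generate_next_frame` and its toy conditioning stand in for the paper's trained video model, which is not public here.

```python
import numpy as np

# Hypothetical stand-in for the learned frame generator. The actual system
# uses a trained video model; this toy function only mimics the interface:
# (recent frames, control signal) -> next frame.
def generate_next_frame(context_frames, control, rng):
    base = context_frames.mean(axis=0)  # trivially condition on recent frames
    shift = 0.01 * control.sum()        # trivially condition on the control
    return np.clip(base + shift + rng.normal(0, 0.01, base.shape), 0.0, 1.0)

def stream_frames(n_frames, controls, context_len=4, height=720, width=1280, seed=0):
    """Autoregressive streaming: each frame is generated from a sliding
    window of its predecessors, so the horizon is unbounded in principle."""
    rng = np.random.default_rng(seed)
    context = [rng.random((height, width, 3)) for _ in range(context_len)]
    frames = []
    for t in range(n_frames):
        frame = generate_next_frame(np.stack(context), controls[t], rng)
        frames.append(frame)
        context = context[1:] + [frame]  # slide the window forward
    return frames

controls = [np.array([1.0, 0.0]) for _ in range(8)]      # e.g. "move forward"
frames = stream_frames(8, controls, height=16, width=16)  # tiny demo resolution
print(len(frames), frames[0].shape)
```

The key structural point is that generation never restarts: the context window always contains the model's own most recent outputs, which is what makes uncut, hour-long sequences possible.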
Related papers
- From Virtual Games to Real-World Play [34.022575851703145]
We introduce RealPlay, a neural network-based real-world game engine that enables interactive video generation from user control signals.
RealPlay aims to produce photorealistic, temporally consistent video sequences that resemble real-world footage.
arXiv Detail & Related papers (2025-06-23T17:59:53Z)
- Matrix-Game: Interactive World Foundation Model [11.144250200432458]
Matrix-Game is an interactive world foundation model for controllable game world generation.
Our model adopts a controllable image-to-world generation paradigm, conditioned on a reference image, motion context, and user actions.
With over 17 billion parameters, Matrix-Game enables precise control over character actions and camera movements.
arXiv Detail & Related papers (2025-06-23T14:40:49Z)
- R3D2: Realistic 3D Asset Insertion via Diffusion for Autonomous Driving Simulation [78.26308457952636]
This paper introduces R3D2, a lightweight, one-step diffusion model designed to overcome limitations in autonomous driving simulation.
It enables realistic insertion of complete 3D assets into existing scenes by generating plausible rendering effects -- such as shadows and consistent lighting -- in real time.
We show that R3D2 significantly enhances the realism of inserted assets, enabling use-cases like text-to-3D asset insertion and cross-scene/dataset object transfer.
arXiv Detail & Related papers (2025-06-09T14:50:19Z)
- UnrealZoo: Enriching Photo-realistic Virtual Worlds for Embodied AI [37.47562766916571]
We introduce UnrealZoo, a rich collection of photo-realistic 3D virtual worlds built on Unreal Engine.
We offer a variety of playable entities for embodied AI agents.
arXiv Detail & Related papers (2024-12-30T14:31:01Z)
- DrivingSphere: Building a High-fidelity 4D World for Closed-loop Simulation [54.02069690134526]
We propose DrivingSphere, a realistic and closed-loop simulation framework.
Its core idea is to build 4D world representation and generate real-life and controllable driving scenarios.
By providing a dynamic and realistic simulation environment, DrivingSphere enables comprehensive testing and validation of autonomous driving algorithms.
arXiv Detail & Related papers (2024-11-18T03:00:33Z)
- Unbounded: A Generative Infinite Game of Character Life Simulation [68.37260000219479]
We introduce the concept of a generative infinite game, a video game that transcends the traditional boundaries of finite, hard-coded systems by using generative models.
We leverage recent advances in generative AI to create Unbounded: a game of character life simulation that is fully encapsulated in generative models.
arXiv Detail & Related papers (2024-10-24T17:59:31Z)
- DriveArena: A Closed-loop Generative Simulation Platform for Autonomous Driving [30.024309081789053]
DriveArena is a high-fidelity closed-loop simulation system designed for driving agents navigating in real scenarios.
It features Traffic Manager, a traffic simulator capable of generating realistic traffic flow on any worldwide street map, and World Dreamer, a high-fidelity conditional generative model with infinite autoregression.
arXiv Detail & Related papers (2024-08-01T09:32:01Z)
- Learning Interactive Real-World Simulators [96.5991333400566]
We explore the possibility of learning a universal simulator of real-world interaction through generative modeling.
We use the simulator to train both high-level vision-language policies and low-level reinforcement learning policies.
Video captioning models can benefit from training with simulated experience, opening up even wider applications.
arXiv Detail & Related papers (2023-10-09T19:42:22Z)
- Sim-to-Real via Sim-to-Seg: End-to-end Off-road Autonomous Driving Without Real Data [56.49494318285391]
We present Sim2Seg, a re-imagining of RCAN that crosses the visual reality gap for off-road autonomous driving.
This is done by learning to translate randomized simulation images into simulated segmentation and depth maps.
This allows us to train an end-to-end RL policy in simulation, and directly deploy in the real-world.
arXiv Detail & Related papers (2022-10-25T17:50:36Z)
- Data generation using simulation technology to improve perception mechanism of autonomous vehicles [0.0]
We will demonstrate the effectiveness of combining data gathered from the real world with data generated in the simulated world to train perception systems.
We will also propose a multi-level deep learning perception framework that aims to emulate a human learning experience.
arXiv Detail & Related papers (2022-07-01T03:42:33Z)
- SceneGen: Learning to Generate Realistic Traffic Scenes [92.98412203941912]
We present SceneGen, a neural autoregressive model of traffic scenes that eschews the need for rules and distributions.
We demonstrate SceneGen's ability to faithfully model distributions of real traffic scenes.
arXiv Detail & Related papers (2021-01-16T22:51:43Z)
- LiDARsim: Realistic LiDAR Simulation by Leveraging the Real World [84.57894492587053]
We develop a novel simulator that captures both the power of physics-based and learning-based simulation.
We first utilize ray casting over the 3D scene and then use a deep neural network to produce deviations from the physics-based simulation.
We showcase LiDARsim's usefulness for perception algorithms-testing on long-tail events and end-to-end closed-loop evaluation on safety-critical scenarios.
arXiv Detail & Related papers (2020-06-16T17:44:35Z)
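The hybrid approach the LiDARsim abstract describes -- a physics-based ray cast producing clean ranges, plus a learned network predicting per-ray deviations from them -- can be sketched roughly as below. Everything here is an illustrative assumption: `physics_raycast` targets a flat wall rather than a full 3D scene, and `learned_deviation` is a fixed linear map standing in for the paper's trained network.

```python
import numpy as np

def physics_raycast(ray_dirs, plane_z=2.0):
    """Idealized physics step: distance along each unit ray to a flat
    surface at z = plane_z (rays assumed to point toward it, z > 0)."""
    return plane_z / ray_dirs[:, 2]

def learned_deviation(ray_dirs, weights):
    """Stand-in for the trained network that predicts per-ray deviations
    (e.g. range noise, intensity effects) from the clean simulation."""
    return np.tanh(ray_dirs @ weights)

def lidarsim_style_scan(ray_dirs, weights):
    clean = physics_raycast(ray_dirs)                 # physics-based range
    residual = learned_deviation(ray_dirs, weights)   # learned correction
    return clean + residual

rng = np.random.default_rng(0)
dirs = rng.random((100, 3)) + 0.1                     # rays with positive z
dirs /= np.linalg.norm(dirs, axis=1, keepdims=True)   # normalize to unit length
weights = rng.normal(0, 0.05, 3)
ranges = lidarsim_style_scan(dirs, weights)
print(ranges.shape)
```

The design point is the division of labor: physics supplies geometrically correct ranges cheaply, and the learned component only has to model the residual gap to real sensor data rather than the whole scan.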
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.