Playable Game Generation
- URL: http://arxiv.org/abs/2412.00887v1
- Date: Sun, 01 Dec 2024 16:53:02 GMT
- Title: Playable Game Generation
- Authors: Mingyu Yang, Junyou Li, Zhongbin Fang, Sheng Chen, Yangbin Yu, Qiang Fu, Wei Yang, Deheng Ye
- Abstract summary: We propose PlayGen, which encompasses game data generation, an autoregressive DiT-based diffusion model, and a playability-based evaluation framework. PlayGen achieves real-time interaction, ensures sufficient visual quality, and provides accurate interactive mechanics simulation.
- Score: 22.17100581717806
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In recent years, Artificial Intelligence Generated Content (AIGC) has advanced from text-to-image generation to text-to-video and multimodal video synthesis. However, generating playable games presents significant challenges due to the stringent requirements for real-time interaction, high visual quality, and accurate simulation of game mechanics. Existing approaches often fall short, either lacking real-time capabilities or failing to accurately simulate interactive mechanics. To tackle the playability issue, we propose a novel method called PlayGen, which encompasses game data generation, an autoregressive DiT-based diffusion model, and a comprehensive playability-based evaluation framework. Validated on well-known 2D and 3D games, PlayGen achieves real-time interaction, ensures sufficient visual quality, and provides accurate interactive mechanics simulation. Notably, these results are sustained even after over 1000 frames of gameplay on an NVIDIA RTX 2060 GPU. Our code is publicly available: https://github.com/GreatX3/Playable-Game-Generation. Our playable demo generated by AI is: http://124.156.151.207.
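
To make the abstract's architecture a little more concrete, the sketch below shows the general shape of an action-conditioned, autoregressive next-frame diffusion rollout in PyTorch. It is a minimal illustration under assumed names and shapes: TinyDenoiser, rollout, the frame dimensions, and the crude denoising update are all hypothetical and are not taken from the PlayGen paper or repository.

# Hypothetical sketch of an action-conditioned, autoregressive next-frame
# diffusion rollout (names and shapes are illustrative, not from PlayGen).
import torch
import torch.nn as nn

class TinyDenoiser(nn.Module):
    """Stand-in for a DiT-style denoiser: predicts the noise in the next
    frame given the noisy frame, past-frame context, action, and timestep."""
    def __init__(self, frame_dim=64*64*3, ctx_frames=4, num_actions=8):
        super().__init__()
        self.action_emb = nn.Embedding(num_actions, 128)
        self.net = nn.Sequential(
            nn.Linear(frame_dim * (ctx_frames + 1) + 128 + 1, 512),
            nn.SiLU(),
            nn.Linear(512, frame_dim),
        )

    def forward(self, noisy_frame, context, action, t):
        a = self.action_emb(action)
        x = torch.cat([noisy_frame, context.flatten(1), a, t[:, None]], dim=1)
        return self.net(x)

@torch.no_grad()
def rollout(model, init_frames, actions, steps=8):
    """Autoregressively generate one frame per player action.
    init_frames: (1, ctx, D) seed context; actions: list of action ids."""
    context = init_frames.clone()
    frames = []
    for a in actions:
        action = torch.tensor([a])
        x = torch.randn(1, context.shape[-1])      # start from pure noise
        for k in reversed(range(steps)):           # few-step denoising loop
            t = torch.full((1,), k / steps)
            eps = model(x, context, action, t)
            x = x - eps / steps                    # crude, illustrative update
        frames.append(x)
        # slide the context window: drop the oldest frame, append the new one
        context = torch.cat([context[:, 1:], x[:, None]], dim=1)
    return torch.stack(frames, dim=1)

if __name__ == "__main__":
    D, ctx = 64*64*3, 4
    model = TinyDenoiser(frame_dim=D, ctx_frames=ctx)
    seed = torch.zeros(1, ctx, D)
    video = rollout(model, seed, actions=[0, 1, 2, 3])
    print(video.shape)    # (1, 4, D): one generated frame per action

The sliding context window is what allows generation to continue autoregressively over long horizons; the real-time, 1000+ frame results reported in the abstract would additionally require latent-space modeling and efficient few-step sampling, which this sketch only gestures at.
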
Related papers
- AnimeGamer: Infinite Anime Life Simulation with Next Game State Prediction [58.240114139186275]
Recently, a pioneering approach for infinite anime life simulation employs large language models (LLMs) to translate multi-turn text dialogues into language instructions for image generation.
We propose AnimeGamer, which is built upon Multimodal Large Language Models (MLLMs) to generate each game state.
We introduce novel action-aware multimodal representations to represent animation shots, which can be decoded into high-quality video clips.
arXiv Detail & Related papers (2025-04-01T17:57:18Z) - Unbounded: A Generative Infinite Game of Character Life Simulation [68.37260000219479]
We introduce the concept of a generative infinite game, a video game that transcends the traditional boundaries of finite, hard-coded systems by using generative models.
We leverage recent advances in generative AI to create Unbounded: a game of character life simulation that is fully encapsulated in generative models.
arXiv Detail & Related papers (2024-10-24T17:59:31Z) - TextToon: Real-Time Text Toonify Head Avatar from Single Video [34.07760625281835]
We propose TextToon, a method to generate a drivable toonified avatar.
Given a short monocular video sequence and a written instruction about the avatar style, our model can generate a high-fidelity toonified avatar.
arXiv Detail & Related papers (2024-09-23T15:04:45Z) - Diffusion Models Are Real-Time Game Engines [8.472305302767259]
We present GameNGen, the first game engine powered entirely by a neural model.
GameNGen is trained on recorded gameplay and uses it to generate a playable environment.
Next-frame prediction achieves a PSNR of 29.4, comparable to lossy JPEG compression (a short PSNR computation sketch follows after this list).
arXiv Detail & Related papers (2024-08-27T07:46:07Z) - VideoPhy: Evaluating Physical Commonsense for Video Generation [93.28748850301949]
We present VideoPhy, a benchmark designed to assess whether the generated videos follow physical commonsense for real-world activities.
We then generate videos conditioned on captions from diverse state-of-the-art text-to-video generative models.
Our human evaluation reveals that the existing models severely lack the ability to generate videos adhering to the given text prompts.
arXiv Detail & Related papers (2024-06-05T17:53:55Z) - Promptable Game Models: Text-Guided Game Simulation via Masked Diffusion
Models [68.85478477006178]
We present a Promptable Game Model (PGM) for neural video game simulators.
It allows a user to play the game by prompting it with high- and low-level action sequences.
Most captivatingly, our PGM unlocks the director's mode, where the game is played by specifying goals for the agents in the form of a prompt.
Our method significantly outperforms existing neural video game simulators in terms of rendering quality and unlocks applications beyond the capabilities of the current state of the art.
arXiv Detail & Related papers (2023-03-23T17:43:17Z) - Steps towards prompt-based creation of virtual worlds [1.2891210250935143]
We show that prompt-based methods can both accelerate in-VR level editing and become part of gameplay.
We conclude by discussing impending challenges of AI-assisted co-creation in VR.
arXiv Detail & Related papers (2022-11-10T21:13:04Z) - MUGEN: A Playground for Video-Audio-Text Multimodal Understanding and
GENeration [46.19536568693307]
Multimodal video-audio-text understanding and generation can benefit from datasets that are narrow but rich.
We present a large-scale video-audio-text dataset MUGEN, collected using the open-sourced platform game CoinRun.
We sample 375K video clips (3.2s each) and collect text descriptions from human annotators.
arXiv Detail & Related papers (2022-04-17T17:59:09Z) - Megaverse: Simulating Embodied Agents at One Million Experiences per
Second [75.1191260838366]
We present Megaverse, a new 3D simulation platform for reinforcement learning and embodied AI research.
Megaverse is up to 70x faster than DeepMind Lab in fully-shaded 3D scenes with interactive objects.
We use Megaverse to build a new benchmark that consists of several single-agent and multi-agent tasks.
arXiv Detail & Related papers (2021-07-17T03:16:25Z) - Model-Based Reinforcement Learning for Atari [89.3039240303797]
We show how video prediction models can enable agents to solve Atari games with fewer interactions than model-free methods.
Our experiments evaluate SimPLe on a range of Atari games in a low-data regime of 100k interactions between the agent and the environment.
arXiv Detail & Related papers (2019-03-01T15:40:19Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.