GameFactory: Creating New Games with Generative Interactive Videos
- URL: http://arxiv.org/abs/2501.08325v1
- Date: Tue, 14 Jan 2025 18:57:21 GMT
- Title: GameFactory: Creating New Games with Generative Interactive Videos
- Authors: Jiwen Yu, Yiran Qin, Xintao Wang, Pengfei Wan, Di Zhang, Xihui Liu
- Abstract summary: We present GameFactory, a framework focused on exploring scene generalization in game video generation.
We propose a multi-phase training strategy that decouples game style learning from action control, preserving open-domain generalization.
We extend our framework to enable autoregressive action-controllable game video generation, allowing the production of unlimited-length interactive game videos.
- Abstract: Generative game engines have the potential to revolutionize game development by autonomously creating new content and reducing manual workload. However, existing video-based game generation methods fail to address the critical challenge of scene generalization, limiting their applicability to existing games with fixed styles and scenes. In this paper, we present GameFactory, a framework focused on exploring scene generalization in game video generation. To enable the creation of entirely new and diverse games, we leverage pre-trained video diffusion models trained on open-domain video data. To bridge the domain gap between open-domain priors and the small-scale game dataset, we propose a multi-phase training strategy that decouples game style learning from action control, preserving open-domain generalization while achieving action controllability. Using Minecraft as our data source, we release GF-Minecraft, a high-quality and diverse action-annotated video dataset for research. Furthermore, we extend our framework to enable autoregressive action-controllable game video generation, allowing the production of unlimited-length interactive game videos. Experimental results demonstrate that GameFactory effectively generates open-domain, diverse, and action-controllable game videos, representing a significant step forward in AI-driven game generation. Our dataset and project page are publicly available at https://vvictoryuki.github.io/gamefactory/.
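The multi-phase strategy is only named in the abstract; as a rough illustration of the decoupling idea, the following is a minimal PyTorch sketch in which a frozen open-domain backbone is first adapted to game style and only afterwards extended with action control. All module names, shapes, and the placeholder loss are illustrative assumptions, not the paper's architecture.

```python
import torch
import torch.nn as nn

class VideoBackbone(nn.Module):
    """Stand-in for a pre-trained open-domain video diffusion model."""
    def __init__(self, dim=64):
        super().__init__()
        self.net = nn.Linear(dim, dim)
    def forward(self, x):
        return self.net(x)

class StyleAdapter(nn.Module):
    """Stand-in for a LoRA-style adapter that learns game style."""
    def __init__(self, dim=64, rank=8):
        super().__init__()
        self.down = nn.Linear(dim, rank)
        self.up = nn.Linear(rank, dim)
    def forward(self, x):
        return x + self.up(self.down(x))

class ActionModule(nn.Module):
    """Injects discrete action conditioning into the features."""
    def __init__(self, dim=64, n_actions=4):
        super().__init__()
        self.embed = nn.Embedding(n_actions, dim)
    def forward(self, x, action):
        return x + self.embed(action)

backbone, style, act = VideoBackbone(), StyleAdapter(), ActionModule()
backbone.requires_grad_(False)  # the open-domain prior stays frozen throughout

def train_phase(trainable, frozen, use_action):
    for m in frozen:
        m.requires_grad_(False)
    for m in trainable:
        m.requires_grad_(True)
    opt = torch.optim.AdamW([p for m in trainable for p in m.parameters()], lr=1e-4)
    for _ in range(3):  # dummy loop over random data
        x = torch.randn(2, 64)
        h = style(backbone(x))
        if use_action:
            h = act(h, torch.randint(0, 4, (2,)))
        loss = h.pow(2).mean()  # placeholder for the real diffusion loss
        opt.zero_grad()
        loss.backward()
        opt.step()

train_phase(trainable=[style], frozen=[act], use_action=False)  # phase 1: game style
train_phase(trainable=[act], frozen=[style], use_action=True)   # phase 2: action control
```

The point of the ordering is that the action module never receives gradients during style adaptation, so action control can later be trained without re-entangling it with the game's visual style.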
Related papers
- Generative Video Propagation [87.15843701018099]
Our framework, GenProp, encodes the original video with a selective content encoder and propagates the changes made to the first frame using an image-to-video generation model.
Experiment results demonstrate the leading performance of our model in various video tasks.
arXiv Detail & Related papers (2024-12-27T17:42:29Z)
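As a rough picture of the propagation mechanism described in the Generative Video Propagation entry above, here is a minimal sketch: a content encoder summarizes the original video, and an image-to-video generator is conditioned on that summary plus the edited first frame. The classes and toy feature shapes are illustrative stand-ins, not GenProp's actual components.

```python
import torch
import torch.nn as nn

class ContentEncoder(nn.Module):
    """Stand-in for the selective content encoder."""
    def __init__(self, dim=32):
        super().__init__()
        self.proj = nn.Linear(dim, dim)
    def forward(self, video):               # video: (T, dim) toy features
        return self.proj(video).mean(dim=0)  # pooled content code

class I2VGenerator(nn.Module):
    """Stand-in for an image-to-video generation model."""
    def __init__(self, dim=32, frames=8):
        super().__init__()
        self.frames = frames
        self.net = nn.Linear(2 * dim, dim)
    def forward(self, first_frame, content):
        cond = torch.cat([first_frame, content], dim=-1)
        return self.net(cond).unsqueeze(0).repeat(self.frames, 1)

video = torch.randn(8, 32)       # original clip
edited_first = torch.randn(32)   # user-edited first frame
out = I2VGenerator()(edited_first, ContentEncoder()(video))
print(out.shape)                 # (8, 32): the edit propagated over all frames
```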
- Video Creation by Demonstration [59.389591010842636]
We present $\delta$-Diffusion, a self-supervised training approach that learns from unlabeled videos by conditional future frame prediction.
By leveraging a video foundation model with an appearance bottleneck design on top, we extract action latents from demonstration videos for conditioning the generation process.
Empirically, $\delta$-Diffusion outperforms related baselines in terms of both human preference and large-scale machine evaluations.
arXiv Detail & Related papers (2024-12-12T18:41:20Z)
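The Video Creation by Demonstration entry above describes extracting action latents through an appearance bottleneck; a minimal sketch of that conditioning pattern, with the bottleneck approximated by a low-dimensional projection of frame differences (an assumption, not the paper's design), might look like this:

```python
import torch
import torch.nn as nn

class ActionExtractor(nn.Module):
    """Tiny latent acts as the bottleneck, discarding most appearance detail."""
    def __init__(self, dim=32, latent=4):
        super().__init__()
        self.proj = nn.Linear(dim, latent)
    def forward(self, demo):                 # demo: (T, dim) toy features
        deltas = demo[1:] - demo[:-1]        # frame differences carry motion
        return self.proj(deltas).mean(dim=0) # pooled action latent

class FuturePredictor(nn.Module):
    """Predicts a future frame from a context frame plus the action latent."""
    def __init__(self, dim=32, latent=4):
        super().__init__()
        self.net = nn.Linear(dim + latent, dim)
    def forward(self, frame, action_latent):
        return self.net(torch.cat([frame, action_latent], dim=-1))

demo = torch.randn(16, 32)                 # demonstration video (toy features)
context = torch.randn(32)                  # context frame of the target scene
z = ActionExtractor()(demo)
next_frame = FuturePredictor()(context, z) # action transferred to a new scene
```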
- GameGen-X: Interactive Open-world Game Video Generation [10.001128258269675]
We introduce GameGen-X, the first diffusion transformer model specifically designed for both generating and interactively controlling open-world game videos.
It simulates an array of game engine features, such as innovative characters, dynamic environments, complex actions, and diverse events.
It provides interactive controllability, predicting and altering future content based on the current clip, thus allowing for gameplay simulation.
arXiv Detail & Related papers (2024-11-01T17:59:17Z)
- GAVEL: Generating Games Via Evolution and Language Models [40.896938709468465]
We explore the generation of novel games in the Ludii game description language.
We train a model that intelligently mutates and recombines games and mechanics expressed as code.
A sample of the generated games is available to play online through the Ludii portal.
arXiv Detail & Related papers (2024-07-12T16:08:44Z)
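The GAVEL entry above combines evolutionary search with a language model that mutates games expressed as code. A minimal sketch of such a loop, with both the LM and the fitness function stubbed out, is below; everything here is illustrative, and GAVEL's actual operators work on Ludii game descriptions.

```python
import random

def lm_mutate(game_code: str) -> str:
    # Stand-in for a code LM proposing an edit to the game's rules.
    return game_code + f" (rule {random.randint(0, 9)})"

def fitness(game_code: str) -> float:
    # Stand-in for playability/novelty evaluation (e.g., via self-play).
    return random.random() + 0.01 * len(game_code)

population = ['(game "Seed" ...)']  # Ludii-style description, truncated
for generation in range(5):
    children = [lm_mutate(random.choice(population)) for _ in range(8)]
    # Keep the fittest candidates from parents and children combined.
    population = sorted(population + children, key=fitness, reverse=True)[:4]

print(population[0])  # best candidate after 5 generations
```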
- A Text-to-Game Engine for UGC-Based Role-Playing Games [6.5715027492220734]
This paper introduces a novel framework for a text-to-game engine that leverages foundation models to transform simple textual inputs into intricate, multi-modal RPG experiences.
The engine dynamically generates game narratives, integrating text, visuals, and mechanics, while adapting characters, environments, and gameplay in real time based on player interactions.
arXiv Detail & Related papers (2024-07-11T05:33:19Z)
- Towards General Game Representations: Decomposing Games Pixels into Content and Style [2.570570340104555]
Learning pixel representations of games can benefit artificial intelligence across several downstream tasks.
This paper explores how generalizable pre-trained computer vision encoders can be for such tasks.
We employ a pre-trained Vision Transformer encoder and a decomposition technique based on game genres to obtain separate content and style embeddings.
arXiv Detail & Related papers (2023-07-20T17:53:04Z)
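For the content/style decomposition entry above, a minimal sketch of the general pattern (a frozen pre-trained encoder with two projection heads for content and style) follows; the stub encoder and head sizes are assumptions, not the paper's ViT setup.

```python
import torch
import torch.nn as nn

class FrozenEncoder(nn.Module):
    """Stand-in for a pre-trained Vision Transformer encoder."""
    def __init__(self, dim=128):
        super().__init__()
        self.net = nn.Linear(3 * 32 * 32, dim)
    def forward(self, frames):           # frames: (B, 3, 32, 32)
        return self.net(frames.flatten(1))

encoder = FrozenEncoder().requires_grad_(False)
content_head = nn.Linear(128, 64)        # "what is in the scene"
style_head = nn.Linear(128, 64)          # genre/visual-style factors

frames = torch.randn(4, 3, 32, 32)       # toy game frames
feats = encoder(frames)
content, style = content_head(feats), style_head(feats)
# Training would push `content` to agree across games showing the same
# scene and `style` to cluster by genre, per the entry above.
print(content.shape, style.shape)        # (4, 64) (4, 64)
```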
- Promptable Game Models: Text-Guided Game Simulation via Masked Diffusion Models [68.85478477006178]
We present a Promptable Game Model (PGM) for neural video game simulators.
It allows a user to play the game by prompting it with high- and low-level action sequences.
Most captivatingly, our PGM unlocks the director's mode, where the game is played by specifying goals for the agents in the form of a prompt.
Our method significantly outperforms existing neural video game simulators in terms of rendering quality and unlocks applications beyond the capabilities of the current state of the art.
arXiv Detail & Related papers (2023-03-23T17:43:17Z)
- Multi-Game Decision Transformers [49.257185338595434]
We show that a single transformer-based model can play a suite of up to 46 Atari games simultaneously at close-to-human performance.
We compare several approaches in this multi-game setting, such as online and offline RL methods and behavioral cloning.
We find that our Multi-Game Decision Transformer models offer the best scalability and performance.
arXiv Detail & Related papers (2022-05-30T16:55:38Z)
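The Multi-Game Decision Transformers entry above builds on return-conditioned sequence modeling. A minimal sketch of that pattern (interleaved return/state/action tokens, next-action prediction) is shown below; the tiny transformer, the token layout details, and the omission of a causal mask are simplifications for illustration.

```python
import torch
import torch.nn as nn

class TinyDecisionTransformer(nn.Module):
    def __init__(self, state_dim=16, n_actions=18, dim=32):
        super().__init__()
        self.embed_r = nn.Linear(1, dim)           # return-to-go token
        self.embed_s = nn.Linear(state_dim, dim)   # state token
        self.embed_a = nn.Embedding(n_actions, dim)  # action token
        layer = nn.TransformerEncoderLayer(dim, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        self.head = nn.Linear(dim, n_actions)

    def forward(self, returns, states, actions):
        # Interleave tokens as (R_1, s_1, a_1, R_2, s_2, a_2, ...).
        # A causal attention mask is omitted here for brevity.
        toks = torch.stack([self.embed_r(returns.unsqueeze(-1)),
                            self.embed_s(states),
                            self.embed_a(actions)], dim=2)
        B, T, three, D = toks.shape
        h = self.encoder(toks.reshape(B, T * three, D))
        return self.head(h[:, 1::3])   # action logits at each state token

model = TinyDecisionTransformer()
returns = torch.rand(2, 8)             # target returns-to-go
states = torch.randn(2, 8, 16)         # toy state features
actions = torch.randint(0, 18, (2, 8))
logits = model(returns, states, actions)  # (2, 8, 18): next-action logits
```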
- Playable Video Generation [47.531594626822155]
We aim to allow a user to control the generated video by selecting a discrete action at every time step, as when playing a video game.
The difficulty of the task lies both in learning semantically consistent actions and in generating realistic videos conditioned on the user input.
We propose a novel framework for playable video generation (PVG) that is trained in a self-supervised manner on a large dataset of unlabelled videos.
arXiv Detail & Related papers (2021-01-28T18:55:58Z)
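Playable Video Generation and GameFactory's autoregressive mode share the same interaction loop: a discrete user action conditions the next generated frame, and the rollout can continue indefinitely. Below is a minimal sketch of that loop, with a toy frame model standing in for the actual video generator.

```python
import torch
import torch.nn as nn

class FrameModel(nn.Module):
    """Stand-in for the video generator: next frame from frame + action."""
    def __init__(self, dim=32, n_actions=3):
        super().__init__()
        self.action_embed = nn.Embedding(n_actions, dim)
        self.net = nn.Linear(dim, dim)
    def forward(self, frame, action):
        return self.net(frame + self.action_embed(action))

model = FrameModel()
frame = torch.randn(1, 32)              # initial frame (toy features)
video = [frame]
for step in range(10):                  # unlimited length in principle
    action = torch.randint(0, 3, (1,))  # user-chosen discrete action
    frame = model(frame, action)        # autoregressive continuation
    video.append(frame)
print(torch.cat(video).shape)           # (11, 32)
```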