Hunyuan-GameCraft: High-dynamic Interactive Game Video Generation with Hybrid History Condition
- URL: http://arxiv.org/abs/2506.17201v1
- Date: Fri, 20 Jun 2025 17:50:37 GMT
- Title: Hunyuan-GameCraft: High-dynamic Interactive Game Video Generation with Hybrid History Condition
- Authors: Jiaqi Li, Junshu Tang, Zhiyong Xu, Longhuang Wu, Yuan Zhou, Shuai Shao, Tianbao Yu, Zhiguo Cao, Qinglin Lu
- Abstract summary: Hunyuan-GameCraft is a novel framework for high-dynamic interactive video generation in game environments. To achieve fine-grained action control, we unify standard keyboard and mouse inputs into a shared camera representation space. We propose a hybrid history-conditioned training strategy that extends video sequences autoregressively while preserving game scene information.
- Score: 18.789597877579986
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Recent advances in diffusion-based and controllable video generation have enabled high-quality and temporally coherent video synthesis, laying the groundwork for immersive interactive gaming experiences. However, current methods face limitations in dynamics, generality, long-term consistency, and efficiency, restricting their ability to create diverse gameplay videos. To address these gaps, we introduce Hunyuan-GameCraft, a novel framework for high-dynamic interactive video generation in game environments. To achieve fine-grained action control, we unify standard keyboard and mouse inputs into a shared camera representation space, facilitating smooth interpolation between various camera and movement operations. We then propose a hybrid history-conditioned training strategy that extends video sequences autoregressively while preserving game scene information. Additionally, to enhance inference efficiency and playability, we apply model distillation to reduce computational overhead while maintaining consistency across long temporal sequences, making the model suitable for real-time deployment in complex interactive environments. The model is trained on a large-scale dataset comprising over one million gameplay recordings across more than 100 AAA games, ensuring broad coverage and diversity, then fine-tuned on a carefully annotated synthetic dataset to enhance precision and control. The curated game scene data significantly improves visual fidelity, realism, and action controllability. Extensive experiments demonstrate that Hunyuan-GameCraft significantly outperforms existing models, advancing the realism and playability of interactive game video generation.
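To illustrate the idea of unifying keyboard and mouse inputs into one shared camera representation, here is a minimal sketch. It is not the authors' implementation; the input names, pose deltas, and parameters are assumptions. Each discrete input is mapped to a continuous per-frame pose delta, so different operations accumulate in the same space and can be smoothly interpolated, and the resulting trajectory can serve as the action-conditioning signal for the video model.

```python
import numpy as np

# Assumed per-frame effect of each input: (translation xyz delta, (yaw, pitch) delta).
ACTION_DELTAS = {
    "W":       (np.array([0.0, 0.0, 1.0]),  np.array([0.0, 0.0])),  # move forward
    "S":       (np.array([0.0, 0.0, -1.0]), np.array([0.0, 0.0])),  # move backward
    "A":       (np.array([-1.0, 0.0, 0.0]), np.array([0.0, 0.0])),  # strafe left
    "D":       (np.array([1.0, 0.0, 0.0]),  np.array([0.0, 0.0])),  # strafe right
    "MOUSE_X": (np.zeros(3),                np.array([1.0, 0.0])),  # yaw (look left/right)
    "MOUSE_Y": (np.zeros(3),                np.array([0.0, 1.0])),  # pitch (look up/down)
}

def actions_to_camera_trajectory(actions_at, num_frames, speed=0.1, turn_rate=0.05):
    """Accumulate per-frame input deltas into a camera trajectory of shape
    (num_frames, 5): (x, y, z, yaw, pitch). The trajectory is the shared
    representation that would condition the video generator."""
    pose = np.zeros(5)
    trajectory = []
    for t in range(num_frames):
        for name, strength in actions_at(t):  # actions_at(t) -> [(input name, strength in [0, 1])]
            trans, rot = ACTION_DELTAS[name]
            pose[:3] += speed * strength * trans
            pose[3:] += turn_rate * strength * rot
        trajectory.append(pose.copy())
    return np.stack(trajectory)

# Example: hold W while gently panning the mouse right for a 16-frame chunk.
traj = actions_to_camera_trajectory(lambda t: [("W", 1.0), ("MOUSE_X", 0.5)], num_frames=16)
print(traj.shape)  # (16, 5)
```

Because every input lands in the same continuous pose space, blending forward motion with a mouse pan reduces to a weighted sum of deltas, which is what makes smooth interpolation between camera and movement operations straightforward.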
Related papers
- Matrix-Game: Interactive World Foundation Model [11.144250200432458]
Matrix-Game is an interactive world foundation model for controllable game world generation. Our model adopts a controllable image-to-world generation paradigm, conditioned on a reference image, motion context, and user actions. With over 17 billion parameters, Matrix-Game enables precise control over character actions and camera movements.
arXiv Detail & Related papers (2025-06-23T14:40:49Z) - PlayerOne: Egocentric World Simulator [73.88786358213694]
PlayerOne is the first egocentric realistic world simulator. It generates egocentric videos that are strictly aligned with the real-scene human motion of the user captured by an exocentric camera.
arXiv Detail & Related papers (2025-06-11T17:59:53Z) - ReCamMaster: Camera-Controlled Generative Rendering from A Single Video [72.42376733537925]
ReCamMaster is a camera-controlled generative video re-rendering framework. It reproduces the dynamic scene of an input video at novel camera trajectories. Our method also finds promising applications in video stabilization, super-resolution, and outpainting.
arXiv Detail & Related papers (2025-03-14T17:59:31Z) - GameFactory: Creating New Games with Generative Interactive Videos [32.98135338530966]
Generative videos have the potential to revolutionize game development by autonomously creating new content. We present GameFactory, a framework for action-controlled, scene-generalizable game video generation. Experimental results demonstrate that GameFactory effectively generates open-domain, action-controllable game videos.
arXiv Detail & Related papers (2025-01-14T18:57:21Z) - InterDyn: Controllable Interactive Dynamics with Video Diffusion Models [50.38647583839384]
We propose InterDyn, a framework that generates videos of interactive dynamics given an initial frame and a control signal encoding the motion of a driving object or actor. Our key insight is that large video generation models can act as both neural renderers and implicit physics simulators, having learned interactive dynamics from large-scale video data.
arXiv Detail & Related papers (2024-12-16T13:57:02Z) - GameGen-X: Interactive Open-world Game Video Generation [10.001128258269675]
We introduce GameGen-X, the first diffusion transformer model specifically designed for both generating and interactively controlling open-world game videos. It simulates an array of game engine features, such as innovative characters, dynamic environments, complex actions, and diverse events. It provides interactive controllability, predicting and altering future content based on the current clip, thus allowing for gameplay simulation.
arXiv Detail & Related papers (2024-11-01T17:59:17Z) - Puppet-Master: Scaling Interactive Video Generation as a Motion Prior for Part-Level Dynamics [67.97235923372035]
We present Puppet-Master, an interactive video generative model that can serve as a motion prior for part-level dynamics.
At test time, given a single image and a sparse set of motion trajectories, Puppet-Master can synthesize a video depicting realistic part-level motion faithful to the given drag interactions.
arXiv Detail & Related papers (2024-08-08T17:59:38Z) - MovieDreamer: Hierarchical Generation for Coherent Long Visual Sequence [62.72540590546812]
MovieDreamer is a novel hierarchical framework that integrates the strengths of autoregressive models with diffusion-based rendering.
We present experiments across various movie genres, demonstrating that our approach achieves superior visual and narrative quality.
arXiv Detail & Related papers (2024-07-23T17:17:05Z) - Video2Game: Real-time, Interactive, Realistic and Browser-Compatible Environment from a Single Video [23.484070818399]
Video2Game is a novel approach that automatically converts videos of real-world scenes into realistic and interactive game environments.
We show that we can not only produce highly-realistic renderings in real-time, but also build interactive games on top.
arXiv Detail & Related papers (2024-04-15T14:32:32Z) - CAGE: Unsupervised Visual Composition and Animation for Controllable Video Generation [42.475807996071175]
We introduce an unsupervised approach to controllable and compositional video generation. Our model is trained from scratch on a dataset of unannotated videos. It can compose plausible novel scenes and animate objects by placing object parts at the desired locations in space and time.
arXiv Detail & Related papers (2024-03-21T12:50:15Z) - UniCon: Universal Neural Controller For Physics-based Character Motion [70.45421551688332]
We propose a physics-based universal neural controller (UniCon) that learns to master thousands of motions with different styles by learning on large-scale motion datasets.
UniCon can support keyboard-driven control, compose motion sequences drawn from a large pool of locomotion and acrobatics skills and teleport a person captured on video to a physics-based virtual avatar.
arXiv Detail & Related papers (2020-11-30T18:51:16Z)
This list is automatically generated from the titles and abstracts of the papers on this site.