Fugu-MT Paper Translation (Abstract): LiveWorld: Simulating Out-of-Sight Dynamics in Generative Video World Models

Paper overview: LiveWorld: Simulating Out-of-Sight Dynamics in Generative Video World Models

arxiv url: http://arxiv.org/abs/2603.07145v1
Date: Sat, 07 Mar 2026 10:31:39 GMT
Status: Translation complete
Last updated in system: 2026-03-10 15:13:13.900938
Title: LiveWorld: Simulating Out-of-Sight Dynamics in Generative Video World Models
Authors: Zicheng Duan, Jiatong Xia, Zeyu Zhang, Wenbo Zhang, Gengze Zhou, Chenhui Gou, Yefei He, Feng Chen, Xinyu Zhang, Lingqiao Liu
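Abstract summary: Recent generative video world models aim to simulate the evolution of a visual environment, allowing an observer to interactively explore the scene via camera control. They implicitly assume that the world only evolves within the observer's field of view. Once an object leaves the observer's view, its state is "frozen" in memory, and revisiting the same region later often fails to reflect events that should have occurred in the meantime. The authors propose LiveWorld, a novel framework that extends video world models to support persistent world evolution.
Reference score (attention score computed independently by this site): 32.92934803081681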
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Recent generative video world models aim to simulate visual environment evolution, allowing an observer to interactively explore the scene via camera control. However, they implicitly assume that the world only evolves within the observer's field of view. Once an object leaves the observer's view, its state is "frozen" in memory, and revisiting the same region later often fails to reflect events that should have occurred in the meantime. In this work, we identify and formalize this overlooked limitation as the "out-of-sight dynamics" problem, which impedes video world models from representing a continuously evolving world. To address this issue, we propose LiveWorld, a novel framework that extends video world models to support persistent world evolution. Instead of treating the world as static observational memory, LiveWorld models a persistent global state composed of a static 3D background and dynamic entities that continue evolving even when unobserved. To maintain these unseen dynamics, LiveWorld introduces a monitor-based mechanism that autonomously simulates the temporal progression of active entities and synchronizes their evolved states upon revisiting, ensuring spatially coherent rendering. For evaluation, we further introduce LiveBench, a dedicated benchmark for the task of maintaining out-of-sight dynamics. Extensive experiments show that LiveWorld enables persistent event evolution and long-term scene consistency, bridging the gap between existing 2D observation-based memory and true 4D dynamic world simulation. The baseline and benchmark will be publicly available at https://zichengduan.github.io/LiveWorld/index.html.

Related papers