Fugu-MT Paper Translation (Abstract): Toward Physically Consistent Driving Video World Models under Challenging Trajectories

Paper abstract: Toward Physically Consistent Driving Video World Models under Challenging Trajectories

arxiv url: http://arxiv.org/abs/2603.24506v1
Date: Wed, 25 Mar 2026 16:47:39 GMT
Status: Translation complete
Last updated in system: 2026-03-26 21:06:11.393497
Title: Toward Physically Consistent Driving Video World Models under Challenging Trajectories
Title (reference translation): Toward Physically Consistent Driving Video World Models under Challenging Trajectories
Authors: Jiawei Zhou, Zhenxin Zhu, Lingyi Du, Linye Lyu, Lijun Zhou, Zhanqian Wu, Hongcheng Luo, Zhuotao Tian, Bing Wang, Guang Chen, Hangjun Ye, Haiyang Sun, Yu Li
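Abstract summary: PhyGenesis is a world model designed to generate driving videos with high visual fidelity and strong physical consistency. In addition to real-world driving videos, diverse challenging driving scenarios are generated with the CARLA simulator. This challenging-trajectory learning strategy enables trajectory correction and promotes physically consistent video generation.
Reference score (independently computed attention measure): 26.053956037261496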
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Video generation models have shown strong potential as world models for autonomous driving simulation. However, existing approaches are primarily trained on real-world driving datasets, which mostly contain natural and safe driving scenarios. As a result, current models often fail when conditioned on challenging or counterfactual trajectories, such as imperfect trajectories generated by simulators or planning systems, producing videos with severe physical inconsistencies and artifacts. To address this limitation, we propose PhyGenesis, a world model designed to generate driving videos with high visual fidelity and strong physical consistency. Our framework consists of two key components: (1) a physical condition generator that transforms potentially invalid trajectory inputs into physically plausible conditions, and (2) a physics-enhanced video generator that produces high-fidelity multi-view driving videos under these conditions. To effectively train these components, we construct a large-scale, physics-rich heterogeneous dataset. Specifically, in addition to real-world driving videos, we generate diverse challenging driving scenarios using the CARLA simulator, from which we derive supervision signals that guide the model to learn physically grounded dynamics under extreme conditions. This challenging-trajectory learning strategy enables trajectory correction and promotes physically consistent video generation. Extensive experiments demonstrate that PhyGenesis consistently outperforms state-of-the-art methods, especially on challenging trajectories. Our project page is available at: https://wm-research.github.io/PhyGenesis/.
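Abstract (reference translation): Video generation models have shown strong potential as world models for autonomous driving simulation. However, existing approaches are trained mainly on real-world driving datasets, which consist mostly of natural, safe driving scenarios. As a result, current models often fail when conditioned on challenging or counterfactual trajectories, such as imperfect trajectories produced by simulators or planning systems, and generate videos with severe physical inconsistencies and artifacts. To address this limitation, we propose PhyGenesis, a world model for generating driving videos with high visual fidelity and strong physical consistency. The framework consists of two key components: (1) a physical condition generator that converts potentially invalid trajectory inputs into physically plausible conditions, and (2) a physics-enhanced video generator that produces high-fidelity multi-view driving videos under those conditions. To train these components effectively, we construct a large-scale, physics-rich heterogeneous dataset. Specifically, in addition to real-world driving videos, we use the CARLA simulator to generate diverse challenging driving scenarios, from which we derive supervision signals that guide the model to learn physically grounded dynamics under extreme conditions. This challenging-trajectory learning strategy enables trajectory correction and promotes physically consistent video generation. Extensive experiments show that PhyGenesis consistently outperforms state-of-the-art methods, especially on challenging trajectories. Our project page is available at https://wm-research.github.io/PhyGenesis/.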

Related papers