Fugu-MT paper translation (abstract): Interactive World Simulator for Robot Policy Training and Evaluation

Paper summary: Interactive World Simulator for Robot Policy Training and Evaluation

arxiv url: http://arxiv.org/abs/2603.08546v1
Date: Mon, 09 Mar 2026 16:13:32 GMT
Status: translation complete
Last updated in system: 2026-03-10 15:13:16.397515
Title: Interactive World Simulator for Robot Policy Training and Evaluation
Title (reference translation): Interactive World Simulator for Robot Policy Training and Evaluation
Authors: Yixuan Wang, Rhythm Syed, Fangyu Wu, Mengchao Zhang, Aykut Onol, Jose Barreiros, Hooshang Nayyeri, Tony Dear, Huan Zhang, Yunzhu Li
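Abstract summary: We present Interactive World Simulator, a framework for building interactive world models from robot interaction datasets. In our experiments, the learned world models produce interaction-consistent pixel-level predictions. We find that policies trained on world-model-generated data perform comparably to policies trained on the same amount of real-world data.
Reference score (independently computed attention metric): 21.481187472784047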
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Action-conditioned video prediction models (often referred to as world models) have shown strong potential for robotics applications, but existing approaches are often slow and struggle to capture physically consistent interactions over long horizons, limiting their usefulness for scalable robot policy training and evaluation. We present Interactive World Simulator, a framework for building interactive world models from a moderate-sized robot interaction dataset. Our approach leverages consistency models for both image decoding and latent-space dynamics prediction, enabling fast and stable simulation of physical interactions. In our experiments, the learned world models produce interaction-consistent pixel-level predictions and support stable long-horizon interactions for more than 10 minutes at 15 FPS on a single RTX 4090 GPU. Our framework enables scalable demonstration collection solely within the world models to train state-of-the-art imitation policies. Through extensive real-world evaluation across diverse tasks involving rigid objects, deformable objects, object piles, and their interactions, we find that policies trained on world-model-generated data perform comparably to those trained on the same amount of real-world data. Additionally, we evaluate policies both within the world models and in the real world across diverse tasks, and observe a strong correlation between simulated and real-world performance. Together, these results establish the Interactive World Simulator as a stable and physically consistent surrogate for scalable robotic data generation and faithful, reproducible policy evaluation.
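Abstract (reference translation): Action-conditioned video prediction models (often called world models) show strong potential for robotics applications, but existing approaches are often slow and struggle to capture physically consistent interactions over long horizons, which limits their usefulness for scalable robot policy training and evaluation. We present Interactive World Simulator, a framework for building interactive world models from a moderate-sized robot interaction dataset. The approach applies consistency models to both image decoding and latent-space dynamics prediction, enabling fast and stable simulation of physical interactions. In the experiments, the learned world models produce interaction-consistent pixel-level predictions and support stable long-horizon interactions for more than 10 minutes at 15 FPS on a single RTX 4090 GPU. The framework enables scalable demonstration collection entirely within the world models for training state-of-the-art imitation policies. Through extensive real-world evaluation across diverse tasks involving rigid objects, deformable objects, object piles, and their interactions, policies trained on world-model-generated data are found to perform comparably to policies trained on the same amount of real-world data. In addition, policies are evaluated both within the world models and in the real world across diverse tasks, and a strong correlation is observed between simulated and real-world performance. Together, these results establish Interactive World Simulator as a stable and physically consistent surrogate for scalable robot data generation and faithful, reproducible policy evaluation.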
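
Note: The abstract describes action-conditioned prediction in a latent space, rolled out for more than 10 minutes at 15 FPS. As a rough illustration of what that horizon implies, the sketch below runs a generic autoregressive rollout in which each predicted latent is fed back as the input to the next step. It is not the authors' implementation: `rollout` and `toy_step` are hypothetical names, and `toy_step` is a hand-written linear map standing in for a learned dynamics model. Image encoding, decoding, and consistency models are omitted entirely.

```python
from typing import Callable, List, Sequence

Latent = List[float]
Action = List[float]


def rollout(
    z0: Latent,
    actions: Sequence[Action],
    step: Callable[[Latent, Action], Latent],
) -> List[Latent]:
    """Autoregressive rollout: each prediction becomes the next input."""
    latents = [z0]
    for action in actions:
        latents.append(step(latents[-1], action))
    return latents


def toy_step(z: Latent, a: Action) -> Latent:
    # Hypothetical stand-in for a learned latent dynamics model:
    # a damped state plus the commanded action (assumption, not from the paper).
    return [0.9 * zi + ai for zi, ai in zip(z, a)]


FPS = 15                 # frame rate reported in the abstract
DURATION_S = 10 * 60     # 10 minutes, the lower bound reported in the abstract
num_steps = FPS * DURATION_S  # number of consecutive predictions in the rollout

# A constant placeholder action repeated for the whole horizon.
actions = [[0.1, -0.1]] * num_steps
trajectory = rollout([0.0, 0.0], actions, toy_step)

print(num_steps, len(trajectory))
```

At 15 FPS, a 10-minute session corresponds to 9,000 consecutive predictions. Any per-step error is fed back into the model that many times, which is why long-horizon stability is a central claim of the abstract.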


Related papers