Fugu-MT Paper Translation (Summary): What if? Emulative Simulation with World Models for Situated Reasoning

Paper summary: What if? Emulative Simulation with World Models for Situated Reasoning

arxiv url: http://arxiv.org/abs/2603.06445v1
Date: Fri, 06 Mar 2026 16:37:15 GMT
Status: Translation complete
Last updated in system: 2026-03-09 13:17:46.207655
Title: What if? Emulative Simulation with World Models for Situated Reasoning
Title (reference translation): Emulative Simulation Using World Models for Hypothetical Reasoning
Authors: Ruiping Liu, Yufan Chen, Yuheng Zhang, Junwei Zheng, Kunyu Peng, Chengzhi Wu, Chenguang Huang, Di Wen, Jiaming Zhang, Kailun Yang, Rainer Stiefelhagen
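Abstract summary: WanderDream is the first large-scale dataset designed for the emulative simulation of mental exploration. WanderDream-Gen comprises 15.8K panoramic videos across 1,088 real scenes from HM3D, ScanNet++, and real-world captures. WanderDream-QA contains 158K question-answer pairs, covering starting states, paths, and end states along each trajectory.
Reference score (independently calculated attention score): 51.770776877972956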
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Situated reasoning often relies on active exploration, yet in many real-world scenarios such exploration is infeasible due to physical constraints of robots or safety concerns of visually impaired users. Given only a limited observation, can an agent mentally simulate a future trajectory toward a target situation and answer spatial what-if questions? We introduce WanderDream, the first large-scale dataset designed for the emulative simulation of mental exploration, enabling models to reason without active exploration. WanderDream-Gen comprises 15.8K panoramic videos across 1,088 real scenes from HM3D, ScanNet++, and real-world captures, depicting imagined trajectories from current viewpoints to target situations. WanderDream-QA contains 158K question-answer pairs, covering starting states, paths, and end states along each trajectory to comprehensively evaluate exploration-based reasoning. Extensive experiments with world models and MLLMs demonstrate (1) that mental exploration is essential for situated reasoning, (2) that world models achieve compelling performance on WanderDream-Gen, (3) that imagination substantially facilitates reasoning on WanderDream-QA, and (4) that WanderDream data exhibit remarkable transferability to real-world scenarios. The source code and all data will be released.

Related papers