Fugu-MT Paper Translation (Summary): FreeAskWorld: An Interactive and Closed-Loop Simulator for Human-Centric Embodied AI

Paper summary: FreeAskWorld: An Interactive and Closed-Loop Simulator for Human-Centric Embodied AI

arxiv url: http://arxiv.org/abs/2511.13524v1
Date: Mon, 17 Nov 2025 15:58:46 GMT
Status: Translation complete
System update date: 2025-11-18 14:36:25.344809
Title: FreeAskWorld: An Interactive and Closed-Loop Simulator for Human-Centric Embodied AI
Title (reference translation): FreeAskWorld: An Interactive, Closed-Loop Simulator for Human-Centered Embodied AI
Authors: Yuhang Peng, Yizhou Pan, Xinning He, Jihaoyu Yang, Xinyu Yin, Han Wang, Xiaoji Zheng, Chao Gao, Jiangtao Gong
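Abstract summary: FreeAskWorld is an interactive simulation framework that integrates large language models for high-level behavior planning and semantically grounded interaction. The framework supports scalable, realistic human-agent simulation and includes a modular data generation pipeline tailored to diverse embodied tasks. The authors publicly release FreeAskWorld, a large-scale benchmark dataset comprising reconstructed environments, six task types, 16 core object categories, 63,429 annotated sample frames, and more than 17 hours of interaction data.
Reference score (independently computed attention score): 24.545163508739943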
License: http://creativecommons.org/licenses/by-nc-sa/4.0/
Abstract: As embodied intelligence emerges as a core frontier in artificial intelligence research, simulation platforms must evolve beyond low-level physical interactions to capture complex, human-centered social behaviors. We introduce FreeAskWorld, an interactive simulation framework that integrates large language models (LLMs) for high-level behavior planning and semantically grounded interaction, informed by theories of intention and social cognition. Our framework supports scalable, realistic human-agent simulations and includes a modular data generation pipeline tailored for diverse embodied tasks. To validate the framework, we extend the classic Vision-and-Language Navigation (VLN) task into an interaction-enriched Direction Inquiry setting, wherein agents can actively seek and interpret navigational guidance. We present and publicly release FreeAskWorld, a large-scale benchmark dataset comprising reconstructed environments, six diverse task types, 16 core object categories, 63,429 annotated sample frames, and more than 17 hours of interaction data to support training and evaluation of embodied AI systems. We benchmark VLN models and human participants under both open-loop and closed-loop settings. Experimental results demonstrate that models fine-tuned on FreeAskWorld outperform their original counterparts, achieving enhanced semantic understanding and interaction competency. These findings underscore the efficacy of socially grounded simulation frameworks in advancing embodied AI systems toward sophisticated high-level planning and more naturalistic human-agent interaction. Importantly, our work underscores that interaction itself serves as an additional information modality.
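Abstract (reference translation): As embodied intelligence emerges as a core frontier of artificial intelligence research, simulation platforms need to evolve beyond low-level physical interactions to capture complex, human-centered social behaviors. We introduce FreeAskWorld, an interactive simulation framework that integrates large language models (LLMs). Our framework supports scalable, realistic human-agent simulation and includes a modular data generation pipeline suited to diverse embodied tasks. To validate the framework, we extend the conventional Vision-and-Language Navigation (VLN) task into an interaction-enriched Direction Inquiry setting in which agents can actively seek and interpret navigational guidance. We publicly release FreeAskWorld, a large-scale benchmark dataset comprising reconstructed environments, six diverse task types, 16 core object categories, 63,429 annotated sample frames, and more than 17 hours of interaction data. We benchmark VLN models and human participants under both open-loop and closed-loop settings. Experimental results show that models fine-tuned on FreeAskWorld outperform their original counterparts, with improved semantic understanding and interaction competency. These findings support the efficacy of socially grounded simulation frameworks in advancing embodied AI systems toward sophisticated high-level planning and more naturalistic human-agent interaction. Importantly, interaction itself serves as an additional information modality.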

Related papers