Fugu-MT Paper Translation (Abstract): Embodied Multi-Agent Coordination by Aligning World Models Through Dialogue

Paper summary: Embodied Multi-Agent Coordination by Aligning World Models Through Dialogue

arxiv url: http://arxiv.org/abs/2605.12920v2
Date: Sat, 16 May 2026 03:44:07 GMT
Status: Translation complete
Last updated in system: 2026-05-19 17:57:45.798357
Title: Embodied Multi-Agent Coordination by Aligning World Models Through Dialogue
Title (reference translation): Multi-Agent Coordination by Aligning World Models Through Dialogue
Authors: Vardhan Dongre, Dilek Hakkani-Tür
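Abstract summary: Communication can bridge the coordination gap caused by partial observability by letting agents share observations and align their world models. The paper extends PARTNR, a benchmark for collaborative household robotics, with a natural-language dialogue channel that enables communication between two agents with partial observability. Experiments show that dialogue reduces action conflicts by 40 to 83 percentage points but degrades task success relative to silent coordination.
Reference score (independently computed attention score): 9.790389620810933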
License: http://creativecommons.org/licenses/by-sa/4.0/
Abstract: Effective collaboration between embodied agents requires more than acting in a shared environment; it demands communication grounded in each agent's evolving understanding of the world. When agents can only partially observe their surroundings, coordination without communication is provably hard, but communication can, in principle, bridge this gap by allowing agents to share observations and align their world models. In this work, we examine whether LLM-based embodied agents actually realize the ability to communicate. We extend PARTNR, a benchmark for collaborative household robotics, with a natural-language dialogue channel that enables two agents with partial observability to communicate during task execution. To evaluate whether dialogue leads to genuine world-model alignment rather than superficial coordination, we propose a framework for measuring world-model alignment defined over per-agent world graphs: observation convergence (do private world models align over time?), information novelty (do messages convey what the partner lacks?), and belief-sensitive messaging (do agents model what their partner knows?). Our experiments across three LLMs reveal that dialogue reduces action conflicts by 40 to 83 percentage points but degrades task success relative to silent coordination. Using our metrics, we characterize the gap between superficial coordination and genuine world-model alignment, and identify where current models fall on this spectrum.
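Abstract (reference translation): Effective collaboration among embodied agents takes more than acting in a shared environment; it requires communication rooted in each agent's evolving understanding of the world. When agents can observe their surroundings only partially, coordinating without communication is provably hard, but communication can in principle close this gap by letting agents share observations and align their world models. This study examines whether LLM-based embodied agents actually realize this communicative ability. PARTNR, a benchmark for collaborative household robotics, is extended with a natural-language dialogue channel so that two agents with partial observability can communicate during task execution. To assess whether dialogue leads to genuine world-model alignment rather than superficial coordination, the study proposes a framework that measures world-model alignment over per-agent world graphs along three axes: observation convergence (do private world models align over time?), information novelty (do messages convey what the partner lacks?), and belief-sensitive messaging (do agents model what their partner knows?). Experiments on three LLMs show that dialogue reduces action conflicts by 40 to 83 percentage points but lowers task success compared with silent coordination. The metrics are then used to characterize the gap between superficial coordination and genuine world-model alignment and to identify where current models fall on this spectrum.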

List of related papers