Fugu-MT Paper Translation (Summary): Ask When It Pays: Cost-Aware Open-Ended Interaction for Instance Goal Navigation

Paper summary: Ask When It Pays: Cost-Aware Open-Ended Interaction for Instance Goal Navigation

arxiv url: http://arxiv.org/abs/2606.03175v2
Date: Wed, 03 Jun 2026 03:34:51 GMT
Status: translation complete
System update date: 2026-06-04 17:40:41.620465
Title: Ask When It Pays: Cost-Aware Open-Ended Interaction for Instance Goal Navigation
Authors: Xunyi Zhao, Sihao Lin, Gengze Zhou, Zerui Li, Shijie Li, Wei Tao, Jiajun Liu, Qi Wu
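Abstract summary: Instance Goal Navigation (IGN) requires an embodied agent to find a specific object instance among distractors from an under-specified natural-language description. We recast IGN as a cost-sensitive uncertainty-reduction problem.
Reference score (independently computed attention score): 33.89987823594985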
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Instance Goal Navigation (IGN) requires an embodied agent to find a specific object instance among distractors from an under-specified natural-language description. Such ambiguity often cannot be resolved from perception and language alone, making interaction with an oracle a natural mechanism for disambiguation. Prior interactive methods allow oracle queries but treat lightweight clarification and route-level guidance alike, letting agents boost success rate through repeated high-information questions rather than by resolving the underlying ambiguity efficiently. We recast interactive IGN as a cost-sensitive uncertainty-reduction problem, where the agent should ask the question whose answer provides the largest reduction in navigation uncertainty relative to its penalty. To this end, we apply an information-gain analysis on existing navigation corpora to identify which cues reduce navigation uncertainty, yielding a compact set of question types and data-derived weights. However, existing interactive navigation benchmarks do not model the cost of different question types or evaluate how efficiently agents use interaction, making them unsuitable for studying cost-sensitive interaction. Based on this taxonomy, we construct a benchmark for diagnosing interaction behavior and efficiency, together with a Weighted Success Rate metric that penalizes each query by its derived cost. We further propose a zero-shot MLLM navigator that selectively queries at each decision step only when the expected uncertainty reduction justifies the interaction cost.
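The abstract gives no formulas, so the following is only a minimal sketch of the two ideas it names: asking when the expected uncertainty reduction justifies the cost, and a success score penalized per query. Everything concrete here is an assumption made for illustration: the entropy-based gain over a belief on candidate instances, the "gain minus scaled cost" rule, the `trade_off` parameter, the clipped form of the weighted score, and all function names and cost values. None of it is taken from the paper.

```python
import math
from typing import Dict, List, Optional


def entropy(probs: List[float]) -> float:
    """Shannon entropy in bits; zero-probability entries contribute nothing."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def expected_gain(belief: List[float], answers: List[int]) -> float:
    """Expected entropy reduction from asking one question.

    belief[i]  : probability that candidate i is the goal instance.
    answers[i] : answer the oracle would give if candidate i were the goal
                 (hypothetical answer model, assumed deterministic).
    """
    groups: Dict[int, List[float]] = {}
    for p, a in zip(belief, answers):
        groups.setdefault(a, []).append(p)
    posterior = 0.0
    for ps in groups.values():
        mass = sum(ps)  # probability of observing this answer
        if mass > 0:
            posterior += mass * entropy([p / mass for p in ps])
    return entropy(belief) - posterior


def choose_question(belief: List[float],
                    questions: Dict[str, List[int]],
                    costs: Dict[str, float],
                    trade_off: float = 1.0) -> Optional[str]:
    """Return the question with the best positive net value, or None (do not ask).

    Net value = expected gain (bits) - trade_off * cost; this rule is assumed.
    """
    best, best_net = None, 0.0
    for name, answers in questions.items():
        net = expected_gain(belief, answers) - trade_off * costs[name]
        if net > best_net:
            best, best_net = name, net
    return best


def weighted_success(success: bool, asked: List[str],
                     costs: Dict[str, float]) -> float:
    """Assumed form of a cost-penalized score: success minus query costs, floored at 0."""
    penalty = sum(costs[q] for q in asked)
    return max(0.0, float(success) - penalty)


# Hypothetical question types and costs, chosen only for this example.
costs = {"color": 0.1, "route": 0.8}
questions = {"color": [0, 0, 1, 1], "route": [0, 1, 2, 3]}
choice = choose_question([0.25] * 4, questions, costs, trade_off=2.0)
```

Under these assumed numbers, the cheap "color" clarification is preferred over the more informative but expensive "route" question, and once the belief is concentrated on a single candidate no question is asked at all.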
