Fugu-MT Paper Translation (Summary): The Collaboration Gap

Paper summary: The Collaboration Gap

arxiv url: http://arxiv.org/abs/2511.02687v1
Date: Tue, 04 Nov 2025 16:10:57 GMT
Status: Translation complete
Last updated in system: 2025-11-05 18:47:06.102334
Title: The Collaboration Gap
Title (reference translation): The Collaboration Gap
Authors: Tim R. Davidson, Adam Fourney, Saleema Amershi, Robert West, Eric Horvitz, Ece Kamar
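Abstract summary: We propose a collaborative maze-solving benchmark that (i) isolates collaborative capabilities, (ii) modulates problem complexity, (iii) enables scalable automated grading, and (iv) imposes no output-format constraints. Using this framework, we evaluate 32 open- and closed-source models in solo, homogeneous, and heterogeneous pairings. Our results reveal a "collaboration gap".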
Reference score (independently computed attention metric): 28.553543260404425
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: The trajectory of AI development suggests that we will increasingly rely on agent-based systems composed of independently developed agents with different information, privileges, and tools. The success of these systems will critically depend on effective collaboration among these heterogeneous agents, even under partial observability. Despite intense interest, few empirical studies have evaluated such agent-agent collaboration at scale. We propose a collaborative maze-solving benchmark that (i) isolates collaborative capabilities, (ii) modulates problem complexity, (iii) enables scalable automated grading, and (iv) imposes no output-format constraints, preserving ecological plausibility. Using this framework, we evaluate 32 leading open- and closed-source models in solo, homogeneous, and heterogeneous pairings. Our results reveal a "collaboration gap": models that perform well solo often degrade substantially when required to collaborate. Collaboration can break down dramatically; for instance, small distilled models that solve mazes well alone may fail almost completely in certain pairings. We find that starting with the stronger agent often improves outcomes, motivating a "relay inference" approach where the stronger agent leads before handing off to the weaker one, closing much of the gap. Our findings argue for (1) collaboration-aware evaluation, (2) training strategies developed to enhance collaborative capabilities, and (3) interaction design that reliably elicits agents' latent skills, guidance that applies to AI-AI and human-AI collaboration.
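Abstract (reference translation): The trajectory of AI development suggests that we will increasingly rely on agent-based systems made up of independently developed agents with different information, privileges, and tools. The success of these systems will depend critically on effective collaboration among these heterogeneous agents, even under partial observability. Despite strong interest, few empirical studies have evaluated such agent-agent collaboration at scale. We propose a collaborative maze-solving benchmark that (i) isolates collaborative capabilities, (ii) modulates problem complexity, (iii) enables scalable automated grading, and (iv) imposes no output-format constraints, preserving ecological plausibility. Using this framework, we evaluate 32 leading open- and closed-source models in solo, homogeneous, and heterogeneous pairings. Our results reveal a "collaboration gap": models that perform well solo often degrade substantially when required to collaborate. For instance, small distilled models that solve mazes well alone may fail almost completely in certain pairings. Starting with the stronger agent often improves outcomes, which motivates a "relay inference" approach in which the stronger agent leads before handing off to the weaker one, closing much of the gap. Our findings argue for (1) collaboration-aware evaluation, (2) training strategies that enhance collaborative capabilities, and (3) interaction design that reliably elicits agents' latent skills, guidance that applies to both AI-AI and human-AI collaboration.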

Related papers