Fugu-MT Paper Translation (Abstract): Multi-Agent Computer Use

Paper overview: Multi-Agent Computer Use

arxiv url: http://arxiv.org/abs/2606.01533v1
Date: Mon, 01 Jun 2026 01:29:36 GMT
Status: Translation complete
Last updated in system: 2026-06-02 21:34:29.773433
Title: Multi-Agent Computer Use
Authors: Jing Yu Koh, Ruslan Salakhutdinov, Daniel Fried
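Abstract summary: We argue that we should move towards evaluating and building multi-agent computer use (MACU) systems. We propose a general multi-agent setup in which a manager model decomposes computer use tasks as a directed acyclic graph (DAG). At each iteration, the manager dispatches parallel CUA subagents to carry out nodes on the ready frontier of the DAG.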
Reference score (independently computed attention score): 72.79887808312706
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Computer use agents (CUAs) today are primarily deployed as single serial agents. This setup is suboptimal for complex long-horizon tasks that benefit from task decomposition, parallel execution, and consistent re-planning based on new information. In this paper, we argue that we should instead move towards evaluating and building multi-agent computer use (MACU) systems. These systems, which emphasize planning and parallel execution, alleviate many of the shortcomings of single-agent CUAs. We propose a general multi-agent setup in which a manager model decomposes computer use tasks as a directed acyclic graph (DAG), encoding relevant dependencies and goals for subagents. At each iteration, the manager dispatches parallel CUA subagents to carry out nodes on the ready frontier of the DAG, and continuously revises the DAG (adding, canceling, or rewriting nodes) as new findings arrive from subagents. This design treats the partially observable environment of computer use as a first-class challenge: information that downstream agents may not be able to re-observe is retained and passed forward through the manager and DAG structure. We demonstrate that MACU consistently improves over strong single-agent baselines by $3.4\text{--}25.5\%$ on desktop (OSWorld) and web navigation (Online-Mind2Web, WebTailBench, Odysseys) benchmarks, exhibits more favorable test-time scaling, and solves complex long-horizon tasks where single-agent CUAs get stuck. On Odysseys, a long-horizon web navigation benchmark, MACU improves average task completion wall-clock time by ${\sim} 1.5 \times$, demonstrating its efficacy in speeding up traditionally slow CUA pipelines. Our findings highlight that multi-agent coordination is a promising axis for scaling computer use agents to work productively for longer and more effectively. We release all code and interactive visualizations at https://jykoh.com/multi-agent-computer-use.
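
The ready-frontier dispatch described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the names `ready_frontier`, `run_dag`, and `fake_subagent`, the example task graph, and the wave-by-wave scheduling are all assumptions made for illustration. The sketch runs every node whose dependencies are complete in parallel, stores each result as a finding that downstream nodes can read, and marks the point where a manager could revise the graph.

```python
from concurrent.futures import ThreadPoolExecutor


def ready_frontier(deps, done):
    """Return unfinished nodes whose prerequisites are all complete."""
    return sorted(n for n, parents in deps.items()
                  if n not in done and all(p in done for p in parents))


def run_dag(deps, run_node, max_workers=4):
    """Execute a task DAG wave by wave, running each frontier in parallel.

    deps maps a node name to the set of node names it depends on.
    run_node(node, findings) stands in for a subagent and returns a finding.
    """
    findings = {}  # manager-side memory passed forward to later nodes
    waves = []     # which nodes ran together, for inspection
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        while len(findings) < len(deps):
            frontier = ready_frontier(deps, findings)
            if not frontier:
                raise ValueError("cycle or missing dependency in task graph")
            # Snapshot so parallel subagents all see the same findings.
            snapshot = dict(findings)
            results = list(pool.map(lambda n: run_node(n, snapshot), frontier))
            findings.update(zip(frontier, results))
            waves.append(frontier)
            # A manager could revise `deps` here (add, cancel, rewrite nodes).
    return findings, waves


# Hypothetical task graph, invented for this example.
deps = {
    "find_flights": set(),
    "find_hotels": set(),
    "compare_prices": {"find_flights", "find_hotels"},
    "book_trip": {"compare_prices"},
}


def fake_subagent(node, findings):
    """Placeholder subagent: reports which earlier findings it could see."""
    return f"{node} saw {sorted(findings)}"


findings, waves = run_dag(deps, fake_subagent)
print(waves)
```

In this sketch the two search nodes run together in the first wave, and `compare_prices` only starts once both findings are available. Waiting for a whole wave before dispatching the next one is a simplification; the abstract describes the manager revising the DAG continuously as findings arrive.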
