Fugu-MT Paper Translation (Abstract): Marco DeepResearch: Unlocking Efficient Deep Research Agents via Verification-Centric Design

Paper overview: Marco DeepResearch: Unlocking Efficient Deep Research Agents via Verification-Centric Design

arxiv url: http://arxiv.org/abs/2603.28376v1
Date: Mon, 30 Mar 2026 12:42:02 GMT
Status: Translation complete
Last updated in system: 2026-03-31 23:18:45.397093
Title: Marco DeepResearch: Unlocking Efficient Deep Research Agents via Verification-Centric Design
Title (reference translation): Marco DeepResearch: Unlocking Efficient Deep Research Agents through Verification-Centric Design
Authors: Bin Zhu, Qianghuai Jia, Tian Lan, Junyang Ren, Feng Gu, Feihu Jiang, Longyue Wang, Zhao Xu, Weihua Luo
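Abstract summary: Marco DeepResearch is a deep research agent optimized with a verification-centric framework design at three levels. It introduces verification mechanisms into graph-based and agent-based QA synthesis to control question difficulty. It uses a verification-driven trajectory synthesis method that injects explicit verification patterns into training trajectories. It also uses Marco DeepResearch itself as a verifier at inference time, which improves performance on challenging questions.
Reference score (independently computed attention score): 39.31356016375221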
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Deep research agents autonomously conduct open-ended investigations, integrating complex information retrieval with multi-step reasoning across diverse sources to solve real-world problems. To sustain this capability on long-horizon tasks, reliable verification is critical during both training and inference. A major bottleneck in existing paradigms stems from the lack of explicit verification mechanisms in QA data synthesis, trajectory construction, and test-time scaling. Errors introduced at each stage propagate downstream and degrade the overall agent performance. To address this, we present Marco DeepResearch, a deep research agent optimized with a verification-centric framework design at three levels: (1) QA Data Synthesis: We introduce verification mechanisms to graph-based and agent-based QA synthesis to control question difficulty while ensuring answers are unique and correct; (2) Trajectory Construction: We design a verification-driven trajectory synthesis method that injects explicit verification patterns into training trajectories; and (3) Test-time scaling: We use Marco DeepResearch itself as a verifier at inference time and effectively improve performance on challenging questions. Extensive experimental results demonstrate that our proposed Marco DeepResearch agent significantly outperforms 8B-scale deep research agents on most challenging benchmarks, such as BrowseComp and BrowseComp-ZH. Crucially, under a maximum budget of 600 tool calls, Marco DeepResearch even surpasses or approaches several 30B-scale agents, like Tongyi DeepResearch-30B.
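Abstract (reference translation): Deep research agents carry out open-ended investigations autonomously, combining complex information retrieval with multi-step reasoning to solve real-world problems. Sustaining this capability on long-horizon tasks requires reliable verification during both training and inference. A major bottleneck in existing paradigms is the lack of explicit verification mechanisms in QA data synthesis, trajectory construction, and test-time scaling. Errors introduced at each stage propagate downstream and degrade overall agent performance. To address this, the paper presents Marco DeepResearch, a deep research agent optimized with a verification-centric framework design at three levels: (1) QA Data Synthesis: verification mechanisms are introduced into graph-based and agent-based QA synthesis to control question difficulty while ensuring that answers are unique and correct; (2) Trajectory Construction: a verification-driven trajectory synthesis method injects explicit verification patterns into training trajectories; and (3) Test-time Scaling: Marco DeepResearch itself is used as a verifier at inference time, which effectively improves performance on challenging questions. The proposed Marco DeepResearch agent significantly outperforms 8B-scale deep research agents on the most challenging benchmarks, such as BrowseComp and BrowseComp-ZH. Notably, under a maximum budget of 600 tool calls, Marco DeepResearch even surpasses or approaches several 30B-scale agents, such as Tongyi DeepResearch-30B.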
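The abstract does not say how the agent is applied as its own verifier at inference time. The following is only a minimal sketch of one plausible reading, assuming a best-of-N loop in which each candidate answer is scored by a verifier and all attempts share one tool-call budget. The names `Attempt`, `run_agent`, `verify`, `max_attempts`, and `accept_threshold` are hypothetical and do not come from the paper; only the 600-call budget is taken from the abstract.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Attempt:
    answer: str
    tool_calls: int  # tool calls spent producing this answer


def verified_best_of_n(
    question: str,
    # Hypothetical agent interface: (question, remaining budget) -> Attempt.
    run_agent: Callable[[str, int], Attempt],
    # Hypothetical verifier interface: (question, answer) -> score in [0, 1].
    verify: Callable[[str, str], float],
    max_tool_calls: int = 600,      # budget figure quoted in the abstract
    max_attempts: int = 4,          # assumption, not from the paper
    accept_threshold: float = 0.8,  # assumption, not from the paper
) -> Optional[Attempt]:
    """Return the highest-scoring attempt found within the shared budget."""
    best: Optional[Attempt] = None
    best_score = -1.0
    remaining = max_tool_calls
    for _ in range(max_attempts):
        if remaining <= 0:
            break  # budget exhausted, stop sampling
        attempt = run_agent(question, remaining)
        remaining -= attempt.tool_calls
        # Verifier cost is not charged to the budget in this sketch.
        score = verify(question, attempt.answer)
        if score > best_score:
            best, best_score = attempt, score
        if score >= accept_threshold:
            break  # verifier is satisfied, stop early
    return best
```

In this sketch the verifier serves two purposes: it ranks the candidates, and it ends sampling early once an answer is judged good enough, so easy questions use less of the budget. Whether the paper's method works this way is not stated in the abstract.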

Related papers