Fugu-MT Paper Translation (Summary): QUEST: Training Frontier Deep Research Agents with Fully Synthetic Tasks

Paper summary: QUEST: Training Frontier Deep Research Agents with Fully Synthetic Tasks

arxiv url: http://arxiv.org/abs/2605.24218v1
Date: Fri, 22 May 2026 20:59:20 GMT
Status: Translation complete
System update date: 2026-05-26 19:50:17.755485
Title: QUEST: Training Frontier Deep Research Agents with Fully Synthetic Tasks
Authors: Jian Xie, Tianhe Lin, Zilu Wang, Yuting Ning, Yuekun Yao, Tianci Xue, Zhehao Zhang, Zhongyang Li, Kai Zhang, Yufan Wu, Shijie Chen, Boyu Gou, Mingzhe Han, Yifei Wang, Vint Lee, Xinpeng Wei, Xiangjun Wang, Yu Su, Huan Sun
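Abstract summary: QUEST is a family of open models designed to handle a wide range of long-horizon search tasks. The paper proposes an effective training recipe combining mid-training, supervised fine-tuning, and reinforcement learning. QUEST incorporates a built-in context management mechanism that enables effective long-horizon reasoning and knowledge synthesis.
Reference score (independently computed attention score): 38.454776684977496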
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Deep research agents extend the role of search engines from retrieving keyword-matched pages to synthesizing knowledge, fundamentally changing how humans interact with information. However, frontier systems remain proprietary, while existing open agents often generalize poorly across different task types, leaving unclear how to train a broadly capable deep research agent. We release QUEST, a family of open models (ranging from 2B to 35B) that serve as general-purpose deep research agents designed to handle a wide range of long-horizon search tasks, with strong capabilities in fact seeking, citation grounding, and report synthesis. To build QUEST, we propose an effective training recipe combining mid-training, supervised fine-tuning, and reinforcement learning. Central to this recipe is a curated data synthesis pipeline based on unified rubric trees, which applies to different task types and enables synthesizing training data with verifiable rewards without human annotation. In addition, QUEST incorporates a built-in context management mechanism that enables effective long-horizon reasoning and knowledge synthesis. Using only 8K synthesized tasks, QUEST approaches or even surpasses frontier closed-source agents across eight deep research benchmarks spanning diverse task types, and achieves the best overall performance among recent open-weight agents. We released everything: models, data, and training scripts.
