Fugu-MT Paper Translation (Abstract): Atom-Searcher: Enhancing Agentic Deep Research via Fine-Grained Atomic Thought Reward

Paper overview: Atom-Searcher: Enhancing Agentic Deep Research via Fine-Grained Atomic Thought Reward

arxiv url: http://arxiv.org/abs/2508.12800v1
Date: Mon, 18 Aug 2025 10:23:10 GMT
Status: Translation complete
Last updated in system: 2025-08-19 14:49:11.240638
Title: Atom-Searcher: Enhancing Agentic Deep Research via Fine-Grained Atomic Thought Reward
Title (reference translation): Atom-Searcher: Enhancing Agentic Deep Research via Fine-Grained Atomic Thought Reward
Authors: Yong Deng, Guoqing Wang, Zhenzhe Ying, Xiaofeng Wu, Jinzhen Lin, Wenwen Xiong, Yuqin Dai, Shuo Yang, Zhanwei Zhang, Qiwen Wang, Yang Qin, Changhua Meng,
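Abstract summary: Large language models (LLMs) exhibit remarkable problem-solving abilities but struggle with complex tasks because their internal knowledge is static. Recent advances in agentic deep research empower LLMs to autonomously reason, search, and synthesize information. We first propose Atomic Thought, a novel LLM thinking paradigm that decomposes reasoning into fine-grained functional units. Building on this, we propose Atom-Searcher, a novel RL framework for agentic deep research that integrates Atomic Thought and Atomic Thought Rewards (ATR).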
Reference score (independently computed attention score): 24.061532713208813
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Large language models (LLMs) exhibit remarkable problem-solving abilities, but struggle with complex tasks due to static internal knowledge. Retrieval-Augmented Generation (RAG) enhances access to external information, yet remains limited in multi-hop reasoning and strategic search due to rigid workflows. Recent advancements in agentic deep research empower LLMs to autonomously reason, search, and synthesize information. However, current approaches relying on outcome-based reinforcement learning (RL) face critical issues such as conflicting gradients and reward sparsity, limiting performance gains and training efficiency. To address these, we first propose Atomic Thought, a novel LLM thinking paradigm that decomposes reasoning into fine-grained functional units. These units are supervised by Reasoning Reward Models (RRMs), which provide Atomic Thought Rewards (ATR) for fine-grained guidance. Building on this, we propose Atom-Searcher, a novel RL framework for agentic deep research that integrates Atomic Thought and ATR. Atom-Searcher uses a curriculum-inspired reward schedule, prioritizing process-level ATR early and transitioning to outcome rewards, accelerating convergence on effective reasoning paths. Experiments on seven benchmarks show consistent improvements over the state-of-the-art. Key advantages include: (1) Atom-Searcher scales computation at test-time. (2) Atomic Thought provides supervision anchors for RRMs, bridging deep research tasks and RRMs. (3) Atom-Searcher exhibits more interpretable, human-like reasoning patterns.
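Abstract (reference translation): Large language models (LLMs) exhibit remarkable problem-solving abilities but struggle with complex tasks because their internal knowledge is static. Retrieval-Augmented Generation (RAG) improves access to external information, but rigid workflows limit it in multi-hop reasoning and strategic search. Recent advances in agentic deep research empower LLMs to autonomously reason, search, and synthesize information. However, current approaches that rely on outcome-based reinforcement learning (RL) face critical issues such as conflicting gradients and reward sparsity, which limit performance gains and training efficiency. To address these issues, we first propose Atomic Thought, a novel LLM thinking paradigm that decomposes reasoning into fine-grained functional units. These units are supervised by Reasoning Reward Models (RRMs), which provide Atomic Thought Rewards (ATR) for fine-grained guidance. Building on this, we propose Atom-Searcher, a novel RL framework for agentic deep research that integrates Atomic Thought and ATR. Atom-Searcher uses a curriculum-inspired reward schedule that prioritizes process-level ATR early and then transitions to outcome rewards, accelerating convergence on effective reasoning paths. Experiments on seven benchmarks show consistent improvements over the state of the art. Key advantages include: (1) Atom-Searcher scales computation at test time. (2) Atomic Thought provides supervision anchors for RRMs, bridging deep research tasks and RRMs. (3) Atom-Searcher exhibits more interpretable, human-like reasoning patterns.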
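
The curriculum-inspired reward schedule described in the abstract (weight process-level ATR heavily early in training, then shift toward the outcome reward) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the linear decay, the starting weight of 0.5, the averaging of per-unit scores, and the names `atr_weight` and `blended_reward` are all assumptions made here for illustration only.

```python
from typing import Sequence


def atr_weight(step: int, total_steps: int,
               w_start: float = 0.5, w_end: float = 0.0) -> float:
    """Hypothetical schedule: linearly decay the ATR weight over training.

    w_start and w_end are illustrative values, not taken from the paper.
    """
    if total_steps <= 0:
        raise ValueError("total_steps must be positive")
    # Clamp training progress to the range [0, 1].
    progress = min(max(step / total_steps, 0.0), 1.0)
    return w_start + (w_end - w_start) * progress


def blended_reward(outcome_reward: float,
                   atomic_thought_rewards: Sequence[float],
                   step: int, total_steps: int) -> float:
    """Mix process-level ATR with the outcome reward for one trajectory."""
    # Assumption: per-unit scores from a reward model are simply averaged.
    if atomic_thought_rewards:
        atr = sum(atomic_thought_rewards) / len(atomic_thought_rewards)
    else:
        atr = 0.0
    w = atr_weight(step, total_steps)
    # Early in training w is large, so ATR dominates; later the
    # outcome reward takes over as w decays toward w_end.
    return w * atr + (1.0 - w) * outcome_reward


if __name__ == "__main__":
    for step in (0, 50, 100):
        print(step, blended_reward(1.0, [0.0, 1.0], step, 100))
```

Under these assumptions the process signal contributes half of the reward at step 0 and nothing by the final step, which matches the qualitative behavior the abstract describes (dense guidance early, outcome-driven optimization later). The actual schedule shape and aggregation used by Atom-Searcher should be taken from the paper itself.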


Related papers