Fugu-MT Paper Translation (Abstract): ParallelSearch: Train your LLMs to Decompose Query and Search Sub-queries in Parallel with Reinforcement Learning

Paper abstract: ParallelSearch: Train your LLMs to Decompose Query and Search Sub-queries in Parallel with Reinforcement Learning

arxiv url: http://arxiv.org/abs/2508.09303v1
Date: Tue, 12 Aug 2025 19:38:21 GMT
Status: Translation complete
System update date: 2025-08-14 20:42:00.676879
Title: ParallelSearch: Train your LLMs to Decompose Query and Search Sub-queries in Parallel with Reinforcement Learning
Title (reference translation): ParallelSearch: Training LLMs to Decompose Queries and Search Sub-queries in Parallel with Reinforcement Learning
Authors: Shu Zhao, Tan Yu, Anbang Xu, Japinder Singh, Aaditya Shukla, Rama Akkiraju
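Abstract summary: Reasoning-augmented search agents such as Search-R1 demonstrate remarkable capabilities in multi-step information retrieval from external knowledge sources. Existing approaches process search queries strictly sequentially, even when handling inherently parallelizable and logically independent comparisons. We propose ParallelSearch, a novel reinforcement learning framework that empowers large language models to recognize parallelizable query structures and execute multiple search operations concurrently.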
Reference score (independently computed attention score): 20.11646932754985
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Reasoning-augmented search agents such as Search-R1, trained via reinforcement learning with verifiable rewards (RLVR), demonstrate remarkable capabilities in multi-step information retrieval from external knowledge sources. These agents address the limitations of their parametric memory by dynamically gathering relevant facts to address complex reasoning tasks. However, existing approaches suffer from a fundamental architectural limitation: they process search queries strictly sequentially, even when handling inherently parallelizable and logically independent comparisons. This sequential bottleneck significantly constrains computational efficiency, particularly for queries that require multiple entity comparisons. To address this critical limitation, we propose ParallelSearch, a novel reinforcement learning framework that empowers large language models (LLMs) to recognize parallelizable query structures and execute multiple search operations concurrently. Our approach introduces dedicated reward functions that incentivize the identification of independent query components while preserving answer accuracy through jointly considering correctness, query decomposition quality, and parallel execution benefits. Comprehensive experiments demonstrate that ParallelSearch outperforms state-of-the-art baselines by an average performance gain of 2.9% across seven question-answering benchmarks. Notably, on parallelizable questions, our method achieves a 12.7% performance improvement while requiring only 69.6% of the LLM calls compared to sequential approaches.
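Abstract (reference translation): Reasoning-augmented search agents such as Search-R1, trained via reinforcement learning with verifiable rewards (RLVR), show remarkable capabilities in multi-step information retrieval from external knowledge sources. These agents address the limits of their parametric memory by dynamically gathering relevant facts to tackle complex reasoning tasks. However, existing approaches suffer from a fundamental architectural limitation: they process search queries strictly sequentially, even when handling inherently parallelizable and logically independent comparisons. This sequential bottleneck significantly limits computational efficiency, particularly for queries that require multiple entity comparisons. To address this limitation, we propose ParallelSearch, a novel reinforcement learning framework that enables large language models (LLMs) to recognize parallelizable query structures and execute multiple search operations concurrently. The proposed method introduces dedicated reward functions that encourage the identification of independent query components while preserving answer accuracy by jointly considering correctness, query decomposition quality, and parallel execution benefits. Comprehensive experiments show that ParallelSearch outperforms state-of-the-art baselines with an average performance gain of 2.9% across seven question-answering benchmarks. Notably, on parallelizable questions, it achieves a 12.7% performance improvement while requiring only 69.6% of the LLM calls compared to sequential approaches.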
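
The contrast the abstract draws between sequential and concurrent sub-query execution can be illustrated with a minimal Python sketch. This is not the authors' implementation: the `search` function, the `FAKE_INDEX` toy data, and the notion of counting "turns" are all hypothetical stand-ins chosen for illustration, and the sketch says nothing about how the model is trained or how the reward functions are defined.

```python
import asyncio

# Hypothetical toy "knowledge source"; stands in for a real search backend.
FAKE_INDEX = {
    "height of Mount Everest": "8849 m",
    "height of K2": "8611 m",
}

async def search(query: str) -> str:
    """Hypothetical search call; the sleep simulates retrieval latency."""
    await asyncio.sleep(0.01)
    return FAKE_INDEX.get(query, "no result")

async def search_sequential(sub_queries):
    """Issue one search per turn: each sub-query waits for the previous one."""
    results = []
    for q in sub_queries:
        results.append(await search(q))
    # Assumption: one search turn is spent per sub-query.
    return results, len(sub_queries)

async def search_parallel(sub_queries):
    """Issue all independent sub-queries concurrently in a single turn."""
    # asyncio.gather returns results in the same order as its arguments.
    results = await asyncio.gather(*(search(q) for q in sub_queries))
    return list(results), 1

def demo():
    # A comparison question ("Which is taller, Mount Everest or K2?")
    # decomposed by hand into two logically independent sub-queries.
    subs = ["height of Mount Everest", "height of K2"]
    seq, seq_turns = asyncio.run(search_sequential(subs))
    par, par_turns = asyncio.run(search_parallel(subs))
    return seq, seq_turns, par, par_turns
```

Calling `demo()` returns the same retrieved facts for both strategies; only the number of search turns differs. In the paper's setting the decomposition is produced by the trained LLM rather than written by hand as it is here.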

Related papers