Fugu-MT Paper Translation (Abstract): Empowering RepoQA-Agent based on Reinforcement Learning Driven by Monte-carlo Tree Search

Paper overview: Empowering RepoQA-Agent based on Reinforcement Learning Driven by Monte-carlo Tree Search

arxiv url: http://arxiv.org/abs/2510.26287v1
Date: Thu, 30 Oct 2025 09:10:36 GMT
Status: Translation complete
System update date: 2025-10-31 16:05:09.733921
Title: Empowering RepoQA-Agent based on Reinforcement Learning Driven by Monte-carlo Tree Search
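Title (reference translation): Strengthening RepoQA-Agent based on reinforcement learning driven by Monte Carlo tree search
Authors: Guochang Li, Yuchen Liu, Zhen Qin, Yunkun Wang, Jianping Zhong, Chen Zhi, Binhua Li, Fei Huang, Yongbin Li, Shuiguang Deng
Abstract summary: We introduce RepoSearch-R1, an agentic reinforcement learning framework driven by Monte Carlo Tree Search. Based on RepoSearch-R1, we construct RepoQA-Agent, designed for repository question-answering tasks.
Reference score (independently computed attention score): 70.63903518295785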
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Repository-level software engineering tasks require large language models (LLMs) to efficiently navigate and extract information from complex codebases through multi-turn tool interactions. Existing approaches face significant limitations: training-free, in-context learning methods struggle to guide agents effectively in tool utilization and decision-making based on environmental feedback, while training-based approaches typically rely on costly distillation from larger LLMs, introducing data compliance concerns in enterprise environments. To address these challenges, we introduce RepoSearch-R1, a novel agentic reinforcement learning framework driven by Monte-carlo Tree Search (MCTS). This approach allows agents to generate diverse, high-quality reasoning trajectories via self-training without requiring model distillation or external supervision. Based on RepoSearch-R1, we construct a RepoQA-Agent specifically designed for repository question-answering tasks. Comprehensive evaluation on repository question-answering tasks demonstrates that RepoSearch-R1 substantially improves answer completeness, by 16.0% over no-retrieval methods and by 19.5% over iterative retrieval methods, and increases training efficiency by 33% compared to general agentic reinforcement learning approaches. Our cold-start training methodology eliminates data compliance concerns while maintaining robust exploration diversity and answer completeness across repository-level reasoning tasks.
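Abstract (reference translation): Repository-level software engineering tasks require large language models (LLMs) to efficiently navigate and extract information from complex codebases through multi-turn tool interactions. Existing training-free, in-context learning methods struggle to guide agents effectively in tool use and in decision-making based on environmental feedback, while training-based approaches typically rely on costly distillation from larger LLMs, which introduces data compliance concerns in enterprise environments. To address these challenges, we introduce RepoSearch-R1, a new agentic reinforcement learning framework driven by Monte Carlo Tree Search (MCTS). This approach lets agents generate diverse, high-quality reasoning trajectories through self-training, without model distillation or external supervision. Based on RepoSearch-R1, we construct RepoQA-Agent, designed for repository question-answering tasks. RepoSearch-R1 improves answer completeness by 16.0% over no-retrieval methods and by 19.5% over iterative retrieval methods, and improves training efficiency by 33% compared with general agentic reinforcement learning approaches. Our cold-start training methodology eliminates data compliance concerns while maintaining robust exploration diversity and answer completeness in repository-level reasoning tasks.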
