Fugu-MT Paper Translation (Abstract): Prompt Optimization via Retrieved Reasoning Assets and Multi-Agent Analysis

Paper overview: Prompt Optimization via Retrieved Reasoning Assets and Multi-Agent Analysis

arxiv url: http://arxiv.org/abs/2510.16635v1
Date: Sat, 18 Oct 2025 20:21:09 GMT
Status: Translation complete
Last updated in system: 2025-10-25 00:56:39.074683
Title: Prompt Optimization via Retrieved Reasoning Assets and Multi-Agent Analysis
Authors: Wonduk Seo, Juhyeon Lee, Junseo Koh, Hyunjin An, Jian Park, Seunghyun Lee, Haihua Chen, Yi Bu
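Abstract summary: We introduce MA-SAPO, a multi-agent framework for score-aware prompt optimization. Compared to prior methods, MA-SAPO explicitly couples evaluation outcomes with structured reasoning to guide systematic edits. By turning evaluation signals into interpretable reasoning chains, MA-SAPO produces prompt refinements that are more transparent, auditable, and controllable.
Reference score (independently computed attention metric): 5.935239028627343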
License: http://creativecommons.org/licenses/by-nc-sa/4.0/
Abstract: Prompt optimization has emerged as an effective alternative to retraining for improving the performance of Large Language Models (LLMs). However, most existing approaches treat evaluation as a black box, relying solely on numerical scores while offering limited insight into why a prompt succeeds or fails. They also depend heavily on trial-and-error refinements, which are difficult to interpret and control. In this paper, we introduce MA-SAPO, a Multi-Agent framework for Score-Aware Prompt Optimization. Compared to prior methods, MA-SAPO explicitly couples evaluation outcomes with structured reasoning to guide systematic edits. The framework specifically consists of two stages: during the Reasoning Phase, agents collaboratively explain metric scores, diagnose weaknesses, and synthesize targeted refinements that are stored as reusable reasoning assets; during the Test Phase, agents retrieve these assets to analyze optimized prompts and apply only evidence-grounded edits. By turning evaluation signals into interpretable reasoning chains, MA-SAPO produces prompt refinements that are more transparent, auditable, and controllable. Experiments on the HelpSteer1/2 benchmarks demonstrate consistent improvements over single-pass prompting, retrieval-augmented baselines, and prior multi-agent strategies, validating the effectiveness of our approach.
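The two-stage flow described in the abstract (store reasoning assets, then retrieve them to justify edits) can be illustrated with a toy sketch. Everything below is an assumption made for illustration and is not taken from the paper: the `ReasoningAsset` schema, the token-overlap (Jaccard) retriever, the function names, and the example scores are all hypothetical, and the LLM agents that would produce the explanations and refinements are replaced by hand-written strings.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class ReasoningAsset:
    """One record saved during the Reasoning Phase (hypothetical schema)."""
    prompt: str
    scores: dict[str, float]  # metric name -> score (made-up values)
    explanation: str          # why the metrics came out as they did
    diagnosis: str            # weakness identified in the prompt
    refinement: str           # targeted edit suggested for that weakness


def tokens(text: str) -> set[str]:
    """Lowercased whitespace tokens; a stand-in for a real retriever's features."""
    return set(text.lower().split())


def jaccard(a: str, b: str) -> float:
    """Token-set overlap in [0, 1]; an assumed similarity, not the paper's."""
    ta, tb = tokens(a), tokens(b)
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0


def retrieve(store: list[ReasoningAsset], query: str, k: int = 2) -> list[ReasoningAsset]:
    """Return the k stored assets whose prompts overlap most with the query."""
    return sorted(store, key=lambda a: jaccard(a.prompt, query), reverse=True)[:k]


def evidence_grounded_edits(store: list[ReasoningAsset], query: str, k: int = 2) -> list[str]:
    """Test Phase stand-in: propose only refinements backed by a retrieved asset."""
    return [a.refinement for a in retrieve(store, query, k) if jaccard(a.prompt, query) > 0]


# Reasoning Phase stand-in: assets written by hand instead of by agents.
store = [
    ReasoningAsset(
        prompt="Summarize the article",
        scores={"helpfulness": 2.0, "verbosity": 4.0},
        explanation="Responses were long relative to what was asked.",
        diagnosis="The prompt sets no length limit.",
        refinement="Add a target length, e.g. three sentences.",
    ),
    ReasoningAsset(
        prompt="Write a Python function to sort a list",
        scores={"correctness": 3.0},
        explanation="Some responses assumed the wrong element type.",
        diagnosis="Input types are unspecified.",
        refinement="Specify input types and expected output.",
    ),
]

query = "Summarize the news article briefly"
print(evidence_grounded_edits(store, query))
# prints ['Add a target length, e.g. three sentences.']
```

In this sketch an edit is proposed only when a stored asset actually overlaps with the new prompt, which is one simple way to read "evidence-grounded"; a real system would presumably use a stronger retriever and let agents analyze the retrieved assets before editing.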
