Fugu-MT Paper Translation (Abstract): Agent-GWO: Collaborative Agents for Dynamic Prompt Optimization in Large Language Models

Paper summary: Agent-GWO: Collaborative Agents for Dynamic Prompt Optimization in Large Language Models

arxiv url: http://arxiv.org/abs/2604.18612v1
Date: Tue, 14 Apr 2026 07:35:37 GMT
Status: Translation complete
System update date: 2026-04-22 22:41:49.355788
Title: Agent-GWO: Collaborative Agents for Dynamic Prompt Optimization in Large Language Models
Title (reference translation): Agent-GWO: Collaborative Agents for Dynamic Prompt Optimization in Large Language Models
Authors: Xudong Wang, Chaoning Zhang, Chenghao Li, Shuxu Chen, Qigan Sun, Jiaquan Zhang, Fachrina Dewi Puspitasari, Tae-Ho Kim, Jiwei Wei, Malu Zhang, Guoqing Wang, Yang Yang, Heng Tao Shen
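Abstract summary: Agent-GWO is a dynamic prompt optimization framework for complex reasoning. The paper shows that Agent-GWO consistently improves accuracy and stability over existing prompt optimization methods.
Reference score (independently computed attention level): 69.55139736609367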
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Large Language Models (LLMs) have demonstrated strong capabilities in complex reasoning tasks, while recent prompting strategies such as Chain-of-Thought (CoT) have further elevated their performance in handling complex logical problems. Despite these advances, high-quality reasoning remains heavily reliant on manual static prompts and is sensitive to decoding configurations and task distributions, leading to performance fluctuations and limited transferability. Existing automatic prompt optimization methods typically adopt single-agent local search, failing to simultaneously optimize prompts and decoding hyperparameters within a unified framework to achieve stable global improvements. To address this limitation, we propose Agent-GWO, a dynamic prompt optimization framework for complex reasoning. Specifically, we unify prompt templates and decoding hyperparameters as inheritable agent configurations. By leveraging the leader-follower mechanism of the Grey Wolf Optimizer (GWO), we automatically select three leader agents ($α$, $β$, and $δ$) to guide the collaborative updates of the remaining agents, enabling iterative convergence toward robust optimal reasoning configurations that can be seamlessly integrated for inference. Extensive experiments on multiple mathematical and hybrid reasoning benchmarks across diverse LLM backbones show that Agent-GWO consistently improves accuracy and stability over existing prompt optimization methods. The code will be released publicly.
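Abstract (reference translation): Large language models (LLMs) show strong capabilities on complex reasoning tasks, and recent prompting strategies such as Chain-of-Thought (CoT) have further raised their performance on complex logical problems. Despite these advances, high-quality reasoning still depends heavily on manual, static prompts and is sensitive to decoding configurations and task distributions, which causes performance fluctuations and limits transferability. Existing automatic prompt optimization methods usually adopt single-agent local search and cannot optimize prompts and decoding hyperparameters together within a unified framework to achieve stable global improvements. To address this limitation, the authors propose Agent-GWO, a dynamic prompt optimization framework for complex reasoning. Specifically, prompt templates and decoding hyperparameters are unified as inheritable agent configurations. By leveraging the leader-follower mechanism of the Grey Wolf Optimizer (GWO), three leader agents ($α$, $β$, and $δ$) are selected automatically to guide the collaborative updates of the remaining agents, enabling iterative convergence toward robust, optimal reasoning configurations that can be integrated seamlessly at inference time. Extensive experiments on multiple mathematical and hybrid reasoning benchmarks across diverse LLM backbones show that Agent-GWO consistently improves accuracy and stability over existing prompt optimization methods. The code will be released publicly.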
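
The leader-follower update described in the abstract can be illustrated with a minimal sketch. It assumes the textbook Grey Wolf Optimizer position update applied to two numeric decoding hyperparameters (temperature and top_p); it is not the authors' implementation, whose code is not part of this listing. The names `toy_score` and `gwo_optimize`, the bounds, and the scoring function (a stand-in for benchmark accuracy) are all hypothetical, and prompt templates, which the abstract also places in each agent configuration, are left out.

```python
import random


def toy_score(config):
    """Hypothetical stand-in for benchmark accuracy (higher is better)."""
    temperature, top_p = config
    return -((temperature - 0.7) ** 2 + (top_p - 0.9) ** 2)


def clip(value, low, high):
    """Keep a hyperparameter inside its allowed range."""
    return max(low, min(high, value))


def gwo_optimize(score, bounds, n_agents=8, n_iters=50, seed=0):
    """Textbook GWO over numeric configurations; needs n_agents >= 3.

    Returns the best configuration seen and the best-so-far score
    recorded at each evaluation round.
    """
    rng = random.Random(seed)
    agents = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_agents)]
    best, best_score = None, float("-inf")
    history = []

    for it in range(n_iters):
        # Rank agents; the top three act as alpha, beta, and delta leaders.
        ranked = sorted(agents, key=score, reverse=True)
        if score(ranked[0]) > best_score:
            best, best_score = list(ranked[0]), score(ranked[0])
        history.append(best_score)
        leaders = ranked[:3]

        # Exploration factor a decays linearly from 2 toward 0.
        a = 2.0 * (1 - it / n_iters)
        new_agents = []
        for x in agents:
            updated = []
            for d, (lo, hi) in enumerate(bounds):
                pulls = []
                for leader in leaders:
                    A = 2 * a * rng.random() - a
                    C = 2 * rng.random()
                    dist = abs(C * leader[d] - x[d])
                    pulls.append(leader[d] - A * dist)
                # Each agent moves to the average of the three leader pulls.
                updated.append(clip(sum(pulls) / 3, lo, hi))
            new_agents.append(updated)
        agents = new_agents

    # Evaluate the final population once more.
    final = max(agents, key=score)
    if score(final) > best_score:
        best, best_score = list(final), score(final)
    history.append(best_score)
    return best, history
```

Calling `gwo_optimize(toy_score, [(0.0, 2.0), (0.0, 1.0)])` returns a `[temperature, top_p]` pair together with the best-so-far score history. In a real setting, the scoring function would run the model on a validation set, which is far more expensive than this toy objective.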

Related papers