Fugu-MT Paper Translation (Abstract): Evolving Excellence: Automated Optimization of LLM-based Agents

Paper summary: Evolving Excellence: Automated Optimization of LLM-based Agents

arxiv url: http://arxiv.org/abs/2512.09108v1
Date: Tue, 09 Dec 2025 20:48:45 GMT
Status: Translation complete
Last updated in system: 2025-12-11 15:14:53.315913
Title: Evolving Excellence: Automated Optimization of LLM-based Agents
Title (reference translation): Evolving Excellence: Automated Optimization of LLM-based Agents
Authors: Paul Brookes, Vardan Voskanyan, Rafail Giavrimis, Matthew Truscott, Mina Ilieva, Chrystalla Pavlou, Alexandru Staicu, Manal Adham, Will Evers-Hood, Jingzhi Gong, Kejia Zhang, Matvey Fedoseev, Vishal Sharma, Roman Bauer, Zheng Wang, Hema Nair, Wei Jie, Tianhua Xu, Aurora Constantin, Leslie Kanthan, Michail Basios
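Abstract summary: We present ARTEMIS, a no-code evolutionary optimization platform that jointly optimizes agent configurations through semantically-aware genetic operators. We evaluate ARTEMIS on four representative agent systems, including the ALE Agent for competitive programming on AtCoder Heuristic Contest. We also evaluate the MathTales-Teacher Agent, powered by a smaller open-source model (Qwen2.5-7B), on GSM8K primary-level mathematics problems, achieving a 22% accuracy improvement.
Reference score (independently calculated attention level): 33.81822162934331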
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Agentic AI systems built on large language models (LLMs) offer significant potential for automating complex workflows, from software development to customer support. However, LLM agents often underperform due to suboptimal configurations: poorly tuned prompts, tool descriptions, and parameters that typically require weeks of manual refinement. Existing optimization methods are either too complex for general use or treat components in isolation, missing critical interdependencies. We present ARTEMIS, a no-code evolutionary optimization platform that jointly optimizes agent configurations through semantically-aware genetic operators. Given only a benchmark script and natural language goals, ARTEMIS automatically discovers configurable components, extracts performance signals from execution logs, and evolves configurations without requiring architectural modifications. We evaluate ARTEMIS on four representative agent systems: the ALE Agent for competitive programming on AtCoder Heuristic Contest, achieving a 13.6% improvement in acceptance rate; the Mini-SWE Agent for code optimization on SWE-Perf, with a statistically significant 10.1% performance gain; and the CrewAI Agent for cost and mathematical reasoning on Math Odyssey, achieving a statistically significant 36.9% reduction in the number of tokens required for evaluation. We also evaluate the MathTales-Teacher Agent powered by a smaller open-source model (Qwen2.5-7B) on GSM8K primary-level mathematics problems, achieving a 22% accuracy improvement and demonstrating that ARTEMIS can optimize agents based on both commercial and local models.
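Abstract (reference translation): Agentic AI systems built on large language models (LLMs) hold great potential for automating complex workflows, from software development to customer support. However, LLM agents often underperform because of suboptimal configurations, such as poorly tuned prompts, tool descriptions, and parameters that usually take weeks of manual refinement. Existing optimization methods are either too complex for general use or treat components in isolation and so miss important interdependencies. We present ARTEMIS, a no-code evolutionary optimization platform that jointly optimizes agent configurations through semantically-aware genetic operators. Given only a benchmark script and natural language goals, ARTEMIS automatically discovers configurable components, extracts performance signals from execution logs, and evolves configurations without requiring architectural changes. We evaluate ARTEMIS on four representative agent systems: the ALE Agent for competitive programming on AtCoder Heuristic Contest, with a 13.6% improvement in acceptance rate; the Mini-SWE Agent for code optimization on SWE-Perf, with a statistically significant 10.1% performance gain; and the CrewAI Agent for cost and mathematical reasoning on Math Odyssey, with a statistically significant 36.9% reduction in the number of tokens required for evaluation. We also evaluate the MathTales-Teacher Agent, based on a smaller open-source model (Qwen2.5-7B), on GSM8K primary-level mathematics problems, achieving a 22% accuracy improvement and demonstrating that ARTEMIS can optimize agents based on both commercial and local models.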
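
Illustrative sketch (not from the paper): the abstract describes evolving agent configurations against a benchmark score. The Python below is a minimal, generic genetic-algorithm loop over a toy configuration space, included only to make that idea concrete. Every name in it (SEARCH_SPACE, toy_benchmark, evolve, and the configuration keys) is a hypothetical stand-in, the fitness function is invented for the example, and the operators are plain random mutation and uniform crossover rather than the semantically-aware operators ARTEMIS is said to use.

```python
import random

# Hypothetical search space; none of these keys or values come from the paper.
SEARCH_SPACE = {
    "prompt_style": ["terse", "step_by_step", "with_examples"],
    "temperature": [0.0, 0.3, 0.7, 1.0],
    "max_tool_calls": [1, 3, 5],
}


def toy_benchmark(config):
    """Invented stand-in for a real benchmark script; higher is better."""
    style_bonus = {"terse": 0.2, "step_by_step": 0.6, "with_examples": 0.5}
    score = style_bonus[config["prompt_style"]]
    score -= abs(config["temperature"] - 0.3)
    # Interaction term: two components only pay off together, which is
    # why tuning each component in isolation can miss the best setting.
    if config["prompt_style"] == "step_by_step" and config["max_tool_calls"] >= 3:
        score += 0.3
    return score


def random_config(rng):
    """Sample one configuration uniformly from the search space."""
    return {key: rng.choice(values) for key, values in SEARCH_SPACE.items()}


def mutate(config, rng):
    """Resample a single randomly chosen component."""
    child = dict(config)
    key = rng.choice(list(SEARCH_SPACE))
    child[key] = rng.choice(SEARCH_SPACE[key])
    return child


def crossover(a, b, rng):
    """Uniform crossover: each component comes from one of the two parents."""
    return {key: rng.choice([a[key], b[key]]) for key in SEARCH_SPACE}


def evolve(generations=20, population_size=8, seed=0):
    """Evolve configurations jointly, keeping the top half each generation."""
    rng = random.Random(seed)
    population = [random_config(rng) for _ in range(population_size)]
    for _ in range(generations):
        ranked = sorted(population, key=toy_benchmark, reverse=True)
        parents = ranked[: population_size // 2]  # elitism
        children = [
            mutate(crossover(rng.choice(parents), rng.choice(parents), rng), rng)
            for _ in range(population_size - len(parents))
        ]
        population = parents + children
    return max(population, key=toy_benchmark)


best = evolve()
print(best, toy_benchmark(best))
```

Because the top half of each generation is carried over unchanged, the best score in the population never decreases from one generation to the next. In a real setting the call to toy_benchmark would be replaced by running the agent on an actual benchmark, which is far more expensive per evaluation.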


Related papers