Fugu-MT Paper Translation (Summary): APEX: Automated Prompt Engineering eXpert with Dynamic Data Selection

Paper summary: APEX: Automated Prompt Engineering eXpert with Dynamic Data Selection

arxiv url: http://arxiv.org/abs/2606.11459v1
Date: Tue, 09 Jun 2026 21:22:06 GMT
Status: Translation complete
Last updated in system: 2026-06-11 16:42:38.185401
Title: APEX: Automated Prompt Engineering eXpert with Dynamic Data Selection
Authors: Fei Wang, Si Si, Cho-Jui Hsieh, Inderjit S. Dhillon
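Abstract summary: Large Language Models are highly sensitive to prompt formulation, necessitating automatic prompt optimization to unlock their full potential. Current methods treat the development dataset as a static benchmark, wasting significant compute budget on uninformative data. This work introduces APEX (Automatic Prompt Engineering eXpert).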
Reference score (independently computed attention score): 60.504476571531
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Large Language Models are highly sensitive to prompt formulation, necessitating automatic prompt optimization to unlock their full potential. While evolutionary algorithms have emerged as the dominant paradigm, they suffer from a critical bottleneck: data efficiency. Current methods treat the development dataset as a static benchmark, wasting significant compute budget on uninformative data. In this work, we introduce APEX (Automatic Prompt Engineering eXpert), a novel framework that optimizes the data usage alongside the prompt search. APEX dynamically stratifies the dataset into Easy, Hard, and Mixed tiers based on the optimization lineage. By prioritizing the Mixed tier, which identifies the data where the LLM has mixed performance, we identify two high-leverage subsets: the addressable frontier for generating informative mutations and the rank-sensitive frontier for distinguishing candidate quality. We evaluate APEX across three diverse benchmarks: IFBench, SimpleQA Verified, and FACTS Grounding. Under a fixed budget of 5,000 evaluation calls, due to its data efficiency, APEX outperforms the initial prompt by an average of 11.2% on Gemini 2.5 Flash and 6.8% on Gemma 3 27B, demonstrating that a data-centric approach is key to efficient and effective prompt optimization.
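The tiering idea in the abstract can be illustrated with a minimal sketch. The abstract does not state how Easy, Hard, and Mixed are actually decided, so the rule used here is an assumption: an example is Easy if every prompt in the lineage passed it, Hard if every prompt failed it, and Mixed otherwise. The function names `stratify` and `select_batch`, the boolean pass/fail encoding, and the Mixed-then-Hard-then-Easy fallback order are all hypothetical, and the sketch does not model the addressable or rank-sensitive frontiers.

```python
from typing import Dict, List


def stratify(results: Dict[str, List[bool]]) -> Dict[str, List[str]]:
    """Split examples into tiers from per-example pass/fail histories.

    `results` maps an example id to the outcomes (True = pass) obtained by
    the prompts evaluated so far. The all-pass / all-fail / otherwise rule
    is an assumption for illustration, not the paper's stated criterion.
    """
    tiers: Dict[str, List[str]] = {"easy": [], "hard": [], "mixed": []}
    for example_id, outcomes in results.items():
        if not outcomes:
            continue  # no evaluations yet, so no evidence for any tier
        if all(outcomes):
            tiers["easy"].append(example_id)
        elif not any(outcomes):
            tiers["hard"].append(example_id)
        else:
            tiers["mixed"].append(example_id)
    return tiers


def select_batch(tiers: Dict[str, List[str]], k: int) -> List[str]:
    """Pick up to k examples, taking the Mixed tier first.

    The fallback order (Hard, then Easy) is a hypothetical choice.
    """
    ordered = tiers["mixed"] + tiers["hard"] + tiers["easy"]
    return ordered[:k]


if __name__ == "__main__":
    history = {
        "ex1": [True, True],    # every prompt passed
        "ex2": [False, False],  # every prompt failed
        "ex3": [True, False],   # prompts disagree
    }
    tiers = stratify(history)
    print(tiers)
    print(select_batch(tiers, 2))
```

Under this assumed rule, only the Mixed tier contains examples on which candidate prompts disagree, which is why spending a fixed evaluation budget there first is more informative than re-scoring examples that all prompts pass or all prompts fail.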
