Fugu-MT Paper Translation (Abstract): DelvePO: Direction-Guided Self-Evolving Framework for Flexible Prompt Optimization

Paper overview: DelvePO: Direction-Guided Self-Evolving Framework for Flexible Prompt Optimization

arxiv url: http://arxiv.org/abs/2510.18257v1
Date: Tue, 21 Oct 2025 03:28:53 GMT
Status: Translation complete
Last updated in system: 2025-10-25 03:08:12.840359
Title: DelvePO: Direction-Guided Self-Evolving Framework for Flexible Prompt Optimization
Title (reference translation): DelvePO: A Direction-Guided Self-Evolving Framework for Flexible Prompt Optimization
Authors: Tao Tao, Guanghui Zhu, Lang Guo, Hongyi Chen, Chunfeng Yuan, Yihua Huang
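Abstract summary: We propose a task-agnostic framework that optimizes prompts in a self-evolving manner. In our framework, we decouple prompts into different components in order to explore the impact that different factors have on various tasks. DelvePO consistently outperforms previous SOTA methods under identical experimental settings.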
Reference score (independently computed attention score): 24.65474871019772
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Prompt Optimization has emerged as a crucial approach owing to its ability to steer Large Language Models (LLMs) to solve various tasks. However, current works mainly rely on the random rewriting ability of LLMs, and the optimization process generally focuses on specific influencing factors, which makes it easy to fall into local optima. Moreover, the performance of the optimized prompt is often unstable, which limits its transferability across different tasks. To address these challenges, we propose $\textbf{DelvePO}$ ($\textbf{D}$irection-Guid$\textbf{e}$d Se$\textbf{l}$f-E$\textbf{v}$olving Framework for Fl$\textbf{e}$xible $\textbf{P}$rompt $\textbf{O}$ptimization), a task-agnostic framework that optimizes prompts in a self-evolving manner. In our framework, we decouple prompts into different components that can be used to explore the impact that different factors may have on various tasks. On this basis, we introduce working memory, through which LLMs can alleviate the deficiencies caused by their own uncertainties and further obtain key insights to guide the generation of new prompts. We conduct extensive experiments on different tasks covering various domains with both open- and closed-source LLMs, including DeepSeek-R1-Distill-Llama-8B, Qwen2.5-7B-Instruct and GPT-4o-mini. Experimental results show that DelvePO consistently outperforms previous SOTA methods under identical experimental settings, demonstrating its effectiveness and transferability across different tasks.
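Abstract (reference translation): Prompt Optimization has emerged as an important approach because of its ability to steer large language models to solve a variety of tasks. However, current research relies mainly on the random rewriting ability of LLMs, and the optimization process generally concentrates on specific influencing factors, so it easily falls into local optima. In addition, the performance of the optimized prompt is often unstable, which limits its transferability across different tasks. To address these challenges, we propose $\textbf{DelvePO}$ ($\textbf{D}$irection-Guid$\textbf{e}$d Se$\textbf{l}$f-E$\textbf{v}$olving Framework for Fl$\textbf{e}$xible $\textbf{P}$rompt $\textbf{O}$ptimization), a task-agnostic framework that optimizes prompts in a self-evolving manner. In our framework, we decouple prompts into different components in order to examine the impact that different factors have on various tasks. On this basis, we introduce working memory, through which LLMs can mitigate the deficiencies caused by their own uncertainty and obtain key insights that guide the generation of new prompts. We conducted extensive experiments on tasks covering various domains with both open- and closed-source LLMs, including DeepSeek-R1-Distill-Llama-8B, Qwen2.5-7B-Instruct, and GPT-4o-mini. The experimental results show that DelvePO consistently outperforms previous SOTA methods under identical conditions, demonstrating its effectiveness and transferability.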

Related papers