AutoPDL: Automatic Prompt Optimization for LLM Agents
- URL: http://arxiv.org/abs/2504.04365v1
- Date: Sun, 06 Apr 2025 05:30:10 GMT
- Title: AutoPDL: Automatic Prompt Optimization for LLM Agents
- Authors: Claudio Spiess, Mandana Vaziri, Louis Mandel, Martin Hirzel,
- Abstract summary: This paper proposes AutoPDL, an automated approach to discover good prompting agent configurations.<n>We introduce a library implementing the PDL prompt programming language.<n>We show consistent accuracy gains ($9.5pm17.5$ percentage points), up to 68.9pp, and reveal that selected prompting strategies vary across models and tasks.
- Score: 1.715270928578365
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The performance of large language models (LLMs) depends on how they are prompted, with choices spanning both the high-level prompting pattern (e.g., Zero-Shot, CoT, ReAct, ReWOO) and the specific prompt content (instructions and few-shot demonstrations). Manually tuning this combination is tedious, error-prone, and non-transferable across LLMs or tasks. Therefore, this paper proposes AutoPDL, an automated approach to discover good LLM agent configurations. Our method frames this as a structured AutoML problem over a combinatorial space of agentic and non-agentic prompting patterns and demonstrations, using successive halving to efficiently navigate this space. We introduce a library implementing common prompting patterns using the PDL prompt programming language. AutoPDL solutions are human-readable, editable, and executable PDL programs that use this library. This approach also enables source-to-source optimization, allowing human-in-the-loop refinement and reuse. Evaluations across three tasks and six LLMs (ranging from 8B to 70B parameters) show consistent accuracy gains ($9.5\pm17.5$ percentage points), up to 68.9pp, and reveal that selected prompting strategies vary across models and tasks.
Related papers
- LLM-AutoDiff: Auto-Differentiate Any LLM Workflow [58.56731133392544]
We introduce LLM-AutoDiff: a novel framework for Automatic Prompt Engineering (APE)<n>LLMs-AutoDiff treats each textual input as a trainable parameter and uses a frozen backward engine to generate feedback-akin to textual gradients.<n>It consistently outperforms existing textual gradient baselines in both accuracy and training cost.
arXiv Detail & Related papers (2025-01-28T03:18:48Z) - Interactive and Expressive Code-Augmented Planning with Large Language Models [62.799579304821826]
Large Language Models (LLMs) demonstrate strong abilities in common-sense reasoning and interactive decision-making.
Recent techniques have sought to structure LLM outputs using control flow and other code-adjacent techniques to improve planning performance.
We propose REPL-Plan, an LLM planning approach that is fully code-expressive and dynamic.
arXiv Detail & Related papers (2024-11-21T04:23:17Z) - AutoML-Agent: A Multi-Agent LLM Framework for Full-Pipeline AutoML [56.565200973244146]
Automated machine learning (AutoML) accelerates AI development by automating tasks in the development pipeline.
Recent works have started exploiting large language models (LLM) to lessen such burden.
This paper proposes AutoML-Agent, a novel multi-agent framework tailored for full-pipeline AutoML.
arXiv Detail & Related papers (2024-10-03T20:01:09Z) - LLMmap: Fingerprinting For Large Language Models [15.726286532500971]
With as few as 8 interactions, LLMmap can accurately identify 42 different LLM versions with over 95% accuracy.
We discuss potential mitigations and demonstrate that, against resourceful adversaries, effective countermeasures may be challenging or even unrealizable.
arXiv Detail & Related papers (2024-07-22T17:59:45Z) - RePrompt: Planning by Automatic Prompt Engineering for Large Language Models Agents [27.807695570974644]
We propose a novel method, textscRePrompt, which does agradient descent"-like approach to optimize the step-by-step instructions in the prompts given to LLM agents.
By leveraging intermediate feedback, textscRePrompt can optimize the prompt without the need for a final solution checker.
arXiv Detail & Related papers (2024-06-17T01:23:11Z) - PRompt Optimization in Multi-Step Tasks (PROMST): Integrating Human Feedback and Heuristic-based Sampling [20.0605311279483]
We introduce PRompt Optimization in Multi-Step Tasks (PROMST)
It incorporates human-designed feedback rules to automatically offer direct suggestions for improvement.
It significantly outperforms both human-engineered prompts and several other prompt optimization methods across 11 representative multi-step tasks.
arXiv Detail & Related papers (2024-02-13T16:38:01Z) - Automating the Generation of Prompts for LLM-based Action Choice in PDDL Planning [59.543858889996024]
Large language models (LLMs) have revolutionized a large variety of NLP tasks.<n>We show how to leverage an LLM to automatically generate NL prompts from PDDL input.
arXiv Detail & Related papers (2023-11-16T11:55:27Z) - LLM-Pruner: On the Structural Pruning of Large Language Models [65.02607075556742]
Large language models (LLMs) have shown remarkable capabilities in language understanding and generation.
We tackle the compression of LLMs within the bound of two constraints: being task-agnostic and minimizing the reliance on the original training dataset.
Our method, named LLM-Pruner, adopts structural pruning that selectively removes non-critical coupled structures.
arXiv Detail & Related papers (2023-05-19T12:10:53Z) - Guiding Large Language Models via Directional Stimulus Prompting [114.84930073977672]
We introduce Directional Stimulus Prompting, a novel framework for guiding black-box large language models (LLMs) toward specific desired outputs.
Instead of directly adjusting LLMs, our method employs a small tunable policy model to generate an auxiliary directional stimulus prompt for each input instance.
arXiv Detail & Related papers (2023-02-22T17:44:15Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.