SIPDO: Closed-Loop Prompt Optimization via Synthetic Data Feedback
- URL: http://arxiv.org/abs/2505.19514v2
- Date: Sun, 22 Jun 2025 05:21:13 GMT
- Title: SIPDO: Closed-Loop Prompt Optimization via Synthetic Data Feedback
- Authors: Yaoning Yu, Ye Yu, Kai Wei, Haojing Luo, Haohan Wang
- Abstract summary: We introduce SIPDO (Self-Improving Prompts through Data-Augmented Optimization), a closed-loop framework for prompt learning. SIPDO couples a synthetic data generator with a prompt optimizer, where the generator produces new examples that reveal current prompt weaknesses and the optimizer refines the prompt in response. This feedback-driven loop enables systematic improvement of prompt performance without assuming access to external supervision or new tasks.
- Score: 17.851957960438483
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Prompt quality plays a critical role in the performance of large language models (LLMs), motivating a growing body of work on prompt optimization. Most existing methods optimize prompts over a fixed dataset, assuming static input distributions and offering limited support for iterative improvement. We introduce SIPDO (Self-Improving Prompts through Data-Augmented Optimization), a closed-loop framework for prompt learning that integrates synthetic data generation into the optimization process. SIPDO couples a synthetic data generator with a prompt optimizer, where the generator produces new examples that reveal current prompt weaknesses and the optimizer incrementally refines the prompt in response. This feedback-driven loop enables systematic improvement of prompt performance without assuming access to external supervision or new tasks. Experiments across question answering and reasoning benchmarks show that SIPDO outperforms standard prompt tuning methods, highlighting the value of integrating data synthesis into prompt learning workflows.
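As a purely illustrative reading of this loop, the sketch below shows one way the generator-optimizer feedback cycle could be wired together. It is not the authors' implementation: generate_hard_examples, evaluate_prompt, and refine_prompt are hypothetical stand-ins for LLM-backed components, and the early-stopping rule is an assumption.

    from typing import Callable, List

    def sipdo_style_loop(
        prompt: str,
        seed_examples: List[str],
        generate_hard_examples: Callable[[str, List[str]], List[str]],
        evaluate_prompt: Callable[[str, str], bool],
        refine_prompt: Callable[[str, List[str]], str],
        rounds: int = 5,
    ) -> str:
        """Closed loop: synthesize examples that expose prompt weaknesses, then refine."""
        examples = list(seed_examples)
        for _ in range(rounds):
            # Generator step: produce new inputs targeting the current prompt's weaknesses.
            examples.extend(generate_hard_examples(prompt, examples))
            # Evaluation step: collect the examples the current prompt still fails on.
            failures = [ex for ex in examples if not evaluate_prompt(prompt, ex)]
            if not failures:
                break  # no remaining weaknesses surfaced by the synthetic data
            # Optimizer step: rewrite the prompt in response to the observed failures.
            prompt = refine_prompt(prompt, failures)
        return prompt

In practice each callable would wrap an LLM call; they are left abstract here so the sketch stays self-contained.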
Related papers
- Promptomatix: An Automatic Prompt Optimization Framework for Large Language Models [72.4723784999432]
Large Language Models (LLMs) perform best with well-crafted prompts, yet prompt engineering remains manual, inconsistent, and inaccessible to non-experts. Promptomatix transforms natural language task descriptions into high-quality prompts without requiring manual tuning or domain expertise. The system analyzes user intent, generates synthetic training data, selects prompting strategies, and refines prompts using cost-aware objectives.
arXiv Detail & Related papers (2025-07-17T18:18:20Z)
- ORPP: Self-Optimizing Role-playing Prompts to Enhance Language Model Capabilities [64.24517317344959]
High-quality prompts are crucial for eliciting outstanding performance from large language models on complex tasks. We propose ORPP, a framework that enhances model performance by optimizing and generating role-playing prompts. We show that ORPP not only matches but in most cases surpasses existing mainstream prompt optimization methods in terms of performance.
arXiv Detail & Related papers (2025-06-03T05:51:35Z)
- ADO: Automatic Data Optimization for Inputs in LLM Prompts [36.850626629231705]
This study explores a novel approach to enhance the performance of Large Language Models (LLMs) through the optimization of input data within prompts. We introduce a two-pronged strategy for input data optimization: content engineering and structural reformulation.
arXiv Detail & Related papers (2025-02-17T04:50:41Z)
- In-context Demonstration Matters: On Prompt Optimization for Pseudo-Supervision Refinement [71.60563181678323]
Large language models (LLMs) have achieved great success across diverse tasks, and fine-tuning is sometimes needed to further enhance generation quality. To handle these challenges, a direct solution is to generate "high-confidence" data from unsupervised downstream tasks. We propose a novel approach, the pseudo-supervised demonstrations aligned prompt optimization (PAPO) algorithm, which jointly refines both the prompt and the overall pseudo-supervision.
arXiv Detail & Related papers (2024-10-04T03:39:28Z)
- QPO: Query-dependent Prompt Optimization via Multi-Loop Offline Reinforcement Learning [58.767866109043055]
We introduce Query-dependent Prompt Optimization (QPO), which iteratively fine-tunes a small pretrained language model to generate optimal prompts tailored to the input queries (a toy sketch of the query-dependent selection idea appears after this list).
We derive insights from offline prompting demonstration data, which already exists in large quantities as a by-product of benchmarking diverse prompts on open-sourced tasks.
Experiments on various LLM scales and diverse NLP and math tasks demonstrate the efficacy and cost-efficiency of our method in both zero-shot and few-shot scenarios.
arXiv Detail & Related papers (2024-08-20T03:06:48Z)
- FIPO: Free-form Instruction-oriented Prompt Optimization with Preference Dataset and Modular Fine-tuning Schema [36.65009632307124]
We propose Free-form Instruction-oriented Prompt Optimization (FIPO) to improve task performance of large language models (LLMs). FIPO uses a modular APO template that dynamically integrates the naive task instruction, optional instruction responses, and optional ground truth to produce finely optimized prompts. We validate the FIPO framework across five public benchmarks and six testing models.
arXiv Detail & Related papers (2024-02-19T03:56:44Z)
- Intent-based Prompt Calibration: Enhancing prompt optimization with synthetic boundary cases [2.6159111710501506]
We introduce a new method for automatic prompt engineering, using a calibration process that iteratively refines the prompt to match the user intent.
We demonstrate the effectiveness of our method with respect to strong proprietary models on real-world tasks such as moderation and generation.
arXiv Detail & Related papers (2024-02-05T15:28:43Z)
- Query-Dependent Prompt Evaluation and Optimization with Offline Inverse RL [62.824464372594576]
We aim to enhance the arithmetic reasoning ability of Large Language Models (LLMs) through zero-shot prompt optimization.
We identify a previously overlooked objective of query dependency in such optimization.
We introduce Prompt-OIRL, which harnesses offline inverse reinforcement learning to draw insights from offline prompting demonstration data.
arXiv Detail & Related papers (2023-09-13T01:12:52Z)
- Robust Prompt Optimization for Large Language Models Against Distribution Shifts [80.6757997074956]
Large Language Models (LLMs) have demonstrated significant ability in various Natural Language Processing tasks.
We propose a new problem of robust prompt optimization for LLMs against distribution shifts.
This problem requires that a prompt optimized over a labeled source group simultaneously generalizes to an unlabeled target group.
arXiv Detail & Related papers (2023-05-23T11:30:43Z)
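For the query-dependent line of work above (QPO and Prompt-OIRL), the toy sketch below illustrates only the simplest form of the idea: choosing a prompt per query from offline demonstration logs of (query, prompt, score) records. The actual methods go further, fine-tuning a small language model (QPO) or learning an offline inverse-RL reward model (Prompt-OIRL) so that selection generalizes to unseen queries; the data layout and function name here are assumptions made for illustration.

    from typing import Dict, List, Tuple

    def best_prompt_per_query(
        offline_log: List[Tuple[str, str, float]],
    ) -> Dict[str, str]:
        """Return, for each query, the prompt with the highest logged score.

        offline_log holds (query, prompt, score) records, e.g. gathered as a
        by-product of benchmarking many prompts on open-source tasks.
        """
        best: Dict[str, Tuple[str, float]] = {}
        for query, prompt, score in offline_log:
            if query not in best or score > best[query][1]:
                best[query] = (prompt, score)
        return {q: p for q, (p, _) in best.items()}

    # Toy usage: two queries, two candidate prompts each.
    log = [
        ("What is 17 * 24?", "Answer directly.", 0.4),
        ("What is 17 * 24?", "Think step by step, then answer.", 0.9),
        ("Name the capital of France.", "Answer directly.", 0.8),
        ("Name the capital of France.", "Think step by step, then answer.", 0.7),
    ]
    print(best_prompt_per_query(log))  # maps each query to its best-scoring prompt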