Supervisory Prompt Training
- URL: http://arxiv.org/abs/2403.18051v1
- Date: Tue, 26 Mar 2024 19:08:20 GMT
- Title: Supervisory Prompt Training
- Authors: Jean Ghislain Billa, Min Oh, Liang Du
- Abstract summary: We propose a novel approach, Supervisory Prompt Training (SPT).
SPT automates the generation of highly effective prompts using a dual Large Language Model (LLM) system.
In this system, one LLM, the generator, performs a task while the other, the corrector, provides feedback and generates improved prompts.
- Score: 2.0431551512846244
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: The performance of Large Language Models (LLMs) relies heavily on the quality of prompts, which are often manually engineered and task-specific, making them costly and non-scalable. We propose a novel approach, Supervisory Prompt Training (SPT). SPT automates the generation of highly effective prompts using a dual LLM system. In this system, one LLM, the generator, performs a task while the other, the corrector, provides feedback and generates improved prompts. In contrast to earlier techniques, both the generator and corrector collaboratively and continuously improve their prompts over time. We also introduce the concept of impact scores to measure the sentence-level effectiveness of the prompts. Our method was evaluated on four benchmarks that measure the level of hallucinations in LLMs. Notably, we were able to increase the accuracy of GPT-4 on GSM8K from 65.8% to 94.1% (an increase of 28.3 percentage points). SPT advances LLMs by refining prompts to enhance performance and reduce hallucinations, offering an efficient and scalable alternative to traditional model fine-tuning.
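The generator/corrector loop described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the names `spt_loop`, `accuracy`, `toy_generator`, and `toy_corrector` are hypothetical, the LLMs are replaced by plain Python functions, and the prompt templates are invented for illustration. As a simplification, only the generator's prompt is updated here, whereas the abstract states that both the generator and the corrector improve their prompts over time.

```python
from typing import Callable, List, Tuple

# Assumption: an "LLM" is any function mapping an input string to an output string.
LLM = Callable[[str], str]
Example = Tuple[str, str]  # (question, expected answer)


def ask(generator: LLM, prompt: str, question: str) -> str:
    """Query the generator with a task prompt followed by one question."""
    return generator(f"{prompt}\n\nQ: {question}\nA:").strip()


def accuracy(generator: LLM, prompt: str, data: List[Example]) -> float:
    """Fraction of examples the generator answers correctly under `prompt`."""
    return sum(ask(generator, prompt, q) == a for q, a in data) / len(data)


def spt_loop(generator: LLM, corrector: LLM, prompt: str,
             data: List[Example], rounds: int = 3) -> Tuple[str, float]:
    """Iteratively ask the corrector for a better generator prompt.

    A candidate prompt is kept only if it raises accuracy on `data`.
    """
    best_prompt, best_acc = prompt, accuracy(generator, prompt, data)
    for _ in range(rounds):
        # Collect the examples the generator still gets wrong.
        mistakes = [(q, a) for q, a in data if ask(generator, best_prompt, q) != a]
        if not mistakes:
            break
        report = "\n".join(f"Q: {q} (expected {a})" for q, a in mistakes)
        candidate = corrector(
            f"Current prompt:\n{best_prompt}\n\nFailed cases:\n{report}\n\n"
            "Write an improved prompt."
        ).strip()
        cand_acc = accuracy(generator, candidate, data)
        if cand_acc > best_acc:
            best_prompt, best_acc = candidate, cand_acc
    return best_prompt, best_acc


def toy_generator(text: str) -> str:
    """Stand-in model: adds two integers only when told to work step by step."""
    question = text.rsplit("Q: ", 1)[1].split("\nA:")[0]
    if "step by step" not in text:
        return "0"
    left, right = question.split("+")
    return str(int(left) + int(right))


def toy_corrector(text: str) -> str:
    """Stand-in model: always proposes the same revised prompt."""
    return "Think step by step before answering."


data = [("2+3", "5"), ("4+4", "8")]
best_prompt, best_acc = spt_loop(toy_generator, toy_corrector,
                                 "Answer the question.", data)
print(best_prompt, best_acc)
```

In a real setting the two stand-in functions would wrap calls to actual models, and the accept-if-better rule would normally be evaluated on held-out data rather than on the examples used to produce the feedback.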
Related papers
- Iterative Prompting with Persuasion Skills in Jailbreaking Large Language Models [2.1511703382556657]
This study exploits large language models (LLMs) with an iterative prompting technique.
We analyze the response patterns of LLMs, including GPT-3.5, GPT-4, LLaMa2, Vicuna, and ChatGLM.
Persuading strategies enhance prompt effectiveness while maintaining consistency with malicious intent.
arXiv Detail & Related papers (2025-03-26T08:40:46Z)
- Prompt Alchemy: Automatic Prompt Refinement for Enhancing Code Generation [19.745848581060528]
Prochemy is an innovative method for automatically refining prompts to boost code generation.
It iteratively refines prompts based on model performance, using an optimized final prompt for improved consistency across tasks.
For code translation, Prochemy boosts GPT-4o's Java-to-Python (AVATAR) performance from 74.5 to 84.1 (+12.9%) and Python-to-Java from 66.8 to 78.2 (+17.1%).
arXiv Detail & Related papers (2025-03-14T04:53:03Z)
- The Prompt Alchemist: Automated LLM-Tailored Prompt Optimization for Test Case Generation [17.064672221710307]
Large Language Models (LLMs) can generate useful test cases for given source code.
The existing work primarily relies on human-written plain prompts.
arXiv Detail & Related papers (2025-01-02T16:30:05Z)
- GReaTer: Gradients over Reasoning Makes Smaller Language Models Strong Prompt Optimizers [52.17222304851524]
We introduce GReaTer, a novel prompt optimization technique that directly incorporates gradient information over task-specific reasoning.
By utilizing task loss gradients, GReaTer enables self-optimization of prompts for open-source, lightweight language models.
GReaTer consistently outperforms previous state-of-the-art prompt optimization methods.
arXiv Detail & Related papers (2024-12-12T20:59:43Z)
- Learning from Contrastive Prompts: Automated Optimization and Adaptation [7.455360923031003]
We propose the Learning from Contrastive Prompts (LCP) framework to enhance prompt optimization and adaptation.
LCP employs contrastive learning to generate effective prompts by analyzing patterns in good and bad prompt examples.
Our evaluation on the Big-Bench Hard dataset shows that LCP has a win rate of over 76% over existing methods in prompt optimization.
arXiv Detail & Related papers (2024-09-23T16:47:23Z)
- Self-Instructed Derived Prompt Generation Meets In-Context Learning: Unlocking New Potential of Black-Box LLMs [30.333277284839053]
Large language models (LLMs) have shown success in generating high-quality responses.
Existing methods to enhance response quality often involve a prompt refinement model.
We introduce a self-instructed in-context learning framework that empowers LLMs to deliver more effective responses.
arXiv Detail & Related papers (2024-09-03T02:42:39Z)
- Superposition Prompting: Improving and Accelerating Retrieval-Augmented Generation [22.124234811959532]
Large language models (LLMs) exhibit significant drawbacks when processing long contexts.
We propose a novel RAG prompting methodology, which can be directly applied to pre-trained transformer-based LLMs.
We demonstrate that our method enhances time efficiency across a variety of question-answering benchmarks.
arXiv Detail & Related papers (2024-04-10T11:03:17Z)
- Efficient Prompting Methods for Large Language Models: A Survey [50.171011917404485]
Prompting has become a mainstream paradigm for adapting large language models (LLMs) to specific natural language processing tasks.
This approach brings the additional computational burden of model inference and human effort to guide and control the behavior of LLMs.
We present the basic concepts of prompting, review the advances for efficient prompting, and highlight future research directions.
arXiv Detail & Related papers (2024-04-01T12:19:08Z)
- Fact-and-Reflection (FaR) Improves Confidence Calibration of Large Language Models [84.94220787791389]
We propose Fact-and-Reflection (FaR) prompting, which improves the LLM calibration in two steps.
Experiments show that FaR achieves significantly better calibration; it lowers the Expected Calibration Error by 23.5%.
FaR even elicits the capability of verbally expressing concerns in less confident scenarios.
arXiv Detail & Related papers (2024-02-27T01:37:23Z)
- AlignedCoT: Prompting Large Language Models via Native-Speaking Demonstrations [52.43593893122206]
AlignedCoT is an in-context learning technique for invoking Large Language Models.
It achieves consistent and correct step-wise prompts in zero-shot scenarios.
We conduct experiments on mathematical reasoning and commonsense reasoning.
arXiv Detail & Related papers (2023-11-22T17:24:21Z)
- PREFER: Prompt Ensemble Learning via Feedback-Reflect-Refine [24.888093229577965]
We propose a simple, universal, and automatic method named PREFER to address the stated limitations.
Our PREFER achieves state-of-the-art performance in multiple types of tasks by a significant margin.
arXiv Detail & Related papers (2023-08-23T09:46:37Z)
- Self-Refine: Iterative Refinement with Self-Feedback [62.78755306241981]
Self-Refine is an approach for improving initial outputs from large language models (LLMs) through iterative feedback and refinement.
We evaluate Self-Refine across 7 diverse tasks, ranging from dialog response generation to mathematical reasoning, using state-of-the-art (GPT-3.5, ChatGPT, and GPT-4) LLMs.
Our work demonstrates that even state-of-the-art LLMs like GPT-4 can be further improved at test time using our simple, standalone approach.
arXiv Detail & Related papers (2023-03-30T18:30:01Z)
- Guiding Large Language Models via Directional Stimulus Prompting [114.84930073977672]
We introduce Directional Stimulus Prompting, a novel framework for guiding black-box large language models (LLMs) toward specific desired outputs.
Instead of directly adjusting LLMs, our method employs a small tunable policy model to generate an auxiliary directional stimulus prompt for each input instance.
arXiv Detail & Related papers (2023-02-22T17:44:15Z)
- RLPrompt: Optimizing Discrete Text Prompts With Reinforcement Learning [84.75064077323098]
This paper proposes RLPrompt, an efficient discrete prompt optimization approach with reinforcement learning (RL).
RLPrompt is flexibly applicable to different types of LMs, such as masked (e.g., BERT) and left-to-right models (e.g., GPTs).
Experiments on few-shot classification and unsupervised text style transfer show superior performance over a wide range of existing finetuning or prompting methods.
arXiv Detail & Related papers (2022-05-25T07:50:31Z)
This list is automatically generated from the titles and abstracts of the papers on this site.