Plum: Prompt Learning using Metaheuristic
- URL: http://arxiv.org/abs/2311.08364v3
- Date: Sun, 30 Jun 2024 09:50:11 GMT
- Title: Plum: Prompt Learning using Metaheuristic
- Authors: Rui Pan, Shuo Xing, Shizhe Diao, Wenhe Sun, Xiang Liu, Kashun Shum, Renjie Pi, Jipeng Zhang, Tong Zhang
- Abstract summary: We introduce metaheuristics, a branch of discrete non-convex optimization methods with over 100 options.
Within our paradigm, we test six typical methods, demonstrating their effectiveness in white-box and black-box prompt learning.
We show that these methods can be used to discover more human-understandable prompts, opening the door to a cornucopia of possibilities in prompt optimization.
- Score: 28.024094195968672
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Since the emergence of large language models, prompt learning has become a popular method for optimizing and customizing these models. Special prompts, such as Chain-of-Thought, have even revealed previously unknown reasoning capabilities within these models. However, the progress of discovering effective prompts has been slow, driving a desire for general prompt optimization methods. Unfortunately, few existing prompt learning methods satisfy the criteria of being truly "general", i.e., automatic, discrete, black-box, gradient-free, and interpretable all at once. In this paper, we introduce metaheuristics, a branch of discrete non-convex optimization methods with over 100 options, as a promising approach to prompt learning. Within our paradigm, we test six typical methods: hill climbing, simulated annealing, genetic algorithms with/without crossover, tabu search, and harmony search, demonstrating their effectiveness in white-box and black-box prompt learning. Furthermore, we show that these methods can be used to discover more human-understandable prompts that were previously unknown in both reasoning and image generation tasks, opening the door to a cornucopia of possibilities in prompt optimization. We release all the codes in \url{https://github.com/research4pan/Plum}.
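The paradigm is easy to picture with the simplest method on that list. Below is a minimal hill-climbing sketch over discrete prompt edits; the edit operators and the `score` callback (which in practice would query the target model on a validation set) are illustrative placeholders, not the released Plum implementation.
```python
import random

def neighbors(prompt: str) -> list[str]:
    """Candidate prompts produced by simple discrete edits (placeholder operators)."""
    words = prompt.split()
    cands = []
    for i in range(len(words)):
        cands.append(" ".join(words[:i] + words[i + 1:]))                   # delete word i
        cands.append(" ".join(words[:i + 1] + [words[i]] + words[i + 1:]))  # duplicate word i
    for i in range(len(words) - 1):
        swapped = words[:]
        swapped[i], swapped[i + 1] = swapped[i + 1], swapped[i]
        cands.append(" ".join(swapped))                                     # swap adjacent words
    return cands

def hill_climb(prompt: str, score, iters: int = 50) -> str:
    """Greedy black-box search: keep a sampled edit whenever it improves the score."""
    best, best_score = prompt, score(prompt)
    for _ in range(iters):
        cands = neighbors(best)
        if not cands:
            break
        cand = random.choice(cands)
        s = score(cand)                  # one black-box evaluation per candidate
        if s > best_score:
            best, best_score = cand, s
    return best

# Toy usage; in practice `score` would evaluate the prompt on a dev set.
toy_score = lambda p: -abs(len(p.split()) - 5)
print(hill_climb("Let us think step by step about the question", toy_score))
```
Simulated annealing, tabu search, and the other methods tested in the paper differ mainly in the acceptance rule and in how candidates are generated, which is what makes the family easy to swap in and out.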
Related papers
- Adapting Vision-Language Models to Open Classes via Test-Time Prompt Tuning [50.26965628047682]
Adapting pre-trained models to open classes is a challenging problem in machine learning.
In this paper, we consider combining the advantages of both and propose a test-time prompt tuning approach.
Our proposed method outperforms all comparison methods on average considering both base and new classes.
arXiv Detail & Related papers (2024-08-29T12:34:01Z)
- Efficient Prompting Methods for Large Language Models: A Survey [50.171011917404485]
Prompting has become a mainstream paradigm for adapting large language models (LLMs) to specific natural language processing tasks.
However, this approach brings an additional computational burden of model inference, as well as human effort to guide and control the behavior of LLMs.
We present the basic concepts of prompting, review the advances for efficient prompting, and highlight future research directions.
arXiv Detail & Related papers (2024-04-01T12:19:08Z)
- Understanding prompt engineering may not require rethinking generalization [56.38207873589642]
We show that the discrete nature of prompts, combined with a PAC-Bayes prior given by a language model, results in generalization bounds that are remarkably tight by the standards of the literature.
This work provides a possible justification for the widespread practice of prompt engineering.
arXiv Detail & Related papers (2023-10-06T00:52:48Z)
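To make that claim concrete, here is a sketch of a standard McAllester-style PAC-Bayes bound specialized as the summary suggests; the paper's exact statement and constants may differ. With a point-mass posterior on one discrete prompt and a language-model prior, the complexity term collapses to the prompt's negative log-likelihood, so short, high-probability prompts yield tight bounds.
```latex
% Sketch: point-mass posterior Q = \delta_p on one discrete prompt p,
% prior P_{\mathrm{LM}} given by a language model, so that
%   \mathrm{KL}(Q \,\|\, P_{\mathrm{LM}}) = -\log P_{\mathrm{LM}}(p).
% A McAllester-style bound then holds with probability >= 1 - \delta over n samples:
L(p) \;\le\; \widehat{L}(p)
  \;+\; \sqrt{\frac{-\log P_{\mathrm{LM}}(p) + \log\left(2\sqrt{n}/\delta\right)}{2n}}
```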
- Multi-Prompt with Depth Partitioned Cross-Modal Learning [25.239388488952375]
Partitioned Multi-modal Prompt (PMPO) is a multi-modal prompting technique that extends the soft prompt from a single learnable prompt to multiple prompts.
Our method divides the visual encoder depths and connects learnable prompts to the separated visual depths, enabling different prompts to capture hierarchical contextual depths.
We evaluate the effectiveness of our approach on three challenging tasks: new class generalization, cross-dataset evaluation, and domain generalization.
arXiv Detail & Related papers (2023-05-10T14:54:29Z)
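A minimal PyTorch sketch of the idea, assuming a generic stack of encoder layers in place of the paper's CLIP visual encoder; the class name and defaults are illustrative, not PMPO's actual code.
```python
import torch
import torch.nn as nn

class DepthPartitionedPrompts(nn.Module):
    """Sketch: attach a different learnable prompt to each contiguous depth
    segment of an encoder; `layers` stand in for real transformer blocks."""
    def __init__(self, layers, n_prompts=4, prompt_len=4, dim=16):
        super().__init__()
        assert len(layers) >= n_prompts
        self.layers = nn.ModuleList(layers)
        self.prompts = nn.Parameter(torch.randn(n_prompts, prompt_len, dim) * 0.02)
        self.seg = len(layers) // n_prompts      # layers per depth segment

    def forward(self, x):                        # x: (batch, seq, dim)
        plen = self.prompts.size(1)
        for i, layer in enumerate(self.layers):
            if i % self.seg == 0 and i // self.seg < len(self.prompts):
                if i > 0:
                    x = x[:, plen:]              # drop the previous segment's prompt
                p = self.prompts[i // self.seg].expand(x.size(0), -1, -1)
                x = torch.cat([p, x], dim=1)     # prepend this segment's prompt
            x = layer(x)
        return x

# Toy usage with identity layers standing in for transformer blocks:
model = DepthPartitionedPrompts([nn.Identity() for _ in range(8)])
print(model(torch.randn(2, 10, 16)).shape)       # (2, 14, 16): 4 prompt tokens + 10
```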
- Hard Prompts Made Easy: Gradient-Based Discrete Optimization for Prompt Tuning and Discovery [55.905769757007185]
We describe an approach to robustly optimize hard text prompts through efficient gradient-based optimization.
Our approach automatically generates hard text-based prompts for both text-to-image and text-to-text applications.
In the text-to-text setting, we show that hard prompts can be automatically discovered that are effective in tuning LMs for classification.
arXiv Detail & Related papers (2023-02-07T18:40:18Z)
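One common reading of this trick: optimize a continuous prompt, project it onto the nearest vocabulary embeddings for the forward pass, and route gradients straight through to the continuous copy. The sketch below uses a toy embedding table and surrogate loss, not the paper's text-to-image setup.
```python
import torch

# Toy setup: a frozen token-embedding table and a differentiable surrogate loss.
V, d, k = 100, 16, 4                          # vocab size, embedding dim, prompt length
E = torch.randn(V, d)                         # stands in for the LM's embedding matrix
target = torch.randn(d)                       # stands in for the real task objective
loss_fn = lambda emb: ((emb.mean(0) - target) ** 2).sum()

soft = torch.randn(k, d, requires_grad=True)  # continuous relaxation of the prompt
opt = torch.optim.Adam([soft], lr=0.1)

for step in range(200):
    # Project each continuous vector onto its nearest vocabulary embedding ...
    ids = torch.cdist(soft, E).argmin(dim=1)
    hard = E[ids]
    # ... evaluate the loss at the *hard* (discrete) prompt, but route the
    # gradient back to the continuous copy (straight-through estimator).
    loss = loss_fn(hard + soft - soft.detach())
    opt.zero_grad()
    loss.backward()
    opt.step()

print("discovered token ids:", ids.tolist())  # decode with a real tokenizer
```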
- PromptBoosting: Black-Box Text Classification with Ten Forward Passes [61.38341243907045]
We describe PromptBoosting, a query-efficient procedure for building a text classifier from a neural language model (LM) without access to the LM's parameters, gradients, or hidden representations.
Experiments show that PromptBoosting achieves state-of-the-art performance in multiple black-box few-shot classification tasks, and matches or outperforms full fine-tuning in both few-shot and standard learning paradigms, while training 10x faster than existing black-box methods.
arXiv Detail & Related papers (2022-12-19T06:04:54Z)
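Reading each (prompt, verbalizer) pair as a weak learner, a generic AdaBoost loop over frozen-LM prompts might look like the following sketch; `predict(prompt, x)` is a hypothetical placeholder for a single forward pass, and binary labels are assumed.
```python
import math

def boost_prompts(prompts, predict, X, y, rounds=10):
    """AdaBoost over prompt-based weak learners; `predict(prompt, x)` is a
    placeholder for one forward pass of a frozen LM plus a verbalizer."""
    w = [1.0 / len(X)] * len(X)                # per-example weights
    ensemble = []                              # list of (alpha, prompt) pairs
    for _ in range(rounds):
        # Choose the prompt with the lowest weighted error under current weights.
        errs = {p: sum(wi for wi, xi, yi in zip(w, X, y) if predict(p, xi) != yi)
                for p in prompts}
        best = min(errs, key=errs.get)
        err = max(errs[best], 1e-9)
        if err >= 0.5:                         # no weak learner left that helps
            break
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, best))
        # Upweight misclassified examples, downweight correct ones, renormalize.
        w = [wi * math.exp(alpha if predict(best, xi) != yi else -alpha)
             for wi, xi, yi in zip(w, X, y)]
        total = sum(w)
        w = [wi / total for wi in w]
    return ensemble                            # predict by alpha-weighted voting
```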
- Bayesian Prompt Learning for Image-Language Model Generalization [64.50204877434878]
We use the regularization ability of Bayesian methods to frame prompt learning as a variational inference problem.
Our approach regularizes the prompt space, reduces overfitting to the seen prompts and improves the prompt generalization on unseen prompts.
We demonstrate empirically on 15 benchmarks that Bayesian prompt learning provides an appropriate coverage of the prompt space.
arXiv Detail & Related papers (2022-10-05T17:05:56Z)
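A minimal sketch of that variational view, assuming a diagonal Gaussian posterior over one soft prompt with a standard normal prior; the task loss is a stand-in for the frozen image-language model.
```python
import torch

# Diagonal Gaussian posterior over a soft prompt; standard normal prior.
k, d = 5, 16                                   # prompt length, embedding dim
mu = torch.zeros(k, d, requires_grad=True)
logvar = torch.zeros(k, d, requires_grad=True)
opt = torch.optim.Adam([mu, logvar], lr=0.05)

target = torch.randn(d)                        # stand-in for the frozen model's loss
task_loss = lambda P: ((P.mean(0) - target) ** 2).sum()

for step in range(300):
    eps = torch.randn_like(mu)
    prompt = mu + eps * torch.exp(0.5 * logvar)          # reparameterized sample
    # KL(q || N(0, I)) regularizes the prompt space against overfitting.
    kl = 0.5 * (torch.exp(logvar) + mu ** 2 - 1.0 - logvar).sum()
    loss = task_loss(prompt) + 1e-3 * kl
    opt.zero_grad()
    loss.backward()
    opt.step()
```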
- Learning How to Ask: Querying LMs with Mixtures of Soft Prompts [33.43689407735244]
Natural-language prompts have recently been used to coax pretrained language models into performing other AI tasks.
We explore the idea of learning prompts by gradient descent.
For each task, we optimize a mixture of prompts, learning which prompts are most effective and how to ensemble them.
arXiv Detail & Related papers (2021-04-14T02:56:14Z)
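A sketch of that ensemble objective, assuming a learnable softmax mixture over K soft prompts; the per-prompt log-likelihood below is a placeholder for a frozen LM conditioned on the prompt embeddings.
```python
import torch

K, k, d = 3, 5, 16                             # mixture size, prompt length, embed dim
prompts = torch.randn(K, k, d, requires_grad=True)     # soft prompt embeddings
logits = torch.zeros(K, requires_grad=True)            # mixture weights (pre-softmax)
opt = torch.optim.Adam([prompts, logits], lr=0.05)

# Placeholder: log-likelihood of the gold answer given one soft prompt; in the
# real method this comes from a frozen LM conditioned on the prompt embeddings.
target = torch.randn(d)
log_lik = lambda P: -((P.mean(0) - target) ** 2).sum()

for step in range(300):
    log_w = torch.log_softmax(logits, dim=0)
    # Mixture log-likelihood: log sum_j w_j * p(y | x, prompt_j)
    mix_ll = torch.logsumexp(log_w + torch.stack([log_lik(p) for p in prompts]), dim=0)
    loss = -mix_ll
    opt.zero_grad()
    loss.backward()
    opt.step()

print("learned mixture weights:", torch.softmax(logits, dim=0).tolist())
```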
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.