Discrete Prompt Compression with Reinforcement Learning
- URL: http://arxiv.org/abs/2308.08758v3
- Date: Sun, 2 Jun 2024 10:09:01 GMT
- Title: Discrete Prompt Compression with Reinforcement Learning
- Authors: Hoyoun Jung, Kyung-Joong Kim,
- Abstract summary: Compressed prompts aid instruction-tuned language models (LMs) in overcoming context window limitations and reducing computational costs.
Existing methods, which primarily based on training embeddings, face various challenges associated with interpretability, the fixed number of embedding tokens, reusability across different LMs, and inapplicability when interacting with black-box APIs.
This study proposes prompt compression with reinforcement learning (PCRL), which is a discrete prompt compression method that addresses these issues.
- Score: 2.664293070994717
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Compressed prompts aid instruction-tuned language models (LMs) in overcoming context window limitations and reducing computational costs. Existing methods, which primarily based on training embeddings, face various challenges associated with interpretability, the fixed number of embedding tokens, reusability across different LMs, and inapplicability when interacting with black-box APIs. This study proposes prompt compression with reinforcement learning (PCRL), which is a discrete prompt compression method that addresses these issues. The proposed PCRL method utilizes a computationally efficient policy network that edits prompts directly. The training approach employed in the proposed PCRLs can be applied flexibly to various types of LMs, including both decoder-only and encoder-decoder architecture and it can be trained without gradient access to the LMs or labeled data. The proposed PCRL achieves an average reduction of 24.6% in terms of the token count across various instruction prompts while maintaining sufficient performance. In addition, we demonstrate that the learned policy can be transferred to larger LMs, and through a comprehensive analysis, we explore the token importance within the prompts. Our code is accessible at https://github.com/nenomigami/PromptCompressor.
Related papers
- Enhancing Input-Label Mapping in In-Context Learning with Contrastive Decoding [71.01099784480597]
Large language models (LLMs) excel at a range of tasks through in-context learning (ICL)
We introduce In-Context Contrastive Decoding (ICCD), a novel method that emphasizes input-label mapping.
ICCD emphasizes input-label mapping by contrasting the output distributions between positive and negative in-context examples.
arXiv Detail & Related papers (2025-02-19T14:04:46Z) - ICPC: In-context Prompt Compression with Faster Inference [0.0]
We propose I CPC (In-context Prompt Compression), a novel and scalable prompt compression method that adaptively reduces the prompt length.
The key idea of I CPC is to calculate the probability of each word appearing in the prompt using encoders and calculate information carried by each word through the information function.
Empirically, we demonstrate that I CPC can effectively compress long texts of different categories and thus achieve better performance and speed on different types of NLP tasks.
arXiv Detail & Related papers (2025-01-03T03:46:51Z) - Gap-Filling Prompting Enhances Code-Assisted Mathematical Reasoning [0.0]
Chain-of-thought (CoT) and program-of-thought (PoT) fine-tuning are common methods to transfer LLM knowledge to small language models (SLMs)
This paper introduces Gap-Filling Prompting (GFP), a novel two-step prompting strategy designed to enhance the problem-solving process for SLMs.
arXiv Detail & Related papers (2024-11-08T08:52:59Z) - Selection-p: Self-Supervised Task-Agnostic Prompt Compression for Faithfulness and Transferability [67.77534983324229]
In this paper, we investigate the ability of Large Language Models to develop a unified compression method that discretizes uninformative tokens.
Experiments show Selection-p achieves state-of-the-art performance across numerous classification tasks.
It exhibits superior transferability to different models compared to prior work.
arXiv Detail & Related papers (2024-10-15T17:05:25Z) - Fundamental Limits of Prompt Compression: A Rate-Distortion Framework for Black-Box Language Models [21.025001473355996]
We formalize the problem of prompt compression for large language models (LLMs)
We present a framework to unify token-level prompt compression methods which create hard prompts for black-box models.
We show that there is a large gap between the performance of current prompt compression methods and the optimal strategy.
arXiv Detail & Related papers (2024-07-22T09:40:13Z) - A Prompt Learning Framework for Source Code Summarization [19.24919436211323]
This paper proposes an effective prompt learning framework for code summarization called PromptCS.
PromptCS trains a prompt agent that can generate continuous prompts to unleash the potential for large language models in code summarization.
arXiv Detail & Related papers (2023-12-26T14:37:55Z) - LLMLingua: Compressing Prompts for Accelerated Inference of Large
Language Models [22.06402870816756]
Large language models (LLMs) have been applied in various applications due to their astonishing capabilities.
This paper presents LLMLingua, a coarse-to-fine prompt compression method that involves a budget controller to maintain semantic integrity.
We show that the proposed approach yields state-of-the-art performance and allows for up to 20x compression with little performance loss.
arXiv Detail & Related papers (2023-10-09T14:10:21Z) - Compressing LLMs: The Truth is Rarely Pure and Never Simple [90.05366363633568]
Knowledge-Intensive Compressed LLM BenchmarK aims to redefine the evaluation protocol for compressed Large Language Models.
LLM-KICK unveils many favorable merits and unfortunate plights of current SoTA compression methods.
LLM-KICK is designed to holistically access compressed LLMs' ability for language understanding, reasoning, generation, in-context retrieval, in-context summarization, etc.
arXiv Detail & Related papers (2023-10-02T17:42:37Z) - PromptBoosting: Black-Box Text Classification with Ten Forward Passes [61.38341243907045]
We describe PromptBoosting, a query-efficient procedure for building a text classifier from a neural language model (LM) without access to the LM's parameters, gradients, or hidden representations.
Experiments show that PromptBoosting achieves state-of-the-art performance in multiple black-box few-shot classification tasks, and matches or outperforms full fine-tuning in both few-shot and standard learning paradigms, while training 10x faster than existing black-box methods.
arXiv Detail & Related papers (2022-12-19T06:04:54Z) - RLPrompt: Optimizing Discrete Text Prompts With Reinforcement Learning [84.75064077323098]
This paper proposes RLPrompt, an efficient discrete prompt optimization approach with reinforcement learning (RL)
RLPrompt is flexibly applicable to different types of LMs, such as masked gibberish (e.g., grammaBERT) and left-to-right models (e.g., GPTs)
Experiments on few-shot classification and unsupervised text style transfer show superior performance over a wide range of existing finetuning or prompting methods.
arXiv Detail & Related papers (2022-05-25T07:50:31Z) - Learning to Ask Conversational Questions by Optimizing Levenshtein
Distance [83.53855889592734]
We introduce a Reinforcement Iterative Sequence Editing (RISE) framework that optimize the minimum Levenshtein distance (MLD) through explicit editing actions.
RISE is able to pay attention to tokens that are related to conversational characteristics.
Experimental results on two benchmark datasets show that RISE significantly outperforms state-of-the-art methods.
arXiv Detail & Related papers (2021-06-30T08:44:19Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.