PILLOW: Enhancing Efficient Instruction Fine-tuning via Prompt Matching
- URL: http://arxiv.org/abs/2312.05621v1
- Date: Sat, 9 Dec 2023 17:38:39 GMT
- Title: PILLOW: Enhancing Efficient Instruction Fine-tuning via Prompt Matching
- Authors: Zhenting Qi, Xiaoyu Tan, Shaojie Shi, Chao Qu, Yinghui Xu, Yuan Qi
- Abstract summary: Low-Rank Adaptation (LoRA) has become a promising alternative to instruction fine-tuning.
PILLOW aims to improve LoRA's performance via a discrimination-based prompting method that leverages LLMs' In-Context Learning ability.
PILLOW exhibits commensurate performance on various evaluation metrics compared with typical instruction fine-tuning methods.
- Score: 21.835846173630717
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Instruction fine-tuning has conventionally been employed to adapt Large
Language Models (LLMs) to a variety of tasks. Nonetheless, this technique often
necessitates substantial computational resources, making it impractical for
deployment by individuals or small-scale entities. Recently, Low-Rank
Adaptation (LoRA) has become a promising alternative, offering capabilities on
par with full fine-tuning at a reduced resource overhead. However,
attaining satisfactory performance through the fine-tuning of LoRA is a
non-trivial challenge. In this paper, we propose PILLOW, which aims to improve
LoRA's performance by a discrimination-based prompting method, leveraging LLMs'
In-Context Learning ability. PILLOW incorporates a matching network that
selects prompts from a user-defined prompt pool, concatenates the selected
prompts with the user instruction as input, and performs inference using the
LoRA-fine-tuned LLMs. Trained with Reinforcement Learning, PILLOW exhibits
commensurate performance on various evaluation metrics compared with typical
instruction fine-tuning methods, utilizing only consumer-grade GPU resources
and exhibiting a large reduction in computational costs.
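To make the pipeline concrete, here is a minimal sketch of that inference path. The prompt pool, the bag-of-words scorer, and the `generate` callable are illustrative stand-ins only: the paper trains the matching network with reinforcement learning rather than using a fixed similarity.

```python
# Minimal sketch of PILLOW-style inference: match, concatenate, generate.
# The pool contents and the scoring function are placeholder assumptions.
PROMPT_POOL = [
    "Answer step by step.",
    "You are a helpful assistant; be concise.",
    "Explain your reasoning before the final answer.",
]

def embed(text: str) -> dict[str, int]:
    """Toy bag-of-words encoder standing in for the matching network."""
    counts: dict[str, int] = {}
    for tok in text.lower().split():
        counts[tok] = counts.get(tok, 0) + 1
    return counts

def score(instruction: str, prompt: str) -> float:
    """Stand-in for the learned matching score (here: token overlap)."""
    a, b = embed(instruction), embed(prompt)
    return float(sum(min(a[t], b[t]) for t in a if t in b))

def pillow_inference(instruction: str, generate) -> str:
    """Select the best-matching prompt, prepend it, and call the LoRA-tuned LLM."""
    best = max(PROMPT_POOL, key=lambda p: score(instruction, p))
    return generate(best + "\n" + instruction)

# Usage with a dummy LLM callable:
print(pillow_inference("Explain LoRA step by step.",
                       generate=lambda x: f"[LLM output for: {x!r}]"))
```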
Related papers
- OLoRA: Orthonormal Low-Rank Adaptation of Large Language Models [0.0]
Low-Rank Adaptation (LoRA) has emerged as a promising method to mitigate the substantial computational cost of fine-tuning LLMs.
OLoRA significantly accelerates the convergence of LLM training.
OLoRA exhibits improved performance compared to standard LoRA across a variety of language modeling tasks.
arXiv Detail & Related papers (2024-06-03T20:37:27Z)
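A minimal sketch of giving a LoRA factor orthonormal rows, assuming a QR-based initialization of a random matrix; note that OLoRA itself derives its initialization from the pretrained weights, which this toy omits.

```python
# Orthonormal LoRA initialization sketch (QR of a random Gaussian matrix).
import numpy as np

def orthonormal_lora_init(d_out: int, d_in: int, rank: int, seed: int = 0):
    rng = np.random.default_rng(seed)
    # Reduced QR: Q has orthonormal columns, so Q.T has orthonormal rows.
    q, _ = np.linalg.qr(rng.standard_normal((d_in, rank)))
    A = q.T                       # (rank, d_in), orthonormal rows
    B = np.zeros((d_out, rank))   # B = 0 so the adapter starts as a no-op
    return A, B

A, B = orthonormal_lora_init(d_out=768, d_in=768, rank=8)
assert np.allclose(A @ A.T, np.eye(8))  # rows are orthonormal
```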
- MoRA: High-Rank Updating for Parameter-Efficient Fine-Tuning [105.11844150736536]
Low-rank adaptation is a popular parameter-efficient fine-tuning method for large language models.
We propose a new method called MoRA, which employs a square matrix to achieve high-rank updating while maintaining the same number of trainable parameters.
Our method outperforms LoRA on memory-intensive tasks and achieves comparable performance on other tasks.
arXiv Detail & Related papers (2024-05-20T15:48:32Z)
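A toy sketch of the square-matrix idea: for roughly the same parameter budget as rank-r LoRA factors, a single square matrix can realize updates of rank up to its side length. The compress/decompress operators below are simplistic assumptions; the paper studies several non-parameterized choices.

```python
# MoRA-style update sketch: compress input, apply square matrix, decompress.
import numpy as np

d, r = 1024, 8
r_hat = int((2 * d * r) ** 0.5)   # square side with ~ the params of LoRA's A and B
M = np.zeros((r_hat, r_hat))      # the only trainable tensor

def compress(x: np.ndarray) -> np.ndarray:
    """Fold a length-d vector into length r_hat by group summation (non-parametric)."""
    pad = (-len(x)) % r_hat
    return np.pad(x, (0, pad)).reshape(-1, r_hat).sum(axis=0)

def decompress(h: np.ndarray, d: int) -> np.ndarray:
    """Tile the length-r_hat output back to length d (non-parametric)."""
    reps = -(-d // len(h))        # ceiling division
    return np.tile(h, reps)[:d]

x = np.random.default_rng(0).standard_normal(d)
delta = decompress(M @ compress(x), d)  # the update's rank can reach r_hat
assert delta.shape == (d,)
```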
- Efficient Prompting Methods for Large Language Models: A Survey [50.171011917404485]
Prompting has become a mainstream paradigm for adapting large language models (LLMs) to specific natural language processing tasks.
This approach brings the additional computational burden of model inference and human effort to guide and control the behavior of LLMs.
We present the basic concepts of prompting, review the advances for efficient prompting, and highlight future research directions.
arXiv Detail & Related papers (2024-04-01T12:19:08Z)
- LoRA-SP: Streamlined Partial Parameter Adaptation for Resource-Efficient Fine-Tuning of Large Language Models [7.926974917872204]
LoRA-SP is a novel approach utilizing randomized half-selective parameter freezing.
LoRA-SP significantly reduces computational and memory requirements without compromising model performance.
arXiv Detail & Related papers (2024-02-28T06:50:10Z)
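A minimal sketch of randomized half-selective freezing, assuming the frozen half is drawn once before training; the paper's exact selection scheme may differ.

```python
# Freeze a random half of the LoRA adapter tensors; train only the rest.
import torch
import torch.nn as nn

def freeze_random_half(adapter_params: list[nn.Parameter], seed: int = 0) -> None:
    g = torch.Generator().manual_seed(seed)
    order = torch.randperm(len(adapter_params), generator=g)
    for i in order[: len(adapter_params) // 2]:
        adapter_params[i].requires_grad_(False)  # frozen: no gradient, no optimizer state

# Usage with toy LoRA factors for four layers:
params = [nn.Parameter(torch.zeros(8, 64)) for _ in range(4)]
freeze_random_half(params)
print([p.requires_grad for p in params])  # half report False
```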
- Chain of LoRA: Efficient Fine-tuning of Language Models via Residual Learning [31.036465632204663]
We introduce Chain of LoRA (COLA), an iterative optimization framework inspired by the Frank-Wolfe algorithm.
We demonstrate that COLA can consistently outperform LoRA without additional computational or memory costs.
arXiv Detail & Related papers (2024-01-08T14:26:49Z)
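A minimal sketch of the merge-and-reinitialize loop, with `train_lora` left as a stub; the paper's Frank-Wolfe analysis is only mirrored in outline here.

```python
# COLA-style residual learning: train a LoRA adapter, merge it into the
# base weight, re-initialize, and repeat so each round fits the residual.
import numpy as np

def cola(W: np.ndarray, train_lora, rounds: int = 3, rank: int = 4) -> np.ndarray:
    d_out, d_in = W.shape
    rng = np.random.default_rng(0)
    for _ in range(rounds):
        B = np.zeros((d_out, rank))           # zero init: adapter starts as a no-op
        A = 0.01 * rng.standard_normal((rank, d_in))
        B, A = train_lora(W, B, A)            # optimize against the task loss
        W = W + B @ A                         # merge the learned residual
    return W

# Dummy trainer that leaves the factors unchanged, just to show the flow:
W = cola(np.zeros((16, 16)), train_lora=lambda W, B, A: (B, A))
```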
- Prompt Optimization via Adversarial In-Context Learning [51.18075178593142]
adv-ICL is implemented as a two-player game between a generator and a discriminator.
The generator tries to generate realistic enough output to fool the discriminator.
We show that adv-ICL results in significant improvements over state-of-the-art prompt optimization techniques.
arXiv Detail & Related papers (2023-12-05T09:44:45Z)
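A toy sketch of the two-player loop, with `llm` as a placeholder callable and random prompt edits standing in for the paper's LLM-driven prompt modifier.

```python
# adv-ICL-style loop: the generator tries to fool the discriminator;
# whichever player loses the round gets its prompt updated.
import random

def adv_icl(llm, gen_prompts, disc_prompts, data, rounds=5, seed=0):
    rng = random.Random(seed)
    gen_p, disc_p = gen_prompts[0], disc_prompts[0]
    for _ in range(rounds):
        x = rng.choice(data)
        fake = llm(gen_p + "\nInput: " + x)
        verdict = llm(disc_p + "\nIs this text model-written? " + fake)
        if "no" in verdict.lower():
            disc_p = rng.choice(disc_prompts)  # discriminator fooled: update it
        else:
            gen_p = rng.choice(gen_prompts)    # generator caught: update it
    return gen_p, disc_p

# Usage with a dummy LLM:
gp, dp = adv_icl(lambda s: "no", ["Write a product review."],
                 ["Judge if the text is model-written."], ["great movie"])
```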
- A Deep Learning Based Resource Allocator for Communication Systems with Dynamic User Utility Demands [12.216015676346032]
A DL-based resource allocator (ALCOR) is introduced, which allows users to freely adjust their utility demands.
ALCOR employs deep neural networks (DNNs), as the policy, in an iterative optimization algorithm.
The policy performs unconstrained RA (URA), i.e., RA that ignores user utility demands, among active users to maximize the sum utility (SU) at each time instant.
arXiv Detail & Related papers (2023-11-08T11:02:51Z)
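A toy sketch of the URA step, with a softmax heuristic standing in for the trained DNN policy and a log-rate toy utility; every quantity here is an illustrative assumption.

```python
# Unconstrained resource allocation: split a power budget across active
# users to maximize a toy sum utility, ignoring per-user demands.
import numpy as np

def ura_policy(channel_gains: np.ndarray, total_power: float) -> np.ndarray:
    """Softmax over channel gains, a stand-in for the DNN policy's output."""
    w = np.exp(channel_gains)
    return total_power * w / w.sum()

def sum_utility(power: np.ndarray, gains: np.ndarray) -> float:
    return float(np.log1p(gains * power).sum())  # toy rate-like utility

gains = np.array([0.2, 1.0, 0.5])
p = ura_policy(gains, total_power=10.0)
print(p, sum_utility(p, gains))
```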
- Query-Dependent Prompt Evaluation and Optimization with Offline Inverse RL [62.824464372594576]
We aim to enhance the arithmetic reasoning ability of Large Language Models (LLMs) through zero-shot prompt optimization.
We identify a previously overlooked objective of query dependency in such optimization.
We introduce Prompt-OIRL, which harnesses offline inverse reinforcement learning to draw insights from offline prompting demonstration data.
arXiv Detail & Related papers (2023-09-13T01:12:52Z)
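A minimal sketch of the offline idea: fit a reward model on logged (query, prompt, outcome) triples, then select prompts per query with no further LLM calls. The feature map and logistic-regression critic are stand-ins for the paper's learned reward model.

```python
# Query-dependent prompt selection via an offline-learned reward model.
import numpy as np
from sklearn.linear_model import LogisticRegression

def features(query: str, prompt: str) -> np.ndarray:
    return np.array([len(query), len(prompt), float("step" in prompt)])

# Offline demonstrations: (query, prompt, was the answer correct?)
log = [("2+2?", "Think step by step.", 1), ("2+2?", "Answer briefly.", 0),
       ("17*3?", "Think step by step.", 1), ("17*3?", "Answer briefly.", 0)]
X = np.array([features(q, p) for q, p, _ in log])
y = np.array([r for _, _, r in log])
reward = LogisticRegression().fit(X, y)  # trained offline, no LLM in the loop

def best_prompt(query: str, prompts: list[str]) -> str:
    scores = [reward.predict_proba(features(query, p).reshape(1, -1))[0, 1]
              for p in prompts]
    return prompts[int(np.argmax(scores))]

print(best_prompt("12*12?", ["Think step by step.", "Answer briefly."]))
```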
- OverPrompt: Enhancing ChatGPT through Efficient In-Context Learning [49.38867353135258]
We propose OverPrompt, leveraging the in-context learning capability of LLMs to handle multiple task inputs.
Our experiments show that OverPrompt can achieve cost-efficient zero-shot classification without causing significant detriment to task performance.
arXiv Detail & Related papers (2023-05-24T10:08:04Z)
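A minimal sketch of the batching idea; the template below is an assumption, the point being that one API call carries several task inputs at once.

```python
# OverPrompt-style batching: pack N classification inputs into one prompt.
def overprompt(task_instruction: str, inputs: list[str]) -> str:
    numbered = "\n".join(f"{i + 1}. {x}" for i, x in enumerate(inputs))
    return (task_instruction
            + "\nClassify each item below and answer as a numbered list:\n"
            + numbered)

prompt = overprompt("Label the sentiment of each review as positive or negative.",
                    ["Loved it.", "Terrible plot.", "Fine, I guess."])
print(prompt)  # one LLM call instead of three
```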
- RLPrompt: Optimizing Discrete Text Prompts With Reinforcement Learning [84.75064077323098]
This paper proposes RLPrompt, an efficient discrete prompt optimization approach with reinforcement learning (RL).
RLPrompt is flexibly applicable to different types of LMs, such as masked LMs (e.g., BERT) and left-to-right models (e.g., GPTs).
Experiments on few-shot classification and unsupervised text style transfer show superior performance over a wide range of existing finetuning or prompting methods.
arXiv Detail & Related papers (2022-05-25T07:50:31Z)
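A toy sketch of discrete prompt search with REINFORCE over a tiny vocabulary; the tabular policy and hand-coded reward are stand-ins for the paper's parameterized policy network and LM-based task reward.

```python
# REINFORCE over discrete prompt tokens: sample a 2-token prompt, observe a
# reward, and push the policy toward token choices that earned reward.
import numpy as np

rng = np.random.default_rng(0)
vocab = ["great", "terrible", "movie", "review", "classify"]
logits = np.zeros((2, len(vocab)))  # tabular policy over a 2-token prompt

def reward(tokens: list[str]) -> float:
    """Toy stand-in for downstream task accuracy under this prompt."""
    return 1.0 if "classify" in tokens else 0.0

for _ in range(200):
    probs = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)
    idx = [rng.choice(len(vocab), p=probs[t]) for t in range(2)]
    r = reward([vocab[i] for i in idx])
    for t, i in enumerate(idx):     # policy-gradient step for each position
        grad = -probs[t]
        grad[i] += 1.0
        logits[t] += 0.5 * r * grad
print([vocab[int(np.argmax(row))] for row in logits])
```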