Leveraging Self-Attention for Input-Dependent Soft Prompting in LLMs
- URL: http://arxiv.org/abs/2506.05629v1
- Date: Thu, 05 Jun 2025 23:13:22 GMT
- Title: Leveraging Self-Attention for Input-Dependent Soft Prompting in LLMs
- Authors: Ananth Muppidi, Abhilash Nandy, Sambaran Bandyopadhyay
- Abstract summary: This paper focuses on parameter-efficient fine-tuning using soft prompting. We propose a novel Input Dependent Soft Prompting technique with a self-Attention Mechanism (ID-SPAM). We show the merits of the proposed approach compared to state-of-the-art techniques on various tasks and demonstrate improved zero-shot domain transfer capability.
- Score: 17.838462425090498
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: The performance of large language models in domain-specific tasks necessitates fine-tuning, which is computationally expensive and technically challenging. This paper focuses on parameter-efficient fine-tuning using soft prompting, a promising approach that adapts pre-trained models to downstream tasks by learning a small set of parameters. We propose a novel Input Dependent Soft Prompting technique with a self-Attention Mechanism (ID-SPAM) that generates soft prompts based on the input tokens and attends to different tokens with varying importance. Our method is simple and efficient, keeping the number of trainable parameters small. We show the merits of the proposed approach compared to state-of-the-art techniques on various tasks and demonstrate improved zero-shot domain transfer capability.
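Since only the abstract is available here, the following is a minimal, hypothetical PyTorch sketch of what an input-dependent soft-prompt generator in the spirit of ID-SPAM could look like: a small set of learned queries attends over the frozen model's input embeddings, so each generated prompt vector weights input tokens by importance before being prepended to the sequence. All class, parameter, and hyperparameter names are illustrative assumptions, not the authors' code.

```python
import torch
import torch.nn as nn

class InputDependentSoftPrompt(nn.Module):
    """Hypothetical ID-SPAM-style generator: learned queries attend over
    the input token embeddings, so each soft-prompt vector is a weighted
    mix of input tokens; only this module is trained, the LM stays frozen."""

    def __init__(self, hidden_size: int, num_prompt_tokens: int = 10, num_heads: int = 8):
        super().__init__()
        # One trainable query per soft-prompt position.
        self.queries = nn.Parameter(torch.randn(num_prompt_tokens, hidden_size) * 0.02)
        self.attn = nn.MultiheadAttention(hidden_size, num_heads, batch_first=True)

    def forward(self, input_embeds: torch.Tensor) -> torch.Tensor:
        # input_embeds: (batch, seq_len, hidden) from the frozen embedding layer.
        q = self.queries.unsqueeze(0).expand(input_embeds.size(0), -1, -1)
        prompts, _ = self.attn(q, input_embeds, input_embeds)  # attend to input tokens
        return prompts  # (batch, num_prompt_tokens, hidden)

# Usage sketch with a frozen Hugging Face-style model:
#   embeds = model.get_input_embeddings()(input_ids)   # frozen
#   prompts = generator(embeds)                        # trainable, input-dependent
#   out = model(inputs_embeds=torch.cat([prompts, embeds], dim=1))
```

Because gradients flow only into the generator, the trainable-parameter count stays small, which matches the paper's parameter-efficiency claim.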
Related papers
- FedDPG: An Adaptive Yet Efficient Prompt-tuning Approach in Federated Learning Settings [23.33217268142489]
This paper introduces the Federated Dynamic Prompt Generator (FedDPG). FedDPG incorporates a dynamic prompt generator network to generate context-aware prompts based on the given input. Experiments on three NLP benchmark datasets show that FedDPG outperforms state-of-the-art parameter-efficient fine-tuning methods in terms of global model performance.
arXiv Detail & Related papers (2025-07-22T03:47:12Z)
- Achieving More with Less: Additive Prompt Tuning for Rehearsal-Free Class-Incremental Learning [76.32953653161417]
Class-incremental learning enables models to learn new classes progressively while preserving knowledge of previously learned ones. Recent advances in this field have shifted towards parameter-efficient fine-tuning techniques. We present a novel prompt-based approach that addresses the limitation of current approaches.
arXiv Detail & Related papers (2025-03-11T02:27:37Z)
- Efficient Prompting Methods for Large Language Models: A Survey [50.82812214830023]
Efficient Prompting Methods have attracted a wide range of attention. We discuss Automatic Prompt Engineering for different prompt components and Prompt Compression in continuous and discrete spaces.
arXiv Detail & Related papers (2024-04-01T12:19:08Z)
- RIFF: Learning to Rephrase Inputs for Few-shot Fine-tuning of Language Models [4.085425430499285]
We explore the impact of altering the input text of the original task in conjunction with parameter-efficient fine-tuning methods.
To most effectively rewrite the input text, we train a few-shot paraphrase model with a Maximum-Marginal Likelihood objective.
We show that enriching data with paraphrases at train and test time enhances the performance beyond what can be achieved with parameter-efficient fine-tuning alone.
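As a loose illustration of the test-time half of this idea (not RIFF's actual pipeline), the sketch below averages a classifier's label distribution over the original input and a few paraphrases; `classify` and `paraphrase` are assumed callables supplied elsewhere.

```python
def predict_with_paraphrases(classify, paraphrase, text: str, k: int = 4) -> dict:
    """Average predicted label probabilities over the input and k paraphrases,
    a rough stand-in for test-time paraphrase enrichment."""
    candidates = [text] + paraphrase(text, k)   # hypothetical paraphrase model
    dists = [classify(c) for c in candidates]   # each: {label: probability}
    return {label: sum(d[label] for d in dists) / len(dists) for label in dists[0]}
```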
arXiv Detail & Related papers (2024-03-04T17:58:09Z)
- MPrompt: Exploring Multi-level Prompt Tuning for Machine Reading Comprehension [19.12663587559988]
We propose a multi-level prompt tuning (MPrompt) method for machine reading comprehension.
It utilizes prompts at task-specific, domain-specific, and context-specific levels to enhance the comprehension of input semantics.
We conducted extensive experiments on 12 benchmarks of various QA formats and achieved an average improvement of 1.94% over the state-of-the-art methods.
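A rough, hypothetical sketch of such layering (module names assumed, not MPrompt's code): task- and domain-level prompts can be plain trainable vectors, while the context-level prompt is derived from the input itself.

```python
import torch
import torch.nn as nn

class MultiLevelPrompt(nn.Module):
    """Sketch: trainable task- and domain-level prompt banks plus a
    context-level prompt computed from the pooled input embeddings."""

    def __init__(self, hidden: int, n_task: int = 4, n_domain: int = 4):
        super().__init__()
        self.task = nn.Parameter(torch.randn(n_task, hidden) * 0.02)
        self.domain = nn.Parameter(torch.randn(n_domain, hidden) * 0.02)
        self.ctx = nn.Linear(hidden, hidden)  # maps pooled input to a context prompt

    def forward(self, input_embeds: torch.Tensor) -> torch.Tensor:
        b = input_embeds.size(0)
        ctx = self.ctx(input_embeds.mean(dim=1, keepdim=True))            # (b, 1, h)
        banks = torch.cat([self.task, self.domain]).unsqueeze(0).expand(b, -1, -1)
        return torch.cat([banks, ctx, input_embeds], dim=1)  # prepend all three levels
```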
arXiv Detail & Related papers (2023-10-27T14:24:06Z)
- MetricPrompt: Prompting Model as a Relevance Metric for Few-shot Text Classification [65.51149771074944]
MetricPrompt eases verbalizer design difficulty by reformulating the few-shot text classification task as a text-pair relevance estimation task.
We conduct experiments on three widely used text classification datasets across four few-shot settings.
Results show that MetricPrompt outperforms manual verbalizer and other automatic verbalizer design methods across all few-shot settings.
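In rough, hypothetical terms (the `relevance` scorer is an assumed callable, e.g., a prompted model scoring how well a text pair matches), the reformulation looks like this:

```python
def metricprompt_style_classify(relevance, text: str, label_texts: dict) -> str:
    """Pick the label whose descriptive text the model rates most relevant
    to the input, i.e., classification as text-pair relevance estimation."""
    return max(label_texts, key=lambda label: relevance(text, label_texts[label]))

# e.g., label_texts = {"sports": "This text is about sports.",
#                      "politics": "This text is about politics."}
```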
arXiv Detail & Related papers (2023-06-15T06:51:35Z)
- OverPrompt: Enhancing ChatGPT through Efficient In-Context Learning [49.38867353135258]
We propose OverPrompt, leveraging the in-context learning capability of LLMs to handle multiple task inputs.
Our experiments show that OverPrompt can achieve cost-efficient zero-shot classification without causing significant detriment to task performance.
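As a rough illustration of the batching idea (not the authors' implementation), the snippet below packs several classification inputs into one zero-shot prompt so a single LLM call labels them all; the prompt wording and the commented `chat` helper are assumptions.

```python
def build_batched_prompt(task_instruction: str, inputs: list[str]) -> str:
    """Pack multiple task inputs into one prompt so a single LLM call
    answers all of them (OverPrompt-style input batching)."""
    numbered = "\n".join(f"{i + 1}. {text}" for i, text in enumerate(inputs))
    return (
        f"{task_instruction}\n"
        "Answer each item on its own line as '<number>: <label>'.\n\n"
        f"{numbered}"
    )

prompt = build_batched_prompt(
    "Classify the sentiment of each review as positive or negative.",
    ["Great battery life!", "The screen cracked within a week."],
)
# response = chat(prompt)  # hypothetical LLM call; parse one label per line
```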
arXiv Detail & Related papers (2023-05-24T10:08:04Z)
- Instance-wise Prompt Tuning for Pretrained Language Models [72.74916121511662]
Instance-wise Prompt Tuning (IPT) is the first prompt learning paradigm that injects knowledge from the input data instances to the prompts.
IPT significantly outperforms task-based prompt learning methods, and achieves comparable performance to conventional finetuning with only 0.5% - 1.5% of tuned parameters.
arXiv Detail & Related papers (2022-06-04T10:08:50Z)
- RLPrompt: Optimizing Discrete Text Prompts With Reinforcement Learning [84.75064077323098]
This paper proposes RLPrompt, an efficient discrete prompt optimization approach with reinforcement learning (RL).
RLPrompt is flexibly applicable to different types of LMs, such as masked LMs (e.g., BERT) and left-to-right models (e.g., GPTs).
Experiments on few-shot classification and unsupervised text style transfer show superior performance over a wide range of existing finetuning or prompting methods.
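The two-line summary compresses the method heavily; below is a loose REINFORCE sketch of the core loop: a tiny policy samples discrete prompt tokens and is updated by a task reward. The toy vocabulary and reward stub are placeholders, and RLPrompt's actual policy is parameterized by a tuned LM with a more sophisticated reward, so treat this only as the shape of the idea.

```python
import torch
import torch.nn as nn

VOCAB = ["great", "terrible", "movie", "review", "absolutely", "classify"]
PROMPT_LEN = 4

class PromptPolicy(nn.Module):
    # One categorical distribution per prompt position (simplified).
    def __init__(self):
        super().__init__()
        self.logits = nn.Parameter(torch.zeros(PROMPT_LEN, len(VOCAB)))

    def sample(self):
        dist = torch.distributions.Categorical(logits=self.logits)
        idx = dist.sample()                       # one token index per position
        return idx, dist.log_prob(idx).sum()

def reward(prompt_tokens: list[str]) -> float:
    # Placeholder: in practice, score the frozen LM's downstream accuracy
    # when this discrete prompt is prepended to task inputs.
    return float("great" in prompt_tokens)

policy = PromptPolicy()
opt = torch.optim.Adam(policy.parameters(), lr=0.1)
for _ in range(200):
    idx, logp = policy.sample()
    r = reward([VOCAB[i] for i in idx])
    loss = -r * logp                              # REINFORCE objective
    opt.zero_grad(); loss.backward(); opt.step()
```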
arXiv Detail & Related papers (2022-05-25T07:50:31Z)
- IDPG: An Instance-Dependent Prompt Generation Method [58.45110542003139]
Prompt tuning is a new, efficient NLP transfer learning paradigm that adds a task-specific prompt in each input instance during the model training stage.
We propose a conditional prompt generation method to generate prompts for each input instance.
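As a minimal, hypothetical sketch of instance-dependent generation (names and sizes assumed): a lightweight bottleneck MLP maps a pooled representation of the input to the prompt vectors, in contrast to the attention-based generator sketched for ID-SPAM above.

```python
import torch
import torch.nn as nn

class InstanceDependentPromptGenerator(nn.Module):
    """Sketch: pool the input embeddings, then expand them through a small
    bottleneck MLP into m instance-dependent prompt vectors."""

    def __init__(self, hidden: int, m: int = 5, bottleneck: int = 64):
        super().__init__()
        self.m = m
        self.net = nn.Sequential(
            nn.Linear(hidden, bottleneck),
            nn.Tanh(),
            nn.Linear(bottleneck, m * hidden),  # bottleneck keeps trainable params few
        )

    def forward(self, input_embeds: torch.Tensor) -> torch.Tensor:
        pooled = input_embeds.mean(dim=1)                  # (batch, hidden)
        return self.net(pooled).view(-1, self.m, input_embeds.size(-1))
```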
arXiv Detail & Related papers (2022-04-09T15:45:27Z)
- Making Pre-trained Language Models End-to-end Few-shot Learners with Contrastive Prompt Tuning [41.15017636192417]
We present CP-Tuning, the first end-to-end Contrastive Prompt Tuning framework for fine-tuning Language Models.
It integrates a task-invariant continuous prompt encoding technique with fully trainable prompt parameters.
Experiments over a variety of language understanding tasks used in IR systems and different PLMs show that CP-Tuning outperforms state-of-the-art methods.
arXiv Detail & Related papers (2022-04-01T02:24:24Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of this list (including all information) and is not responsible for any consequences.