Attentional Mixtures of Soft Prompt Tuning for Parameter-efficient
Multi-task Knowledge Sharing
- URL: http://arxiv.org/abs/2205.11961v1
- Date: Tue, 24 May 2022 10:48:33 GMT
- Title: Attentional Mixtures of Soft Prompt Tuning for Parameter-efficient
Multi-task Knowledge Sharing
- Authors: Akari Asai, Mohammadreza Salehi, Matthew E. Peters, Hannaneh
Hajishirzi
- Abstract summary: ATTEMPT is a new modular, multi-task, and parameter-efficient language model (LM) tuning approach.
It combines knowledge transferred across different tasks via a mixture of soft prompts while keeping the original LM unchanged.
It is parameter-efficient (e.g., updates 1,600 times fewer parameters than fine-tuning) and enables multi-task learning and flexible extensions.
- Score: 53.399742232323895
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This work introduces ATTEMPT (Attentional Mixture of Prompt Tuning), a new
modular, multi-task, and parameter-efficient language model (LM) tuning
approach that combines knowledge transferred across different tasks via a
mixture of soft prompts while keeping the original LM unchanged. ATTEMPT
interpolates a set of prompts trained on large-scale source tasks and a newly
initialized target task prompt using instance-wise attention computed by a
lightweight sub-network trained on multiple target tasks. ATTEMPT is
parameter-efficient (e.g., updates 1,600 times fewer parameters than
fine-tuning) and enables multi-task learning and flexible extensions;
importantly, it is also more interpretable because it demonstrates which source
tasks affect the final model decision on target tasks. Experimental results
across 17 diverse datasets show that ATTEMPT improves prompt tuning by up to a
22% absolute performance gain and outperforms or matches fully fine-tuned or
other parameter-efficient tuning approaches that use over ten times more
parameters.
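For concreteness, the mixing step described above can be sketched in a few lines. This is a minimal PyTorch illustration, not the authors' implementation: the instance pooling, the scoring sub-network, and names such as `PromptMixer` and `attn_net` are assumptions; only the overall recipe (instance-wise attention over frozen source-task prompts plus a trainable target prompt, prepended to a frozen LM's input embeddings) follows the abstract.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class PromptMixer(nn.Module):
    """Illustrative sketch of an attentional mixture of soft prompts.

    Frozen source-task prompts and a trainable target prompt are interpolated
    with instance-wise attention weights produced by a small sub-network; the
    mixed prompt is prepended to the (frozen) LM's input embeddings.
    """

    def __init__(self, source_prompts, prompt_len=100, d_model=768, d_attn=64):
        super().__init__()
        # (num_source, prompt_len, d_model), trained on source tasks and kept frozen.
        self.register_buffer("source_prompts", source_prompts)
        # Newly initialized target-task prompt (trainable).
        self.target_prompt = nn.Parameter(torch.randn(prompt_len, d_model) * 0.02)
        # Lightweight attention sub-network that scores the candidate prompts per instance.
        self.attn_net = nn.Sequential(
            nn.Linear(d_model, d_attn), nn.SiLU(), nn.Linear(d_attn, d_model)
        )

    def forward(self, input_embeds):
        # input_embeds: (batch, seq_len, d_model) from the frozen LM's embedding layer.
        instance = input_embeds.mean(dim=1)                       # (batch, d_model)
        query = self.attn_net(instance)                           # (batch, d_model)

        # Candidate prompts: frozen source prompts + trainable target prompt.
        candidates = torch.cat(
            [self.source_prompts, self.target_prompt.unsqueeze(0)], dim=0
        )                                                         # (K+1, L, d_model)
        keys = candidates.mean(dim=1)                             # (K+1, d_model)

        # Instance-wise attention over the K+1 prompts.
        scores = query @ keys.t() / keys.size(-1) ** 0.5          # (batch, K+1)
        weights = F.softmax(scores, dim=-1)

        # Interpolate the prompts and prepend the result to the input embeddings.
        mixed = torch.einsum("bk,kld->bld", weights, candidates)  # (batch, L, d_model)
        return torch.cat([mixed, input_embeds], dim=1), weights


# Usage: 3 source prompts of length 100 in a 768-d embedding space.
mixer = PromptMixer(torch.randn(3, 100, 768))
inputs = torch.randn(2, 32, 768)  # stand-in for frozen-LM token embeddings
extended, attn = mixer(inputs)
print(extended.shape, attn.shape)  # torch.Size([2, 132, 768]) torch.Size([2, 4])
```

The attention weights also give the interpretability mentioned in the abstract: per instance, they show how much each source-task prompt contributed.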
Related papers
- Enhancing Few-Shot Transfer Learning with Optimized Multi-Task Prompt Tuning through Modular Prompt Composition [0.0] (arXiv 2024-08-23)
Multi-task prompt tuning has garnered considerable attention for its inherent modularity and potential to enhance parameter-efficient transfer learning.
This paper aims to analyze and improve the performance of multiple tasks by facilitating the transfer of knowledge between their corresponding prompts in a multi-task setting.
- MoDE: Effective Multi-task Parameter Efficient Fine-Tuning with a Mixture of Dyadic Experts [6.245113492272563] (arXiv 2024-08-02)
Mixture of Dyadic Experts (MoDE) is a novel design for efficient multi-task adaptation.
Our design allows for more fine-grained mixing, thereby increasing the model's ability to jointly handle multiple tasks.
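As an illustration of what a mixture of dyadic (rank-1) experts with fine-grained mixing could look like, here is a hedged sketch: a frozen linear layer augmented with rank-1 updates that are combined by a token-wise router. The router design, the factorization, and the name `DyadicExpertLayer` are assumptions, not MoDE's exact architecture.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class DyadicExpertLayer(nn.Module):
    """Sketch: frozen linear layer plus a routed mixture of rank-1 updates."""

    def __init__(self, d_in=768, d_out=768, num_experts=8):
        super().__init__()
        self.base = nn.Linear(d_in, d_out)
        self.base.weight.requires_grad_(False)   # pretrained weight stays frozen
        self.base.bias.requires_grad_(False)
        # Each expert contributes a rank-1 (dyadic) update u_i v_i^T.
        self.u = nn.Parameter(torch.randn(num_experts, d_out) * 0.02)
        self.v = nn.Parameter(torch.randn(num_experts, d_in) * 0.02)
        self.router = nn.Linear(d_in, num_experts)  # token-wise mixing weights

    def forward(self, x):
        # x: (batch, seq, d_in)
        gate = F.softmax(self.router(x), dim=-1)        # (batch, seq, E)
        down = torch.einsum("bsd,ed->bse", x, self.v)   # project onto each v_i
        delta = torch.einsum("bse,bse,eo->bso", gate, down, self.u)
        return self.base(x) + delta


layer = DyadicExpertLayer()
print(layer(torch.randn(2, 16, 768)).shape)  # torch.Size([2, 16, 768])
```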
arXiv Detail & Related papers (2024-08-02T18:05:10Z) - Intuition-aware Mixture-of-Rank-1-Experts for Parameter Efficient Finetuning [50.73666458313015]
Large Language Models (LLMs) have demonstrated significant potential in performing multiple tasks in multimedia applications.
MoE has emerged as a promising solution, as its sparse architecture enables effective task decoupling.
Intuition-MoR1E achieves superior efficiency and 2.15% overall accuracy improvement across 14 public datasets.
- Multitask Prompt Tuning Enables Parameter-Efficient Transfer Learning [43.639430661322585] (arXiv 2023-03-06)
We propose multitask prompt tuning (MPT), which learns a single transferable prompt by distilling knowledge from multiple task-specific source prompts.
We then learn multiplicative low-rank updates to this shared prompt to efficiently adapt it to each downstream target task.
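A minimal sketch of that recipe, assuming a rank-1 instance of the multiplicative low-rank update (prompt_k = shared * (u_k v_k^T)); the class and parameter names are illustrative, not the paper's code.

```python
import torch
import torch.nn as nn


class MultitaskPrompt(nn.Module):
    """Sketch: a shared soft prompt modulated by a rank-1 multiplicative
    update per target task."""

    def __init__(self, num_tasks, prompt_len=100, d_model=768):
        super().__init__()
        # Shared prompt, e.g. distilled from task-specific source prompts.
        self.shared = nn.Parameter(torch.randn(prompt_len, d_model) * 0.02)
        # Per-task low-rank (here rank-1) multiplicative factors, initialized to identity.
        self.u = nn.Parameter(torch.ones(num_tasks, prompt_len, 1))
        self.v = nn.Parameter(torch.ones(num_tasks, 1, d_model))

    def forward(self, task_id):
        scale = self.u[task_id] @ self.v[task_id]   # (prompt_len, d_model), rank 1
        return self.shared * scale                  # element-wise modulation


prompts = MultitaskPrompt(num_tasks=4)
print(prompts(2).shape)  # torch.Size([100, 768]) soft prompt for task 2
```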
- AdaTask: A Task-aware Adaptive Learning Rate Approach to Multi-task Learning [19.201899503691266] (arXiv 2022-11-28)
We measure the task dominance degree of a parameter by the total updates of each task on this parameter.
We propose a Task-wise Adaptive learning rate approach, AdaTask, to separate the accumulative gradients, and hence the learning rate, of each task.
Experiments on computer vision and recommender system MTL datasets demonstrate that AdaTask significantly improves the performance of dominated tasks.
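A hedged sketch of task-wise adaptive learning rates: keep a separate Adagrad-style accumulator of each task's squared gradients on the shared parameters, so one dominant task does not dictate the effective step size for the rest. The exact update rule and the name `TaskwiseAdagrad` are assumptions, not AdaTask's algorithm.

```python
import torch


class TaskwiseAdagrad:
    """Sketch: per-task gradient accumulators -> per-task adaptive step sizes."""

    def __init__(self, params, num_tasks, lr=0.01, eps=1e-10):
        self.params = list(params)
        self.lr, self.eps = lr, eps
        # One squared-gradient accumulator per (task, parameter) pair.
        self.accum = [
            [torch.zeros_like(p) for p in self.params] for _ in range(num_tasks)
        ]

    @torch.no_grad()
    def step(self, task_grads):
        """task_grads[k][i]: gradient of task k's loss w.r.t. parameter i."""
        for k, grads in enumerate(task_grads):
            for i, (p, g) in enumerate(zip(self.params, grads)):
                self.accum[k][i] += g * g                        # accumulate per task
                p -= self.lr * g / (self.accum[k][i].sqrt() + self.eps)


# Toy usage: one shared parameter, two tasks with very different gradient scales.
w = torch.zeros(3, requires_grad=True)
opt = TaskwiseAdagrad([w], num_tasks=2)
for _ in range(3):
    grads_task0 = [torch.full((3,), 10.0)]   # "dominant" task
    grads_task1 = [torch.full((3,), 0.1)]    # small-gradient task
    opt.step([grads_task0, grads_task1])
print(w)
```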
- Multitask Vision-Language Prompt Tuning [103.5967011236282] (arXiv 2022-11-21)
We propose multitask vision-language prompt tuning (MVLPT), which incorporates cross-task knowledge into prompt tuning for vision-language models.
Results on 20 vision tasks demonstrate that the proposed approach outperforms all single-task baseline prompt tuning methods.
- Instance-wise Prompt Tuning for Pretrained Language Models [72.74916121511662] (arXiv 2022-06-04)
Instance-wise Prompt Tuning (IPT) is the first prompt learning paradigm that injects knowledge from the input data instances into the prompts.
IPT significantly outperforms task-based prompt learning methods, and achieves comparable performance to conventional finetuning with only 0.5% - 1.5% of tuned parameters.
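One way to make prompts instance-wise, in the spirit of the summary above, is to generate a short soft prompt from a representation of each input and prepend it; the generator below is an illustrative assumption, not IPT's actual design.

```python
import torch
import torch.nn as nn


class InstancePromptGenerator(nn.Module):
    """Sketch: derive a short soft prompt from each input instance and
    prepend it, so different inputs receive different prompts."""

    def __init__(self, d_model=768, prompt_len=8):
        super().__init__()
        self.prompt_len = prompt_len
        self.generator = nn.Sequential(
            nn.Linear(d_model, d_model),
            nn.Tanh(),
            nn.Linear(d_model, prompt_len * d_model),
        )

    def forward(self, input_embeds):
        # input_embeds: (batch, seq, d_model) token embeddings of a batch of instances.
        summary = input_embeds.mean(dim=1)                        # (batch, d_model)
        prompt = self.generator(summary)                          # (batch, L*d)
        prompt = prompt.view(-1, self.prompt_len, input_embeds.size(-1))
        return torch.cat([prompt, input_embeds], dim=1)           # prepend per instance


gen = InstancePromptGenerator()
print(gen(torch.randn(2, 32, 768)).shape)  # torch.Size([2, 40, 768])
```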
- Task Adaptive Parameter Sharing for Multi-Task Learning [114.80350786535952] (arXiv 2022-03-30)
Task Adaptive Parameter Sharing (TAPS) is a method for tuning a base model to a new task by adaptively modifying a small, task-specific subset of layers.
Compared to other methods, TAPS retains high accuracy on downstream tasks while introducing few task-specific parameters.
We evaluate our method on a suite of fine-tuning tasks and architectures (ResNet, DenseNet, ViT) and show that it achieves state-of-the-art performance while being simple to implement.
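A hedged sketch of adaptive layer selection: each layer carries a learnable task-specific score, and only layers whose score clears a threshold apply a task-specific weight delta on top of the frozen shared weights. The straight-through gate below is an illustrative choice, not necessarily the relaxation TAPS uses.

```python
import torch
import torch.nn as nn


class TaskAdaptiveLinear(nn.Module):
    """Sketch: frozen shared weight + gated task-specific delta per layer."""

    def __init__(self, d_in, d_out, threshold=0.1):
        super().__init__()
        self.shared = nn.Linear(d_in, d_out)
        self.shared.weight.requires_grad_(False)        # base model stays frozen
        self.shared.bias.requires_grad_(False)
        self.delta = nn.Parameter(torch.zeros(d_out, d_in))  # task-specific update
        self.score = nn.Parameter(torch.tensor(0.5))    # learnable "use this layer?" score
        self.threshold = threshold

    def forward(self, x):
        # Hard 0/1 gate with a straight-through gradient to the score.
        hard = (self.score > self.threshold).float()
        gate = hard + self.score - self.score.detach()
        weight = self.shared.weight + gate * self.delta
        return nn.functional.linear(x, weight, self.shared.bias)


model = nn.Sequential(TaskAdaptiveLinear(16, 32), nn.ReLU(), TaskAdaptiveLinear(32, 4))
print(model(torch.randn(5, 16)).shape)  # torch.Size([5, 4])
# Layers whose score stays below the threshold keep only the shared weights,
# so the task-specific parameter count grows only with the selected layers.
```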
- SPoT: Better Frozen Model Adaptation through Soft Prompt Transfer [7.2462572989580405] (arXiv 2021-10-15)
We propose a novel prompt-based transfer learning approach called SPoT: Soft Prompt Transfer.
We show that SPoT significantly boosts the performance of prompt tuning across many tasks.
We also conduct a large-scale study on task transferability with 26 NLP tasks and 160 combinations of source-target tasks.
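The transfer recipe above reduces to a few lines: tune a soft prompt on a source task with the LM frozen, then use it to initialize the target task's prompt before tuning again. The loop skeleton and toy losses below are illustrative stand-ins, not the paper's training setup.

```python
import torch
import torch.nn as nn


def train_prompt(prompt, steps, loss_fn):
    """Generic prompt-tuning loop: only the soft prompt is updated."""
    opt = torch.optim.Adam([prompt], lr=0.3)
    for _ in range(steps):
        opt.zero_grad()
        loss_fn(prompt).backward()
        opt.step()
    return prompt


d_model, prompt_len = 768, 100

# 1) Learn a prompt on a (large-scale) source task with the LM frozen.
source_prompt = nn.Parameter(torch.randn(prompt_len, d_model) * 0.02)
toy_source_loss = lambda p: (p - 1.0).pow(2).mean()   # stand-in for the source-task loss
train_prompt(source_prompt, steps=50, loss_fn=toy_source_loss)

# 2) Soft prompt transfer: initialize the target prompt from the source prompt
#    instead of from scratch, then continue prompt tuning on the target task.
target_prompt = nn.Parameter(source_prompt.detach().clone())
toy_target_loss = lambda p: (p + 1.0).pow(2).mean()   # stand-in for the target-task loss
train_prompt(target_prompt, steps=50, loss_fn=toy_target_loss)
print(target_prompt.mean().item())
```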
- Using a thousand optimization tasks to learn hyperparameter search strategies [53.318615663332274] (arXiv 2020-02-27)
We present TaskSet, a dataset of neural network training tasks for use in training and evaluating optimizers.
TaskSet is unique in its size and diversity, containing over a thousand tasks ranging from image classification with fully connected or convolutional networks, to variational autoencoders, to non-volume preserving flows on a variety of datasets.