Context-PEFT: Efficient Multi-Modal, Multi-Task Fine-Tuning
- URL: http://arxiv.org/abs/2312.08900v1
- Date: Thu, 14 Dec 2023 13:00:24 GMT
- Title: Context-PEFT: Efficient Multi-Modal, Multi-Task Fine-Tuning
- Authors: Avelina Asada Hadji-Kyriacou, Ognjen Arandjelovic
- Abstract summary: This paper introduces a novel Parameter-Efficient Fine-Tuning (PEFT) framework for multi-modal, multi-task transfer learning with pre-trained language models.
We propose Context-PEFT, which learns different groups of adaptor parameters based on the token's domain.
Our method is evaluated on the COCO captioning task, where it outperforms full fine-tuning under similar data constraints.
- Score: 12.648711621637663
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper introduces a novel Parameter-Efficient Fine-Tuning (PEFT)
framework for multi-modal, multi-task transfer learning with pre-trained
language models. PEFT techniques such as LoRA, BitFit and IA3 have demonstrated
comparable performance to full fine-tuning of pre-trained models for specific
downstream tasks, all while demanding significantly fewer trainable parameters
and reduced GPU memory consumption. However, in the context of multi-modal
fine-tuning, the need for architectural modifications or full fine-tuning often
becomes apparent. To address this we propose Context-PEFT, which learns
different groups of adaptor parameters based on the token's domain. This
approach enables LoRA-like weight injection without requiring additional
architectural changes. Our method is evaluated on the COCO captioning task,
where it outperforms full fine-tuning under similar data constraints while
simultaneously offering a substantially more parameter-efficient and
computationally economical solution.
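The idea of learning different groups of adaptor parameters based on the token's domain can be illustrated with a small NumPy sketch. This is not the authors' implementation: the function name `context_lora_linear`, the integer `context_ids` labelling each token's domain (here 0 for image tokens and 1 for text tokens), the shapes, and the zero initialisation of the up-projection (borrowed from standard LoRA) are all assumptions made for illustration. Each token adds a low-rank update, chosen by its context id, to the output of a frozen linear layer.

```python
import numpy as np

def context_lora_linear(x, context_ids, W, A, B, scale=1.0):
    """Frozen linear layer plus a per-token, context-selected low-rank update.

    x:           (seq, d_in) token activations
    context_ids: (seq,) integer domain label per token (hypothetical encoding)
    W:           (d_in, d_out) frozen pre-trained weight
    A:           (n_ctx, d_in, r) down-projections, one per context
    B:           (n_ctx, r, d_out) up-projections, one per context
    """
    base = x @ W                                  # frozen path, shared by all tokens
    A_tok = A[context_ids]                        # (seq, d_in, r): adaptor picked per token
    B_tok = B[context_ids]                        # (seq, r, d_out)
    low = np.einsum("sd,sdr->sr", x, A_tok)       # project each token to rank r
    delta = np.einsum("sr,sro->so", low, B_tok)   # project back up to d_out
    return base + scale * delta

# Tiny demo with made-up sizes: three image tokens followed by two text tokens.
rng = np.random.default_rng(0)
seq, d_in, d_out, r, n_ctx = 5, 8, 6, 2, 2
x = rng.normal(size=(seq, d_in))
context_ids = np.array([0, 0, 0, 1, 1])
W = rng.normal(size=(d_in, d_out))
A = rng.normal(size=(n_ctx, d_in, r)) * 0.01
B = np.zeros((n_ctx, r, d_out))  # zero init: the adapted layer starts identical to the frozen one

y = context_lora_linear(x, context_ids, W, A, B)
print(np.allclose(y, x @ W))  # True, because every update is zero at initialisation
```

In this sketch only `A` and `B` would be trained, and the layer's input and output shapes are unchanged, which is one way to read the abstract's claim of weight injection without additional architectural changes.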
Related papers
- Dynamic Adapter Meets Prompt Tuning: Parameter-Efficient Transfer Learning for Point Cloud Analysis [51.14136878142034]
Point cloud analysis has achieved outstanding performance by transferring point cloud pre-trained models.
Existing methods for model adaptation usually update all model parameters, which is inefficient as it relies on high computational costs.
In this paper, we aim to study parameter-efficient transfer learning for point cloud analysis with an ideal trade-off between task performance and parameter efficiency.
arXiv Detail & Related papers (2024-03-03T08:25:04Z)
- When Parameter-efficient Tuning Meets General-purpose Vision-language Models [65.19127815275307]
PETAL revolutionizes the training process by requiring only 0.5% of the total parameters, achieved through a unique mode approximation technique.
Our experiments reveal that PETAL not only outperforms current state-of-the-art methods in most scenarios but also surpasses full fine-tuning models in effectiveness.
arXiv Detail & Related papers (2023-12-16T17:13:08Z)
- AdaptIR: Parameter Efficient Multi-task Adaptation for Pre-trained Image Restoration Models [58.10797482129863]
We propose AdaptIR, a novel parameter efficient transfer learning method for adapting pre-trained restoration models.
Experiments demonstrate that the proposed method can achieve comparable or even better performance than full fine-tuning, while only using 0.6%.
arXiv Detail & Related papers (2023-12-12T14:27:59Z)
- Non-Intrusive Adaptation: Input-Centric Parameter-efficient Fine-Tuning for Versatile Multimodal Modeling [42.42235704360381]
Large language models (LLMs) and vision language models (VLMs) demonstrate excellent performance on a wide range of tasks.
These large scales make it impossible to adapt and deploy fully specialized models given a task of interest.
In this work, we describe AdaLink as a non-intrusive PEFT technique that achieves competitive performance.
arXiv Detail & Related papers (2023-10-18T16:43:08Z)
- Prototype-based HyperAdapter for Sample-Efficient Multi-task Tuning [30.251155072822055]
Prototype-based HyperAdapter (PHA) is a novel framework built on adapter-tuning and hypernetworks.
It introduces an instance-dense retriever and prototypical hypernetwork to generate conditional modules in a sample-efficient manner.
We show that PHA strikes a better trade-off between trainable parameters, accuracy on stream tasks, and sample efficiency.
arXiv Detail & Related papers (2023-10-18T02:42:17Z)
- DePT: Decomposed Prompt Tuning for Parameter-Efficient Fine-tuning [14.975436239088312]
We propose DePT, which decomposes the soft prompt into a shorter soft prompt and a pair of low-rank matrices that are then optimised with two different learning rates.
We demonstrate that DePT outperforms state-of-the-art PEFT approaches, including the full fine-tuning baseline, in some scenarios.
arXiv Detail & Related papers (2023-09-11T00:02:05Z)
- SLoRA: Federated Parameter Efficient Fine-Tuning of Language Models [28.764782216513037]
Federated Learning (FL) can benefit from distributed and private data of the FL edge clients for fine-tuning.
We propose a method called SLoRA, which overcomes the key limitations of LoRA in highly heterogeneous data scenarios.
Our experimental results demonstrate that SLoRA achieves performance comparable to full fine-tuning.
arXiv Detail & Related papers (2023-08-12T10:33:57Z)
- AutoPEFT: Automatic Configuration Search for Parameter-Efficient Fine-Tuning [77.61565726647784]
Motivated by advances in neural architecture search, we propose AutoPEFT for automatic PEFT configuration selection.
We show that AutoPEFT-discovered configurations significantly outperform existing PEFT methods and are on par with or better than full fine-tuning (FFT) without incurring substantial training efficiency costs.
arXiv Detail & Related papers (2023-01-28T08:51:23Z)
- Meta-Learning the Difference: Preparing Large Language Models for Efficient Adaptation [11.960178399478718]
Large pretrained language models (PLMs) are often domain- or task-adapted via fine-tuning or prompting.
Instead, we prepare PLMs for data- and parameter-efficient adaptation by learning to learn the difference between general and adapted PLMs.
arXiv Detail & Related papers (2022-07-07T18:00:22Z)
- Few-Shot Parameter-Efficient Fine-Tuning is Better and Cheaper than In-Context Learning [81.3514358542452]
Few-shot in-context learning (ICL) incurs substantial computational, memory, and storage costs because it involves processing all of the training examples every time a prediction is made.
Parameter-efficient fine-tuning offers an alternative paradigm where a small set of parameters is trained to enable a model to perform the new task.
In this paper, we rigorously compare few-shot ICL and parameter-efficient fine-tuning and demonstrate that the latter offers better accuracy as well as dramatically lower computational costs.
arXiv Detail & Related papers (2022-05-11T17:10:41Z)
- Parameter-Efficient Abstractive Question Answering over Tables or Text [60.86457030988444]
A long-term ambition of information seeking QA systems is to reason over multi-modal contexts and generate natural answers to user queries.
Memory intensive pre-trained language models are adapted to downstream tasks such as QA by fine-tuning the model on QA data in a specific modality like unstructured text or structured tables.
To avoid training such memory-hungry models while utilizing a uniform architecture for each modality, parameter-efficient adapters add and train small task-specific bottle-neck layers between transformer layers.
arXiv Detail & Related papers (2022-04-07T10:56:29Z)
This list is automatically generated from the titles and abstracts of the papers on this site.