Non-Intrusive Adaptation: Input-Centric Parameter-efficient Fine-Tuning
for Versatile Multimodal Modeling
- URL: http://arxiv.org/abs/2310.12100v1
- Date: Wed, 18 Oct 2023 16:43:08 GMT
- Title: Non-Intrusive Adaptation: Input-Centric Parameter-efficient Fine-Tuning
for Versatile Multimodal Modeling
- Authors: Yaqing Wang, Jialin Wu, Tanmaya Dabral, Jiageng Zhang, Geoff Brown,
Chun-Ta Lu, Frederick Liu, Yi Liang, Bo Pang, Michael Bendersky, Radu Soricut
- Abstract summary: Large language models (LLMs) and vision language models (VLMs) demonstrate excellent performance on a wide range of tasks.
These large scales make it impossible to adapt and deploy fully specialized models given a task of interest.
In this work, we describe AdaLink as a non-intrusive PEFT technique that achieves competitive performance.
- Score: 42.42235704360381
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Large language models (LLMs) and vision language models (VLMs) demonstrate
excellent performance on a wide range of tasks by scaling up parameter counts
from O(10^9) to O(10^12) and beyond. These large scales make
it impossible to adapt and deploy fully specialized models given a task of
interest. Parameter-efficient fine-tuning (PEFT) emerges as a promising
direction to tackle the adaptation and serving challenges for such large
models. We categorize PEFT techniques into two types: intrusive and
non-intrusive. Intrusive PEFT techniques directly change a model's internal
architecture. Though more flexible, they introduce significant complexities for
training and serving. Non-intrusive PEFT techniques leave the internal
architecture unchanged and only adapt model-external parameters, such as
embeddings for input. In this work, we describe AdaLink as a non-intrusive PEFT
technique that achieves competitive performance compared to SoTA intrusive PEFT
(LoRA) and full model fine-tuning (FT) on various tasks. We evaluate using both
text-only and multimodal tasks, with experiments that account for both
parameter-count scaling and training regime (with and without instruction
tuning).
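To make the intrusive vs. non-intrusive distinction concrete, below is a minimal PyTorch sketch of an input-centric adapter in the spirit the abstract describes: a small low-rank module applied only to the input embeddings, while the backbone stays frozen and architecturally unchanged. The class name, rank, and initialization here are illustrative assumptions, not the authors' AdaLink implementation.

```python
import torch
import torch.nn as nn

class InputEmbeddingAdapter(nn.Module):
    """Hypothetical non-intrusive adapter: a low-rank residual update
    applied to the input embeddings only; the backbone is left untouched."""

    def __init__(self, embed_dim: int, rank: int = 16):
        super().__init__()
        self.down = nn.Linear(embed_dim, rank, bias=False)
        self.up = nn.Linear(rank, embed_dim, bias=False)
        nn.init.zeros_(self.up.weight)  # start as a no-op so initial behavior matches the frozen model

    def forward(self, embeddings: torch.Tensor) -> torch.Tensor:
        # embeddings: (batch, seq_len, embed_dim) from the frozen embedding table
        return embeddings + self.up(self.down(embeddings))


# Usage sketch: only the adapter's parameters are trained.
embed_dim = 768
adapter = InputEmbeddingAdapter(embed_dim)
frozen_embeddings = torch.randn(2, 10, embed_dim)   # stand-in for the frozen model's embeddings
adapted = adapter(frozen_embeddings)                # fed to the unchanged backbone
trainable = [p for p in adapter.parameters() if p.requires_grad]
```

Because the backbone's weights and architecture are untouched, many such task-specific adapters can share a single served model, which is the serving advantage the abstract points to.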
Related papers
- Preserving Pre-trained Representation Space: On Effectiveness of Prefix-tuning for Large Multi-modal Models [24.62337386603331]
Large Multi-modal Models (LMMs) are revolutionizing the way machines interact with the world.
To adapt LMMs for downstream tasks, parameter-efficient fine-tuning (PEFT) has gained popularity.
This paper examines the strengths and weaknesses of each tuning strategy, shifting attention away from the efficiency typically associated with these approaches.
arXiv Detail & Related papers (2024-10-29T07:55:50Z) - Context-PEFT: Efficient Multi-Modal, Multi-Task Fine-Tuning [12.648711621637663]
This paper introduces a novel parameter-efficient fine-tuning (PEFT) framework for multi-modal, multi-task transfer learning with pre-trained language models.
We propose Context-PEFT, which learns different groups of adaptor parameters based on the token's domain.
Our method is evaluated on the COCO captioning task, where it outperforms full fine-tuning under similar data constraints.
arXiv Detail & Related papers (2023-12-14T13:00:24Z) - ComPEFT: Compression for Communicating Parameter Efficient Updates via
Sparsification and Quantization [100.90624220423634]
We present ComPEFT, a novel method for compressing fine-tuning residuals (task vectors) of PEFT based models.
In extensive evaluation across T5, T0, and LLaMA-based models with 200M - 65B parameters, ComPEFT achieves compression ratios of 8x - 50x.
arXiv Detail & Related papers (2023-11-22T05:28:59Z) - MatFormer: Nested Transformer for Elastic Inference [94.1789252941718]
MatFormer is a nested Transformer architecture designed to offer elasticity in a variety of deployment constraints.
We show that a 2.6B decoder-only MatFormer language model (MatLM) allows us to extract smaller models spanning from 1.5B to 2.6B.
We also observe that smaller encoders extracted from a universal MatFormer-based ViT (MatViT) encoder preserve the metric-space structure for adaptive large-scale retrieval.
arXiv Detail & Related papers (2023-10-11T17:57:14Z) - DePT: Decomposed Prompt Tuning for Parameter-Efficient Fine-tuning [14.975436239088312]
We propose DePT, which decomposes the soft prompt into a shorter soft prompt and a pair of low-rank matrices that are then optimised with two different learning rates (a minimal sketch of this decomposition appears after this list).
We demonstrate that DePT outperforms state-of-the-art PEFT approaches, including the full fine-tuning baseline, in some scenarios.
arXiv Detail & Related papers (2023-09-11T00:02:05Z) - eP-ALM: Efficient Perceptual Augmentation of Language Models [70.47962271121389]
Existing approaches for adapting pretrained models to vision-language tasks still rely on several key components that hinder their efficiency.
We instead direct effort toward efficient adaptation of existing models, augmenting language models with perception.
We show that by freezing more than 99% of total parameters, training only one linear projection layer, and prepending only one trainable token, our approach (dubbed eP-ALM) significantly outperforms other baselines on VQA and Captioning.
arXiv Detail & Related papers (2023-03-20T19:20:34Z) - AutoPEFT: Automatic Configuration Search for Parameter-Efficient
Fine-Tuning [77.61565726647784]
Motivated by advances in neural architecture search, we propose AutoPEFT for automatic PEFT configuration selection.
We show that AutoPEFT-discovered configurations significantly outperform existing PEFT methods and are on par or better than FFT without incurring substantial training efficiency costs.
arXiv Detail & Related papers (2023-01-28T08:51:23Z) - When does Parameter-Efficient Transfer Learning Work for Machine
Translation? [8.862707047517913]
Prior work indicates that PEFTs may not work as well for machine translation (MT).
We conduct a comprehensive empirical study of PEFTs for MT, considering (1) various parameter budgets, (2) a diverse set of language-pairs, and (3) different pre-trained models.
We find that using PEFTs with a larger pre-trained model outperforms full fine-tuning with a smaller model, and for smaller training data sizes, PEFTs outperform full fine-tuning for the same pre-trained model.
arXiv Detail & Related papers (2022-05-23T12:49:46Z) - UniPELT: A Unified Framework for Parameter-Efficient Language Model
Tuning [64.638804236566]
We propose a unified framework, UniPELT, which incorporates different PELT methods as submodules and learns to activate the ones that best suit the current data or task setup.
Remarkably, on the GLUE benchmark, UniPELT consistently achieves 13pt gains compared to the best individual PELT method that it incorporates and even outperforms fine-tuning under different setups.
arXiv Detail & Related papers (2021-10-14T17:40:08Z)
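Of the papers above, DePT describes the most concrete mechanism: the soft prompt is split into a shorter soft prompt plus a pair of low-rank matrices trained with two different learning rates. The sketch below illustrates that decomposition; applying the low-rank update to the input embeddings, as well as the specific lengths, ranks, and learning rates, are assumptions made for illustration rather than details taken from the paper.

```python
import torch
import torch.nn as nn

class DecomposedPrompt(nn.Module):
    """Sketch of a DePT-style decomposition (placement assumed): a short soft
    prompt plus a low-rank update A @ B added to the input embeddings."""

    def __init__(self, embed_dim: int, prompt_len: int = 20,
                 seq_len: int = 128, rank: int = 8):
        super().__init__()
        self.soft_prompt = nn.Parameter(torch.randn(prompt_len, embed_dim) * 0.02)
        # Low-rank pair; tied to a fixed sequence length in this sketch.
        self.A = nn.Parameter(torch.randn(seq_len, rank) * 0.02)
        self.B = nn.Parameter(torch.zeros(rank, embed_dim))

    def forward(self, embeddings: torch.Tensor) -> torch.Tensor:
        # embeddings: (batch, seq_len, embed_dim) from a frozen embedding table
        updated = embeddings + self.A @ self.B          # low-rank correction
        prompt = self.soft_prompt.expand(embeddings.size(0), -1, -1)
        return torch.cat([prompt, updated], dim=1)      # prepend the short prompt


frozen_embeddings = torch.randn(4, 128, 768)            # stand-in for frozen embeddings
module = DecomposedPrompt(embed_dim=768)
out = module(frozen_embeddings)

# Two learning rates, as the paper's summary describes (values are illustrative):
optimizer = torch.optim.AdamW([
    {"params": [module.soft_prompt], "lr": 3e-1},
    {"params": [module.A, module.B], "lr": 1e-4},
])
```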