Parameter-Efficient Tuning by Manipulating Hidden States of Pretrained Language Models For Classification Tasks
- URL: http://arxiv.org/abs/2204.04596v2
- Date: Wed, 13 Apr 2022 10:28:13 GMT
- Title: Parameter-Efficient Tuning by Manipulating Hidden States of Pretrained Language Models For Classification Tasks
- Authors: Haoran Yang, Piji Li, Wai Lam
- Abstract summary: We propose a simple tuning method which only introduces three trainable vectors.
We input the integrated hidden state(s) to a task-specific linear classifier to predict categories.
This scheme is similar to the way ELMo utilises hidden states, except that ELMo feeds the hidden states into LSTM-based models.
- Score: 49.807185872741066
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Parameter-efficient tuning aims to distill knowledge for downstream tasks by
optimizing a few introduced parameters while freezing the pretrained language models (PLMs).
Continuous prompt tuning, which prepends a few trainable vectors to the input embeddings, is
one such method and has drawn much attention due to its effectiveness and efficiency. This
family of methods can be viewed as applying nonlinear transformations to the hidden states
inside PLMs. However, a natural question has been overlooked: can the hidden states be used
directly for classification without changing them? In this paper, we aim to answer this
question by proposing a simple tuning method that introduces only three trainable vectors.
First, we integrate the hidden states of all layers using the introduced vectors. Then, we
feed the integrated hidden state(s) into a task-specific linear classifier to predict
categories. This scheme is similar to the way ELMo utilises hidden states, except that ELMo
feeds the hidden states into LSTM-based models. Although our proposed tuning scheme is simple,
it achieves performance comparable to prompt tuning methods such as P-tuning and P-tuning v2,
verifying that the original hidden states do contain useful information for classification
tasks. Moreover, our method has an advantage over prompt tuning in terms of both time and the
number of parameters.
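To make the description above more concrete, below is a minimal PyTorch sketch of the general recipe: an ELMo-style mix of frozen per-layer hidden states followed by a task-specific linear classifier. The abstract does not spell out the exact roles of the three trainable vectors, so the layer-weight vector and token-pooling vector here (plus the classifier) are illustrative assumptions, not the authors' exact parameterization.

```python
import torch
import torch.nn as nn


class HiddenStateMixClassifier(nn.Module):
    """Classify from the frozen hidden states of all PLM layers; only this head is trained."""

    def __init__(self, num_states: int, hidden_size: int, num_classes: int):
        super().__init__()
        # Trainable vector 1: per-layer mixing weights (ELMo-style scalar mix).
        self.layer_weights = nn.Parameter(torch.zeros(num_states))
        # Trainable vector 2: attention-style pooling query over tokens (an assumption).
        self.token_query = nn.Parameter(torch.zeros(hidden_size))
        # Task-specific linear classifier on top of the integrated hidden state.
        self.classifier = nn.Linear(hidden_size, num_classes)

    def forward(self, all_hidden_states, attention_mask):
        # all_hidden_states: tuple of (batch, seq_len, hidden) tensors, one per layer.
        stacked = torch.stack(tuple(all_hidden_states), dim=0)   # (L, B, T, H)
        w = torch.softmax(self.layer_weights, dim=0)             # (L,)
        mixed = (w[:, None, None, None] * stacked).sum(dim=0)    # (B, T, H)
        # Pool over tokens, ignoring padding positions.
        scores = mixed @ self.token_query                        # (B, T)
        scores = scores.masked_fill(attention_mask == 0, float("-inf"))
        probs = torch.softmax(scores, dim=-1)                    # (B, T)
        pooled = (probs.unsqueeze(-1) * mixed).sum(dim=1)        # (B, H)
        return self.classifier(pooled)                           # (B, num_classes)
```

In use, one would run the frozen PLM with hidden-state output enabled, pass the resulting tuple of hidden states and the attention mask into this head, and train only the head's parameters.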
Related papers
- LoFiT: Localized Fine-tuning on LLM Representations [60.99814930367597]
We introduce a framework called Localized Fine-Tuning on LLM Representations (LoFiT).
LoFiT identifies a subset of attention heads that are most important for learning a specific task, then trains offset vectors that are added to the model's hidden representations at those selected heads (a rough sketch of this idea follows the list below).
For truthfulness and reasoning tasks, we find that LoFiT's intervention vectors are more effective for LLM adaptation than vectors from representation intervention methods such as Inference-time Intervention.
arXiv Detail & Related papers (2024-06-03T17:45:41Z)
- Manifold-based Verbalizer Space Re-embedding for Tuning-free Prompt-based Classification [34.33544689818836]
We propose a tuning-free manifold-based space re-embedding method called Locally Linear Embedding with Intra-class Neighborhood Constraint.
Our approach further enhances prompt-based tuning by up to 3.2%.
arXiv Detail & Related papers (2023-09-08T07:42:29Z)
- On the Effectiveness of LayerNorm Tuning for Continual Learning in Vision Transformers [47.77328392236625]
State-of-the-art rehearsal-free continual learning methods exploit the peculiarities of Vision Transformers to learn task-specific prompts.
We introduce a two-stage training procedure, where we first optimize the task-specific parameters and then train the classifier using the same selection procedure employed at inference time.
Our method achieves results that are either superior or on par with the state of the art while being computationally cheaper.
arXiv Detail & Related papers (2023-08-18T15:11:16Z)
- Regularized Mask Tuning: Uncovering Hidden Knowledge in Pre-trained Vision-Language Models [89.07925369856139]
We design a new type of tuning method, termed regularized mask tuning, which masks the network parameters through a learnable selection.
Inspired by neural pathways, we argue that the knowledge required by a downstream task already exists in the pre-trained weights but just gets concealed in the upstream pre-training stage.
It is noteworthy that we manage to deliver an 18.73% performance improvement over zero-shot CLIP by masking an average of only 2.56% of the parameters.
arXiv Detail & Related papers (2023-07-27T17:56:05Z)
- Approximated Prompt Tuning for Vision-Language Pre-trained Models [54.326232586461614]
In vision-language pre-trained models, prompt tuning often requires a large number of learnable tokens to bridge the gap between the pre-training and downstream tasks.
We propose a novel Approximated Prompt Tuning (APT) approach towards efficient VL transfer learning.
arXiv Detail & Related papers (2023-06-27T05:43:47Z)
- Hidden State Variability of Pretrained Language Models Can Guide Computation Reduction for Transfer Learning [16.60284838029852]
We investigate whether one can make a task-specific selection of which subset of the layers to adapt.
We propose to select layers based on the variability of their hidden states given a task-specific corpus.
arXiv Detail & Related papers (2022-10-18T17:58:43Z)
- Few-Shot Parameter-Efficient Fine-Tuning is Better and Cheaper than In-Context Learning [81.3514358542452]
Few-shot in-context learning (ICL) incurs substantial computational, memory, and storage costs because it involves processing all of the training examples every time a prediction is made.
Parameter-efficient fine-tuning offers an alternative paradigm in which a small set of parameters is trained to enable a model to perform the new task.
In this paper, we rigorously compare few-shot ICL and parameter-efficient fine-tuning and demonstrate that the latter offers better accuracy as well as dramatically lower computational costs.
arXiv Detail & Related papers (2022-05-11T17:10:41Z)
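As a companion to the LoFiT entry in the list above, here is a rough PyTorch sketch of the localized-offset idea: trainable bias vectors added to the hidden representations of a chosen subset of attention heads while the rest of the model stays frozen. The head-selection step and the exact point at which LoFiT applies its offsets are not given in the summary above, so the `selected_heads` argument and the `apply` call site are illustrative assumptions, not LoFiT's actual implementation.

```python
import torch
import torch.nn as nn


class HeadOffsets(nn.Module):
    """Trainable per-head offset vectors for a frozen transformer (LoFiT-style sketch)."""

    def __init__(self, selected_heads, head_dim):
        # selected_heads: list of (layer_idx, head_idx) pairs judged most task-relevant
        # by some prior selection step (assumed to be done elsewhere).
        super().__init__()
        self.selected_heads = list(selected_heads)
        self.offsets = nn.ParameterDict({
            f"{layer}_{head}": nn.Parameter(torch.zeros(head_dim))
            for layer, head in self.selected_heads
        })

    def apply(self, layer_idx, head_outputs):
        # head_outputs: (batch, seq_len, num_heads, head_dim) from one attention layer.
        # Offsets are added only at the selected heads of the matching layer.
        num_heads = head_outputs.size(2)
        out = head_outputs
        for layer, head in self.selected_heads:
            if layer != layer_idx:
                continue
            mask = torch.zeros(num_heads, 1, device=out.device, dtype=out.dtype)
            mask[head] = 1.0
            # (num_heads, 1) * (head_dim,) broadcasts to (num_heads, head_dim).
            out = out + mask * self.offsets[f"{layer}_{head}"]
        return out
```

In practice one would call `apply` from a forward hook (or a small patch of the attention module) at each selected layer and train only the offset parameters.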