LoFiT: Localized Fine-tuning on LLM Representations
- URL: http://arxiv.org/abs/2406.01563v2
- Date: Thu, 31 Oct 2024 02:04:53 GMT
- Title: LoFiT: Localized Fine-tuning on LLM Representations
- Authors: Fangcong Yin, Xi Ye, Greg Durrett
- Abstract summary: We introduce a framework called Localized Fine-Tuning on LLM Representations (LoFiT).
LoFiT identifies a subset of attention heads that are most important for learning a specific task, then trains offset vectors to add to the model's hidden representations at those selected heads.
For truthfulness and reasoning tasks, we find that LoFiT's intervention vectors are more effective for LLM adaptation than vectors from representation intervention methods such as Inference-time Intervention.
- Abstract: Recent work in interpretability shows that large language models (LLMs) can be adapted for new tasks in a learning-free way: it is possible to intervene on LLM representations to elicit desired behaviors for alignment. For instance, adding certain bias vectors to the outputs of certain attention heads is reported to boost the truthfulness of models. In this work, we show that localized fine-tuning serves as an effective alternative to such representation intervention methods. We introduce a framework called Localized Fine-Tuning on LLM Representations (LoFiT), which identifies a subset of attention heads that are most important for learning a specific task, then trains offset vectors to add to the model's hidden representations at those selected heads. LoFiT localizes to a sparse set of heads (3%-10%) and learns the offset vectors from limited training data, comparable to the settings used for representation intervention. For truthfulness and reasoning tasks, we find that LoFiT's intervention vectors are more effective for LLM adaptation than vectors from representation intervention methods such as Inference-time Intervention. We also find that the localization step is important: selecting a task-specific set of attention heads can lead to higher performance than intervening on heads selected for a different task. Finally, across 7 tasks we study, LoFiT achieves comparable performance to other parameter-efficient fine-tuning methods such as LoRA, despite modifying 20x-200x fewer parameters than these methods.
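To make the mechanics concrete, here is a minimal PyTorch sketch of a LoFiT-style intervention (not the authors' code). The (layer, head) pairs, head dimension, and the hook point into the model are illustrative assumptions; the key idea is that only the per-head offset vectors are trained while the base LLM stays frozen.
```python
import torch
import torch.nn as nn

# Hypothetical (layer, head) pairs; LoFiT selects roughly 3%-10% of heads,
# e.g. via the norms of learned per-head scaling factors on the task data.
selected_heads = [(12, 3), (15, 7), (20, 1)]
HEAD_DIM = 128  # illustrative head dimension

# One trainable offset vector per selected head; the base LLM stays frozen.
offsets = nn.ParameterDict({
    f"{l}_{h}": nn.Parameter(torch.zeros(HEAD_DIM)) for l, h in selected_heads
})

def apply_lofit(layer_idx: int, attn_out: torch.Tensor) -> torch.Tensor:
    """Add learned offsets to the selected heads of one layer's attention output.

    attn_out: (batch, seq_len, num_heads, head_dim), taken before the
    attention output projection (an assumed hook point).
    """
    for l, h in selected_heads:
        if l == layer_idx:
            attn_out = attn_out.clone()  # keep autograd happy
            attn_out[..., h, :] = attn_out[..., h, :] + offsets[f"{l}_{h}"]
    return attn_out

# Training: run the frozen LLM with apply_lofit hooked into each layer and
# optimize only `offsets` on a small amount of task data.
```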
Related papers
- Show, Don't Tell: Aligning Language Models with Demonstrated Feedback [54.10302745921713]
Demonstration ITerated Task Optimization (DITTO) directly aligns language model outputs to a user's demonstrated behaviors.
We evaluate DITTO's ability to learn fine-grained style and task alignment across domains such as news articles, emails, and blog posts.
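A heavily hedged sketch of the iterated-comparison idea: user demonstrations are treated as preferred over the model's own samples and the policy is updated with a DPO-style preference loss. The function name, argument layout, and `beta` are illustrative, not the paper's exact formulation.
```python
import torch
import torch.nn.functional as F

def preference_loss(logp_win, logp_lose, ref_logp_win, ref_logp_lose, beta=0.1):
    """DPO-style loss pushing demonstrations ("winners") above model samples."""
    margin = beta * ((logp_win - ref_logp_win) - (logp_lose - ref_logp_lose))
    return -F.logsigmoid(margin).mean()

# Each iteration: sample completions from the current policy, pair them as
# "losers" against the user's demonstrations ("winners"), recompute sequence
# log-probabilities under policy and reference, and step on preference_loss.
```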
arXiv Detail & Related papers (2024-06-02T23:13:56Z)
- MoRA: High-Rank Updating for Parameter-Efficient Fine-Tuning [105.11844150736536]
Low-rank adaptation is a popular parameter-efficient fine-tuning method for large language models.
We propose a new method called MoRA, which employs a square matrix to achieve high-rank updating while maintaining the same number of trainable parameters.
Our method outperforms LoRA on memory-intensive tasks and achieves comparable performance on other tasks.
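A minimal sketch of the square-matrix idea under simplifying assumptions: a square layer whose width is divisible by r, and reshape-based compress/decompress maps (the paper studies several parameter-free operators). Names and defaults are illustrative.
```python
import torch
import torch.nn as nn

class MoRALinear(nn.Module):
    """Frozen linear layer plus a high-rank update from one r x r square matrix."""
    def __init__(self, base: nn.Linear, r: int = 8):
        super().__init__()
        assert base.in_features == base.out_features and base.in_features % r == 0
        self.base, self.r = base, r
        for p in self.base.parameters():
            p.requires_grad = False
        self.M = nn.Parameter(torch.zeros(r, r))  # the only trainable weights

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        *lead, d = x.shape
        # Compress: group the d input features into d/r chunks of width r
        # (parameter-free).
        z = x.reshape(*lead, d // self.r, self.r)
        # Shared square matrix: each chunk gets an update of rank up to r,
        # so the overall update is high-rank, unlike a rank-r product B @ A.
        z = z @ self.M.T
        # Decompress: flatten back to width d (parameter-free).
        return self.base(x) + z.reshape(*lead, d)

# Usage (illustrative): layer = MoRALinear(nn.Linear(4096, 4096), r=8)
```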
arXiv Detail & Related papers (2024-05-20T15:48:32Z)
- Distribution-Aware Prompt Tuning for Vision-Language Models [20.02599087680773]
A key to prompt tuning is aligning the feature spaces of the two modalities via learnable vectors while the model parameters are kept fixed.
Inspired by this observation, we propose distribution-aware prompt tuning (DAPT) for vision-language models.
Our experiments on 11 benchmark datasets demonstrate that our method significantly improves generalizability.
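For context, a minimal sketch of the underlying prompt-tuning mechanics (learnable vectors, frozen backbone). DAPT's distribution-aware objectives, which shape how these vectors are learned, are not shown; `PromptTuner` and all shapes are illustrative.
```python
import torch
import torch.nn as nn

class PromptTuner(nn.Module):
    """Prepends learnable context vectors to the embeddings of a frozen model."""
    def __init__(self, backbone: nn.Module, embed_dim: int, n_prompts: int = 16):
        super().__init__()
        self.backbone = backbone
        for p in self.backbone.parameters():
            p.requires_grad = False  # model parameters fixed
        self.prompts = nn.Parameter(torch.randn(n_prompts, embed_dim) * 0.02)

    def forward(self, token_embeds: torch.Tensor) -> torch.Tensor:
        # token_embeds: (batch, seq_len, embed_dim)
        ctx = self.prompts.unsqueeze(0).expand(token_embeds.size(0), -1, -1)
        return self.backbone(torch.cat([ctx, token_embeds], dim=1))
```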
arXiv Detail & Related papers (2023-09-06T23:49:11Z)
- On the Effectiveness of LayerNorm Tuning for Continual Learning in Vision Transformers [47.77328392236625]
State-of-the-art rehearsal-free continual learning methods exploit the peculiarities of Vision Transformers to learn task-specific prompts.
We introduce a two-stage training procedure, where we first optimize the task-specific parameters and then train the classifier with the same selection procedure used at inference time.
Our method achieves results that are either superior or on par with the state of the art while being computationally cheaper.
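A minimal sketch of the two-stage parameter selection, assuming a standard `nn.LayerNorm`-based Vision Transformer; the helper names are illustrative.
```python
import torch.nn as nn

def stage1_layernorm_params(model: nn.Module):
    """Stage 1: freeze everything except LayerNorm scale/shift parameters."""
    for p in model.parameters():
        p.requires_grad = False
    tunable = []
    for m in model.modules():
        if isinstance(m, nn.LayerNorm):
            for p in m.parameters():
                p.requires_grad = True
                tunable.append(p)
    return tunable  # pass these to the stage-1 optimizer

def stage2_classifier_params(model: nn.Module, head: nn.Module):
    """Stage 2: freeze the whole backbone; train only the task classifier head."""
    for p in model.parameters():
        p.requires_grad = False
    return list(head.parameters())
```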
arXiv Detail & Related papers (2023-08-18T15:11:16Z)
- Parameter-Efficient Tuning by Manipulating Hidden States of Pretrained Language Models For Classification Tasks [49.807185872741066]
We propose a simple tuning method that introduces only three trainable vectors.
We input the integrated hidden state(s) to a task-specific linear classifier to predict categories.
This scheme is similar to the way ELMo utilises hidden states, except that ELMo feeds its hidden states to LSTM-based models.
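A hedged sketch consistent with this summary (the paper's exact use of the three vectors may differ): three trainable vectors modulate and pool the frozen model's hidden states before a task-specific linear classifier.
```python
import torch
import torch.nn as nn

class ThreeVectorHead(nn.Module):
    """Three trainable vectors integrate frozen hidden states for classification."""
    def __init__(self, hidden_dim: int, num_classes: int):
        super().__init__()
        self.scale = nn.Parameter(torch.ones(hidden_dim))   # vector 1
        self.shift = nn.Parameter(torch.zeros(hidden_dim))  # vector 2
        self.query = nn.Parameter(torch.zeros(hidden_dim))  # vector 3
        self.classifier = nn.Linear(hidden_dim, num_classes)

    def forward(self, hidden: torch.Tensor) -> torch.Tensor:
        # hidden: (batch, seq_len, hidden_dim) from a frozen pretrained LM.
        h = hidden * self.scale + self.shift          # elementwise modulation
        attn = torch.softmax(h @ self.query, dim=1)   # (batch, seq_len) weights
        pooled = (attn.unsqueeze(-1) * h).sum(dim=1)  # integrated hidden state
        return self.classifier(pooled)
```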
arXiv Detail & Related papers (2022-04-10T04:14:02Z)
- Task-guided Disentangled Tuning for Pretrained Language Models [16.429787408467703]
We propose Task-guided Disentangled Tuning (TDT) for pretrained language models (PLMs).
TDT enhances the generalization of representations by disentangling task-relevant signals from entangled representations.
Experimental results on GLUE and CLUE benchmarks show that TDT gives consistently better results than fine-tuning with different PLMs.
arXiv Detail & Related papers (2022-03-22T03:11:39Z)
- Conditional Meta-Learning of Linear Representations [57.90025697492041]
Standard meta-learning for representation learning aims to find a common representation to be shared across multiple tasks; when the tasks are heterogeneous, a single shared representation can fit some of them poorly.
In this work we overcome this issue by inferring a conditioning function, mapping the tasks' side information into a representation tailored to the task at hand.
We propose a meta-algorithm capable of leveraging this advantage in practice.
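A minimal sketch of the conditioning idea, with illustrative names and shapes: a meta-learned map turns task side information into a task-tailored linear representation, which a cheap within-task learner then uses.
```python
import torch
import torch.nn as nn

class ConditionalRep(nn.Module):
    """Meta-learned conditioning: side information -> task-tailored representation."""
    def __init__(self, side_dim: int, in_dim: int, rep_dim: int):
        super().__init__()
        # The conditioning function is shared (meta-learned) across tasks.
        self.conditioner = nn.Linear(side_dim, rep_dim * in_dim)
        self.in_dim, self.rep_dim = in_dim, rep_dim

    def forward(self, side_info: torch.Tensor, x: torch.Tensor) -> torch.Tensor:
        # side_info: (side_dim,) for one task; x: (n_samples, in_dim)
        D = self.conditioner(side_info).view(self.rep_dim, self.in_dim)
        return x @ D.T  # task-specific features for a within-task learner
```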
arXiv Detail & Related papers (2021-03-30T12:02:14Z)