Parameter-Efficient Language Model Tuning with Active Learning in
Low-Resource Settings
- URL: http://arxiv.org/abs/2305.14576v2
- Date: Mon, 23 Oct 2023 13:58:26 GMT
- Title: Parameter-Efficient Language Model Tuning with Active Learning in
Low-Resource Settings
- Authors: Josip Jukić, Jan Šnajder
- Abstract summary: We study the interplay between active learning (AL) and parameter-efficient fine-tuning (PEFT) in low-resource settings for text classification tasks.
Our findings affirm the superiority of PEFT over full fine-tuning (FFT) in low-resource settings and demonstrate that this advantage persists in AL setups.
Our research underscores the synergistic potential of AL and PEFT in low-resource settings, paving the way for advancements in efficient and effective fine-tuning.
- Score: 3.490038106567192
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Pre-trained language models (PLMs) have ignited a surge in demand for
effective fine-tuning techniques, particularly in low-resource domains and
languages. Active learning (AL), a set of algorithms designed to decrease
labeling costs by minimizing label complexity, has shown promise in confronting
the labeling bottleneck. In parallel, adapter modules designed for
parameter-efficient fine-tuning (PEFT) have demonstrated notable potential in
low-resource settings. However, the interplay between AL and adapter-based PEFT
remains unexplored. We present an empirical study of PEFT behavior with AL in
low-resource settings for text classification tasks. Our findings affirm the
superiority of PEFT over full fine-tuning (FFT) in low-resource settings and
demonstrate that this advantage persists in AL setups. We further examine the
properties of PEFT and FFT through the lens of forgetting dynamics and
instance-level representations, where we find that PEFT yields more stable
representations of early and middle layers compared to FFT. Our research
underscores the synergistic potential of AL and PEFT in low-resource settings,
paving the way for advancements in efficient and effective fine-tuning.
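To make the studied setup concrete, below is a minimal sketch of pool-based active learning combined with adapter-style PEFT, using entropy-based uncertainty sampling and LoRA via the Hugging Face peft library as assumed stand-ins; the backbone model, hyperparameters, and acquisition strategy are illustrative, not the paper's exact configuration.

```python
# Minimal sketch of one AL round with adapter-style PEFT (LoRA).
# Entropy acquisition and all hyperparameters are illustrative assumptions,
# not the paper's experimental configuration.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer
from peft import LoraConfig, get_peft_model, TaskType

base = "bert-base-uncased"  # assumed backbone for illustration
tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForSequenceClassification.from_pretrained(base, num_labels=2)
model = get_peft_model(model, LoraConfig(task_type=TaskType.SEQ_CLS, r=8, lora_alpha=16))

def entropy_scores(texts, batch_size=32):
    """Predictive entropy of the current model on unlabeled texts (higher = more uncertain)."""
    scores = []
    model.eval()
    with torch.no_grad():
        for i in range(0, len(texts), batch_size):
            enc = tokenizer(texts[i:i + batch_size], padding=True, truncation=True, return_tensors="pt")
            probs = torch.softmax(model(**enc).logits, dim=-1)
            scores.append(-(probs * probs.log()).sum(dim=-1))
    return torch.cat(scores)

def select_queries(pool_texts, k=50):
    """Query the k most uncertain pool instances for labeling; then fine-tune only the adapters."""
    return entropy_scores(pool_texts).topk(k).indices.tolist()
```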
Related papers
- From LLMs to Edge: Parameter-Efficient Fine-Tuning on Edge Devices [3.4233698915405544]
This paper benchmarks and analyzes popular PEFT methods on convolutional architectures typically deployed in resource-constrained edge environments.
We find that the evaluated PEFT methods are only half as memory-efficient when applied to depthwise-separable convolution architectures.
arXiv Detail & Related papers (2025-07-31T13:23:21Z) - Implicit Reward as the Bridge: A Unified View of SFT and DPO Connections [65.36449542323277]
We present a unified theoretical framework bridging Supervised Fine-Tuning (SFT) and preference learning in Large Language Model (LLM) post-training.
We propose a simple yet effective learning rate reduction approach that yields significant performance improvements.
arXiv Detail & Related papers (2025-06-15T05:42:29Z) - FLoE: Fisher-Based Layer Selection for Efficient Sparse Adaptation of Low-Rank Experts [47.35092228595656]
FLoE is a novel PEFT framework that introduces two key innovations: (i) a Fisher information-guided importance scoring mechanism to dynamically identify task-critical transformer layers for MoE-based low-rank adaptation, enabling sparse adapter deployment; and (ii) a Bayesian optimization-driven rank allocator that automatically determines optimal LoRA ranks on specific datasets without exhaustive grid search.
Experiments across diverse LLMs and benchmarks reveal that FLoE achieves impressive efficiency-accuracy trade-offs, making FLoE particularly advantageous in resource-constrained environments that necessitate rapid adaptation.
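For intuition, here is a hedged sketch of Fisher-guided layer scoring: rank transformer layers by accumulated squared gradients (a diagonal empirical Fisher approximation) and place adapters only in the top-scoring layers. The name-based layer grouping and the scoring details are assumptions for illustration, not FLoE's exact procedure.

```python
# Sketch: score layers by a diagonal empirical Fisher approximation
# (accumulated squared gradients of the task loss). Parsing the layer index
# from parameter names is an illustrative assumption, not FLoE's method.
import re
from collections import defaultdict

def fisher_layer_scores(model, batches, loss_fn):
    scores = defaultdict(float)
    model.train()
    for inputs, labels in batches:
        model.zero_grad()
        loss = loss_fn(model(inputs), labels)  # assumes the model returns logits directly
        loss.backward()
        for name, p in model.named_parameters():
            if p.grad is None:
                continue
            m = re.search(r"layer\.(\d+)\.", name)  # layer index from parameter name (assumption)
            if m:
                scores[int(m.group(1))] += p.grad.pow(2).sum().item()
    return dict(sorted(scores.items(), key=lambda kv: -kv[1]))  # highest Fisher mass first
```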
arXiv Detail & Related papers (2025-05-31T10:27:08Z) - Look Within or Look Beyond? A Theoretical Comparison Between Parameter-Efficient and Full Fine-Tuning [50.05207363001145]
Parameter-Efficient Fine-Tuning (PEFT) methods achieve performance comparable to Full Fine-Tuning (FFT).
We compare the characteristics of PEFT and FFT in terms of representational capacity and robustness based on optimization theory.
Experiments on 15 datasets encompassing classification, generation, reasoning, and instruction fine-tuning tasks and 11 adversarial test sets validate our theories.
arXiv Detail & Related papers (2025-05-28T13:35:12Z) - Parameter-Efficient Fine-Tuning with Column Space Projection [4.379304291229695]
We propose PiCa, the first theoretically grounded PEFT method based on the spectral properties of fine-tuned weights.
We show that PiCa achieves state-of-the-art performance compared to existing PEFT methods.
arXiv Detail & Related papers (2025-05-26T16:52:40Z) - A Comprehensive Evaluation of Parameter-Efficient Fine-Tuning on Method-Level Code Smell Detection [11.9757082688031]
Existing detection methods, relying on heuristics or Machine Learning (ML) and Deep Learning (DL) techniques, often face limitations such as unsatisfactory performance.
This study evaluates state-of-the-art PEFT methods on both small and large Language Models for detecting two types of method-level code smells: Complex Conditional and Complex Method.
Results show that PEFT methods achieve comparable or better performance than full fine-tuning while consuming less GPU memory.
arXiv Detail & Related papers (2024-12-18T12:48:36Z) - Refining Salience-Aware Sparse Fine-Tuning Strategies for Language Models [14.68920095399595]
Sparsity-based PEFT (SPEFT) introduces trainable sparse adaptations to the weight matrices in the model.
We conduct the first systematic evaluation of salience metrics for SPEFT, inspired by zero-cost NAS proxies.
Our work challenges the notion that complexity is necessary for effective PEFT.
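As an illustration of the kind of salience metric considered in this line of work, the sketch below scores weight entries by |gradient × weight| and keeps only the top fraction as trainable sparse updates; the metric choice and the 5% budget are assumptions, not the paper's selected configuration.

```python
# Sketch: a zero-cost-style salience score used to choose which entries of a weight
# matrix receive trainable sparse updates. Metric and budget are illustrative assumptions.
import torch

def salience_mask(weight: torch.Tensor, grad: torch.Tensor, keep_ratio: float = 0.05):
    scores = (weight * grad).abs().flatten()
    k = max(1, int(keep_ratio * scores.numel()))
    threshold = scores.topk(k).values.min()
    return (weight * grad).abs() >= threshold  # boolean mask of trainable positions

# Usage idea: delta = torch.zeros_like(W, requires_grad=True); effective weight = W + mask * delta
```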
arXiv Detail & Related papers (2024-12-18T04:14:35Z) - ALoRE: Efficient Visual Adaptation via Aggregating Low Rank Experts [71.91042186338163]
ALoRE is a novel PETL method that reuses the hypercomplex parameterized space constructed by the Kronecker product to Aggregate Low Rank Experts.
Thanks to the artful design, ALoRE maintains negligible extra parameters and can be effortlessly merged into the frozen backbone.
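A rough sketch of the underlying parameterization idea: each expert contributes a low-rank update embedded in a Kronecker-structured (hypercomplex-style) space, and the experts are summed. The shapes, rank, and expert count below are illustrative assumptions, not ALoRE's actual design.

```python
# Sketch: Kronecker-structured aggregation of low-rank experts (illustrative only).
import torch

d, n, r, experts = 768, 4, 8, 3   # hidden size, Kronecker divisor, rank, number of experts
S = torch.randn(experts, n, n)            # small structure matrices, one per expert
U = torch.randn(experts, d // n, r)       # low-rank left factors
V = torch.randn(experts, r, d // n)       # low-rank right factors

# Aggregate low-rank experts: delta_W = sum_e kron(S_e, U_e @ V_e), shape (d, d)
delta_W = sum(torch.kron(S[e], U[e] @ V[e]) for e in range(experts))
print(delta_W.shape)  # torch.Size([768, 768])
```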
arXiv Detail & Related papers (2024-12-11T12:31:30Z) - Layer-wise Importance Matters: Less Memory for Better Performance in Parameter-efficient Fine-tuning of Large Language Models [19.163639128631534]
Importance-aware Sparse Tuning (IST) is a plug-and-play technique compatible with various PEFT methods that operate on a per-layer basis.
IST dynamically updates selected layers in PEFT modules, leading to reduced memory demands.
arXiv Detail & Related papers (2024-10-15T16:53:26Z) - LoRTA: Low Rank Tensor Adaptation of Large Language Models [70.32218116940393]
Low Rank Adaptation (LoRA) is a popular Parameter-Efficient Fine-Tuning (PEFT) method.
We propose a higher-order Candecomp/Parafac (CP) decomposition, enabling a more compact and flexible representation.
Our method can achieve a reduction in the number of parameters while maintaining comparable performance.
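A minimal sketch of the idea under assumed dimensions: factor the stack of per-layer weight updates with a CP decomposition so that factors are shared across layers, which reduces parameters relative to independent per-layer LoRA factors.

```python
# Sketch: CP (Candecomp/Parafac) factorization of stacked per-layer weight updates.
# Dimensions and rank are illustrative assumptions, not the paper's settings.
import torch

d_out, d_in, num_layers, R = 768, 768, 12, 8
A = torch.randn(d_out, R)        # output-dimension factors
B = torch.randn(d_in, R)         # input-dimension factors
C = torch.randn(num_layers, R)   # per-layer factors

# delta_W[l] = sum_r C[l, r] * outer(A[:, r], B[:, r])
delta_W = torch.einsum("or,ir,lr->loi", A, B, C)
print(delta_W.shape)  # torch.Size([12, 768, 768])
# Parameter count: (d_out + d_in + num_layers) * R, versus num_layers * (d_out + d_in) * R for per-layer LoRA.
```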
arXiv Detail & Related papers (2024-10-05T06:59:50Z) - BIPEFT: Budget-Guided Iterative Search for Parameter Efficient Fine-Tuning of Large Pretrained Language Models [63.52035708182815]
We introduce a novel Budget-guided Iterative search strategy for automatic PEFT (BIPEFT).
BIPEFT employs a new iterative search strategy to disentangle the binary module and rank dimension search spaces.
Extensive experiments on public benchmarks demonstrate the superior performance of BIPEFT for downstream tasks with a low parameter budget.
arXiv Detail & Related papers (2024-10-04T18:50:46Z) - Lessons and Insights from a Unifying Study of Parameter-Efficient Fine-Tuning (PEFT) in Visual Recognition [36.031972728327894]
We conduct a unifying empirical study of representative PEFT methods with Vision Transformers.
We find that different PEFT methods achieve similar accuracy in the low-shot benchmark VTAB-1K.
Despite similar accuracy, we find that PEFT methods make different mistakes and high-confidence predictions, likely due to their different inductive biases.
arXiv Detail & Related papers (2024-09-24T19:57:40Z) - Exploring Parameter-Efficient Fine-Tuning of Large Language Model on Automated Program Repair [5.6679735367798925]
"Pre-training and fine-tuning" paradigm enables Large Language Models (LLMs) improve fixing capabilities on Automated Program Repair (APR)
To fill this gap, we first employ prompt engineering to create an instruction dataset, APR-INSTRUCTION.
The best fine-tuned model fixes 58% more bugs than the state-of-the-art LLM-based APR techniques.
arXiv Detail & Related papers (2024-06-09T04:42:19Z) - Unlocking Parameter-Efficient Fine-Tuning for Low-Resource Language Translation [3.558269947618352]
We evaluate the performance of 8 PEFT methods with a total of 15 architectures using the SacreBLEU score.
We show that 6 PEFT architectures outperform the baseline on both in-domain and out-of-domain tests.
The Houlsby+Inversion adapter has the best performance overall, proving the effectiveness of PEFT methods.
arXiv Detail & Related papers (2024-04-05T16:42:28Z) - Empirical Studies of Parameter Efficient Methods for Large Language Models of Code and Knowledge Transfer to R [1.9799527196428242]
We evaluate the PEFT methods LoRA, Compacter, and IA3 on Large Language Models for code summarization and generation.
Our experiments reveal that LoRA consistently outperforms Compacter and IA3 in all settings.
Our study can direct future research in developing code intelligence tasks for unseen languages, including R.
arXiv Detail & Related papers (2024-03-16T03:12:45Z) - LoRETTA: Low-Rank Economic Tensor-Train Adaptation for
Ultra-Low-Parameter Fine-Tuning of Large Language Models [20.5908375260123]
Various parameter-efficient fine-tuning (PEFT) techniques have been proposed to enable computationally efficient fine-tuning while maintaining model performance.
We present LoRETTA, a framework that significantly reduces trainable parameters through tensor-train decomposition.
LoRETTA achieves comparable or better performance than the most widely used PEFT methods with up to 100× fewer parameters on the LLaMA-2-7B model.
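For intuition, the sketch below reconstructs a weight update from tensor-train (TT) cores, the factorization LoRETTA builds on; the mode sizes and TT ranks are illustrative assumptions rather than the paper's settings.

```python
# Sketch: rebuilding a weight update from tensor-train (TT) cores (illustrative shapes).
import torch

modes = [16, 16, 16, 16]   # 16^4 = 65536 entries, i.e., a 256 x 256 update reshaped into 4 modes
ranks = [1, 4, 4, 4, 1]    # TT ranks, with r_0 = r_K = 1
cores = [torch.randn(ranks[k], modes[k], ranks[k + 1]) for k in range(len(modes))]

# Contract the cores left to right over the shared TT rank index.
t = cores[0]                                      # shape (1, n_1, r_1)
for core in cores[1:]:
    t = torch.einsum("...a,abc->...bc", t, core)
delta_W = t.reshape(256, 256)                     # fold the modes back into a matrix update
print(delta_W.shape)
```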
arXiv Detail & Related papers (2024-02-18T01:20:00Z) - Delving into Parameter-Efficient Fine-Tuning in Code Change Learning: An
Empirical Study [10.052053069122652]
PEFT has demonstrated superior performance and lower computational overhead in several code understanding tasks.
It harnesses the pre-trained general-purpose knowledge for downstream tasks.
It remains unclear whether PEFT outperforms full-model fine-tuning (FMFT) in task-specific adaptation for code-change-related tasks.
arXiv Detail & Related papers (2024-02-09T08:40:41Z) - From PEFT to DEFT: Parameter Efficient Finetuning for Reducing Activation Density in Transformers [52.199303258423306]
We propose a novel density loss that encourages higher activation sparsity in pre-trained models.
Our proposed method, DEFT, can consistently reduce activation density by up to 44.94% on RoBERTa-Large and by 53.19% (encoder density) and 90.60% (decoder density) on Flan-T5-XXL.
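A hedged sketch of the general mechanism: collect intermediate activations with forward hooks and add a sparsity-promoting penalty to the task loss. The L1 penalty on post-GELU activations below is an assumed stand-in for the paper's density loss, not its exact formulation.

```python
# Sketch: activation-density regularization via forward hooks (assumed stand-in penalty).
import torch
import torch.nn as nn

acts = []

def _hook(_module, _inputs, output):
    acts.append(output)

def attach_hooks(model):
    handles = []
    for m in model.modules():
        if isinstance(m, nn.GELU):  # penalize activations after the MLP nonlinearity (assumption)
            handles.append(m.register_forward_hook(_hook))
    return handles

def total_loss(task_loss, lam=1e-4):
    density_penalty = sum(a.abs().mean() for a in acts)  # L1 stand-in for the density loss
    acts.clear()
    return task_loss + lam * density_penalty
```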
arXiv Detail & Related papers (2024-02-02T21:25:46Z) - Parameter and Computation Efficient Transfer Learning for
Vision-Language Pre-trained Models [79.34513906324727]
In this paper, we aim at parameter- and computation-efficient transfer learning (PCETL) for vision-language pre-trained models.
We propose a novel dynamic architecture skipping (DAS) approach towards effective PCETL.
arXiv Detail & Related papers (2023-09-04T09:34:33Z) - Layer-wise Feedback Propagation [53.00944147633484]
We present Layer-wise Feedback Propagation (LFP), a novel training approach for neural-network-like predictors.
LFP assigns rewards to individual connections based on their respective contributions to solving a given task.
We demonstrate its effectiveness in achieving comparable performance to gradient descent on various models and datasets.
arXiv Detail & Related papers (2023-08-23T10:48:28Z) - Strong Baselines for Parameter Efficient Few-Shot Fine-tuning [50.83426196335385]
Few-shot classification (FSC) entails learning novel classes given only a few examples per class after a pre-training (or meta-training) phase.
Recent works have shown that simply fine-tuning a pre-trained Vision Transformer (ViT) on new test classes is a strong approach for FSC.
Fine-tuning ViTs, however, is expensive in time, compute and storage.
This has motivated the design of parameter efficient fine-tuning (PEFT) methods which fine-tune only a fraction of the Transformer's parameters.
arXiv Detail & Related papers (2023-04-04T16:14:39Z) - AutoPEFT: Automatic Configuration Search for Parameter-Efficient
Fine-Tuning [77.61565726647784]
Motivated by advances in neural architecture search, we propose AutoPEFT for automatic PEFT configuration selection.
We show that AutoPEFT-discovered configurations significantly outperform existing PEFT methods and are on par or better than FFT without incurring substantial training efficiency costs.
arXiv Detail & Related papers (2023-01-28T08:51:23Z)
This list is automatically generated from the titles and abstracts of the papers in this site.