Related papers: OpenDelta: A Plug-and-play Library for Parameter-efficient Adaptation of Pre-trained Models

URL: http://arxiv.org/abs/2307.03084v1
Date: Wed, 5 Jul 2023 16:30:14 GMT
Title: OpenDelta: A Plug-and-play Library for Parameter-efficient Adaptation of Pre-trained Models
Authors: Shengding Hu, Ning Ding, Weilin Zhao, Xingtai Lv, Zhen Zhang, Zhiyuan Liu, Maosong Sun
Abstract summary: We present OpenDelta, an open-source library that overcomes limitations by providing a plug-and-play implementation of various delta tuning methods. Our novel techniques eliminate the need to modify the backbone PTMs' code, making OpenDelta compatible with different, even novel PTMs.
Score: 81.7855202178564
License: http://creativecommons.org/licenses/by/4.0/
Abstract: The scale of large pre-trained models (PTMs) poses significant challenges in adapting to downstream tasks due to the high optimization overhead and storage costs associated with full-parameter fine-tuning. To address this, many studies explore parameter-efficient tuning methods, also framed as "delta tuning", which updates only a small subset of parameters, known as "delta modules", while keeping the backbone model's parameters fixed. However, the practicality and flexibility of delta tuning have been limited due to existing implementations that directly modify the code of the backbone PTMs and hard-code specific delta tuning methods for each PTM. In this paper, we present OpenDelta, an open-source library that overcomes these limitations by providing a plug-and-play implementation of various delta tuning methods. Our novel techniques eliminate the need to modify the backbone PTMs' code, making OpenDelta compatible with different, even novel PTMs. OpenDelta is designed to be simple, modular, and extensible, providing a comprehensive platform for researchers and practitioners to adapt large PTMs efficiently.
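
The abstract's central idea, attaching a small trainable "delta module" to a frozen backbone without editing the backbone's source code, can be sketched in plain Python. This is an illustrative toy under stated assumptions and not OpenDelta's API: the names `FrozenLinear`, `LowRankDelta`, and `attach` are hypothetical, and the low-rank form of the delta is one common choice rather than the library's method.

```python
import random


class FrozenLinear:
    """Stand-in for one backbone layer: y = W x. W is never updated."""

    def __init__(self, weight):
        self.weight = weight  # list of rows

    def forward(self, x):
        return [sum(w * xi for w, xi in zip(row, x)) for row in self.weight]


class LowRankDelta:
    """Hypothetical delta module: adds B(Ax) to the wrapped layer's output."""

    def __init__(self, in_dim, out_dim, rank, seed=0):
        rng = random.Random(seed)
        # A starts small and random; B starts at zero, so attaching the
        # delta does not change the layer's output before any training.
        self.A = [[rng.gauss(0, 0.01) for _ in range(in_dim)] for _ in range(rank)]
        self.B = [[0.0] * rank for _ in range(out_dim)]

    def num_params(self):
        return sum(len(r) for r in self.A) + sum(len(r) for r in self.B)

    def attach(self, layer):
        # Wrap the instance's forward at runtime; the FrozenLinear class
        # itself is left unmodified.
        original_forward = layer.forward

        def forward_with_delta(x):
            base = original_forward(x)
            hidden = [sum(a * xi for a, xi in zip(row, x)) for row in self.A]
            update = [sum(b * h for b, h in zip(row, hidden)) for row in self.B]
            return [y + u for y, u in zip(base, update)]

        layer.forward = forward_with_delta
        return layer


layer = FrozenLinear([[1.0, 2.0], [3.0, 4.0]])
before = layer.forward([1.0, 1.0])

delta = LowRankDelta(in_dim=2, out_dim=2, rank=1)
delta.attach(layer)
after = layer.forward([1.0, 1.0])  # identical to `before`, since B is zero

# Mimic a training step that touches only the delta's parameters.
delta.B[0][0] = 1.0
changed = layer.forward([1.0, 1.0])
```

Only `delta.A` and `delta.B` would be passed to an optimizer in this sketch; `layer.weight` stays fixed throughout, which is the property the abstract describes.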

Related papers

Dynamic Base model Shift for Delta Compression [53.505380509713575]
Delta compression attempts to lower the costs by reducing the redundancy of delta parameters. Existing methods by default employ the pretrained model as the base model and compress the delta parameters for every task. We propose Dynamic Base Model Shift (DBMS), which dynamically adapts the base model to the target task before performing delta compression.
arXiv Detail & Related papers (2025-05-16T15:11:19Z)
Seeing Delta Parameters as JPEG Images: Data-Free Delta Compression with Discrete Cosine Transform [51.29604910007176]
We introduce Delta-DCT, the first data-free delta compression method inspired by classic JPEG image compression, leveraging the Discrete Cosine Transform (DCT). The proposed Delta-DCT does not require any training or data calibration, while achieving performance comparable to or even surpassing original finetuned models under 1-bit equivalent delta compression ratios on different kinds of models, including: (1) recently released LLMs of different sizes from 7B to 13B, (2) relatively smaller language models including RoBERTa and T5 models, (3) variants of vision transformer models, and (4) multi-modal BEiT-3 models.
arXiv Detail & Related papers (2025-03-09T16:03:48Z)
FineGates: LLMs Finetuning with Compression using Stochastic Gates [7.093692674858257]
Large Language Models (LLMs) present significant challenges for full finetuning due to the high computational demands. Lightweight finetuning techniques have been proposed, like learning low-rank adapter layers. We propose an adaptor model based on gates that simultaneously sparsify the frozen base model with task-specific adaptation.
arXiv Detail & Related papers (2024-12-17T14:33:05Z)
Dynamic Subset Tuning: Expanding the Operational Range of Parameter-Efficient Training for Large Language Models [14.762222323897978]
We propose a novel parameter-efficient training (PET) method for large language models that optimizes a small subset of the model parameters. Unlike prior methods, this subset is not fixed in location; rather, which parameters are modified changes over the course of training. Our method enables a seamless scaling of the subset size across an arbitrary proportion of the total model size.
arXiv Detail & Related papers (2024-11-13T13:53:10Z)
DARE the Extreme: Revisiting Delta-Parameter Pruning For Fine-Tuned Models [39.411072236355515]
We introduce DAREx-q, a rescaling factor modification that significantly boosts performance at high pruning rates. We demonstrate that DAREx-q can be seamlessly combined with vanilla parameter-efficient fine-tuning techniques like LoRA. We revisit the application of importance-based pruning techniques within DPP, demonstrating that they outperform random-based methods when delta parameters are large.
arXiv Detail & Related papers (2024-10-12T03:21:58Z)
LoRTA: Low Rank Tensor Adaptation of Large Language Models [70.32218116940393]
Low Rank Adaptation (LoRA) is a popular Parameter-Efficient Fine-Tuning (PEFT) method that effectively adapts large pre-trained models for downstream tasks. We propose a novel approach that employs a low rank tensor parametrization for model updates. Our method is both efficient and effective for fine-tuning large language models, achieving a substantial reduction in the number of parameters while maintaining comparable performance.
arXiv Detail & Related papers (2024-10-05T06:59:50Z)
Reference Trustable Decoding: A Training-Free Augmentation Paradigm for Large Language Models [79.41139393080736]
Large language models (LLMs) have rapidly advanced and demonstrated impressive capabilities. In-Context Learning (ICL) and Parameter-Efficient Fine-Tuning (PEFT) are currently two mainstream methods for adapting LLMs to downstream tasks. We propose Reference Trustable Decoding (RTD), a paradigm that allows models to quickly adapt to new tasks without fine-tuning.
arXiv Detail & Related papers (2024-09-30T10:48:20Z)
BitDelta: Your Fine-Tune May Only Be Worth One Bit [57.558376557639555]
Large Language Models (LLMs) are typically trained in two phases: pre-training on large internet-scale datasets, and fine-tuning for downstream tasks. We introduce a simple method, BitDelta, which successfully quantizes this delta down to 1 bit without compromising performance. By enabling the use of a single high-precision base model accompanied by multiple 1-bit deltas, BitDelta dramatically reduces GPU memory requirements by more than 10x.
arXiv Detail & Related papers (2024-02-15T18:50:06Z)
Context-PEFT: Efficient Multi-Modal, Multi-Task Fine-Tuning [12.648711621637663]
This paper introduces a novel Parameter-Efficient Fine-Tuning (PEFT) framework for multi-modal, multi-task transfer learning with pre-trained language models. We propose Context-PEFT, which learns different groups of adaptor parameters based on the token's domain. Our method is evaluated on the COCO captioning task, where it outperforms full fine-tuning under similar data constraints.
arXiv Detail & Related papers (2023-12-14T13:00:24Z)
Parameter Efficient Fine-tuning via Cross Block Orchestration for Segment Anything Model [81.55141188169621]
We equip PEFT with a cross-block orchestration mechanism to enable the adaptation of the Segment Anything Model (SAM) to various downstream scenarios. We propose an intra-block enhancement module, which introduces a linear projection head whose weights are generated from a hyper-complex layer. Our proposed approach consistently improves the segmentation performance significantly on novel scenarios with only around 1K additional parameters.
arXiv Detail & Related papers (2023-11-28T11:23:34Z)
Rethinking Efficient Tuning Methods from a Unified Perspective [34.67645496324432]
We revisit the design paradigm of PETL and derive a unified framework U-Tuning for parameter-efficient transfer learning. The U-Tuning framework can simultaneously encompass existing methods and derive new approaches for parameter-efficient transfer learning.
arXiv Detail & Related papers (2023-03-01T17:38:03Z)
Delta Tuning: A Comprehensive Study of Parameter Efficient Methods for Pre-trained Language Models [90.24999406296867]
In contrast with standard fine-tuning, delta tuning only fine-tunes a small portion of the model parameters while keeping the rest untouched. Recent studies have demonstrated that a series of delta tuning methods with distinct tuned parameter selection could achieve performance on a par with full-parameter fine-tuning.
arXiv Detail & Related papers (2022-03-14T07:56:32Z)

This list is automatically generated from the titles and abstracts of the papers on this site.