A Unified Continual Learning Framework with General Parameter-Efficient
Tuning
- URL: http://arxiv.org/abs/2303.10070v2
- Date: Sat, 19 Aug 2023 14:51:01 GMT
- Title: A Unified Continual Learning Framework with General Parameter-Efficient
Tuning
- Authors: Qiankun Gao, Chen Zhao, Yifan Sun, Teng Xi, Gang Zhang, Bernard
Ghanem, Jian Zhang
- Abstract summary: The "pre-training $\rightarrow$ downstream adaptation" paradigm presents both new opportunities and challenges for Continual Learning (CL).
We position prompting as one instantiation of PET and propose a unified CL framework, dubbed Learning-Accumulation-Ensemble (LAE).
PET methods, e.g., Adapter, LoRA, or Prefix, can adapt a pre-trained model to downstream tasks with fewer parameters and resources.
- Score: 56.250772378174446
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The "pre-training $\rightarrow$ downstream adaptation" paradigm presents
both new opportunities and challenges for Continual Learning (CL). Although the
recent state-of-the-art in CL is achieved through the Parameter-Efficient-Tuning
(PET) adaptation paradigm, only prompting has been explored, limiting its
application to Transformers only. In this paper, we position prompting as one
instantiation of PET, and propose a unified CL framework with general PET, dubbed
Learning-Accumulation-Ensemble (LAE). PET methods, e.g., Adapter, LoRA, or Prefix,
can adapt a pre-trained model to downstream tasks with fewer parameters and
resources. Given a PET method, our LAE framework incorporates it for CL with three
novel designs. 1) Learning: the pre-trained model adapts to the new task by tuning
an online PET module, along with our adaptation speed calibration to align
different PET modules; 2) Accumulation: the task-specific knowledge learned by the
online PET module is accumulated into an offline PET module through momentum
update; 3) Ensemble: during inference, we respectively construct two experts with
the online/offline PET modules (which are favored by the novel/historical tasks)
for prediction ensemble. We show that LAE is compatible with a battery of PET
methods and gains strong CL capability. For example, LAE with Adapter PET
surpasses the prior state-of-the-art by 1.3% and 3.6% in last-incremental accuracy
on the CIFAR100 and ImageNet-R datasets, respectively. Code is available at
\url{https://github.com/gqk/LAE}.
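To make the three designs more concrete, the following is a minimal PyTorch sketch (not the official implementation from the repository above) of a LoRA-style PET module, the momentum-based Accumulation step, and the online/offline Ensemble at inference. The backbone(x, pet=...) interface and the probability-averaging ensemble rule are illustrative assumptions of this sketch.

    import torch
    import torch.nn as nn

    class LoRAPET(nn.Module):
        """A LoRA-style PET module: a trainable low-rank update added to a frozen linear layer."""
        def __init__(self, dim: int, rank: int = 8):
            super().__init__()
            self.down = nn.Linear(dim, rank, bias=False)  # low-rank down-projection
            self.up = nn.Linear(rank, dim, bias=False)    # low-rank up-projection
            nn.init.zeros_(self.up.weight)                # zero-initialized: no change at the start

        def forward(self, frozen_out: torch.Tensor, x: torch.Tensor) -> torch.Tensor:
            # Output of the frozen layer plus the learned low-rank correction.
            return frozen_out + self.up(self.down(x))

    def accumulate(online_pet: nn.Module, offline_pet: nn.Module, momentum: float = 0.99) -> None:
        """Accumulation: momentum (EMA-style) update of the offline PET from the online PET."""
        with torch.no_grad():
            for p_off, p_on in zip(offline_pet.parameters(), online_pet.parameters()):
                p_off.mul_(momentum).add_(p_on, alpha=1.0 - momentum)

    def ensemble_predict(backbone, online_pet: nn.Module, offline_pet: nn.Module,
                         x: torch.Tensor) -> torch.Tensor:
        """Ensemble: combine the online expert (favors new tasks) and the offline expert (favors old tasks).
        The backbone(x, pet=...) call and the probability-averaging rule are assumptions of this sketch."""
        with torch.no_grad():
            probs_online = backbone(x, pet=online_pet).softmax(dim=-1)
            probs_offline = backbone(x, pet=offline_pet).softmax(dim=-1)
        return (probs_online + probs_offline) / 2

The adaptation speed calibration used in the Learning step (so that different PET types adapt at comparable rates) is omitted here for brevity.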
Related papers
- Selection of Prompt Engineering Techniques for Code Generation through Predicting Code Complexity [2.576214343259399]
We propose PET-Select, a PET-agnostic selection model that uses code complexity as a proxy to classify queries.
PET-Select distinguishes between simple and complex problems, allowing it to choose PETs that are best suited for each query's complexity level.
Our evaluations on the MBPP and HumanEval benchmarks show up to a 1.9% improvement in pass@1 accuracy, along with a 74.8% reduction in token usage.
arXiv Detail & Related papers (2024-09-24T19:28:55Z)
- HiDe-PET: Continual Learning via Hierarchical Decomposition of Parameter-Efficient Tuning [55.88910947643436]
We propose a unified framework for continual learning (CL) with pre-trained models (PTMs) and parameter-efficient tuning (PET).
We present Hierarchical Decomposition PET (HiDe-PET), an innovative approach that explicitly optimizes the objective by incorporating task-specific and task-shared knowledge.
Our approach demonstrates remarkably superior performance over a broad spectrum of recent strong baselines.
arXiv Detail & Related papers (2024-07-07T01:50:25Z)
- SAPT: A Shared Attention Framework for Parameter-Efficient Continual Learning of Large Language Models [71.78800549517298]
Continual learning (CL) ability is vital for deploying large language models (LLMs) in the dynamic world.
Existing methods devise a learning module that acquires task-specific knowledge with a parameter-efficient tuning (PET) block and a selection module that picks out the corresponding block for the testing input.
We propose a novel Shared Attention Framework (SAPT) to align the PET learning and selection via the Shared Attentive Learning & Selection module.
arXiv Detail & Related papers (2024-01-16T11:45:03Z)
- ConPET: Continual Parameter-Efficient Tuning for Large Language Models [65.48107393731861]
Continual learning requires continual adaptation of models to newly emerging tasks.
We propose Continual Parameter-Efficient Tuning (ConPET), a generalizable paradigm for continual task adaptation of large language models.
arXiv Detail & Related papers (2023-09-26T08:52:04Z)
- VL-PET: Vision-and-Language Parameter-Efficient Tuning via Granularity Control [44.73827206809393]
In vision-and-language (VL), parameter-efficient tuning (PET) techniques are proposed to integrate modular modifications into encoder-decoder PLMs.
We propose a Vision-and-Language Parameter-Efficient Tuning (VL-PET) framework to impose effective control over modular modifications.
arXiv Detail & Related papers (2023-08-18T20:18:30Z)
- Exploring the Impact of Model Scaling on Parameter-Efficient Tuning [100.61202305296275]
Parameter-efficient tuning (PET) methods can effectively drive extremely large pre-trained language models (PLMs).
In small PLMs, there are usually noticeable performance differences among PET methods.
We introduce a more flexible PET method called Arbitrary PET (APET).
arXiv Detail & Related papers (2023-06-04T10:10:54Z)