I2I: Initializing Adapters with Improvised Knowledge
- URL: http://arxiv.org/abs/2304.02168v2
- Date: Mon, 10 Jul 2023 20:41:34 GMT
- Title: I2I: Initializing Adapters with Improvised Knowledge
- Authors: Tejas Srinivasan, Furong Jia, Mohammad Rostami, Jesse Thomason
- Abstract summary: Improvise to Initialize (I2I) is a continual learning algorithm that initializes Adapters
for incoming tasks by distilling knowledge from previously-learned tasks' Adapters.
I2I consistently achieves better task accuracy than independently-trained Adapters.
- Score: 15.452979531094567
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Adapters present a promising solution to the catastrophic forgetting problem
in continual learning. However, training independent Adapter modules for every
new task misses an opportunity for cross-task knowledge transfer. We propose
Improvise to Initialize (I2I), a continual learning algorithm that initializes
Adapters for incoming tasks by distilling knowledge from previously-learned
tasks' Adapters. We evaluate I2I on CLiMB, a multimodal continual learning
benchmark, by conducting experiments on sequences of visual question answering
tasks. Adapters trained with I2I consistently achieve better task accuracy than
independently-trained Adapters, demonstrating that our algorithm facilitates
knowledge transfer between task Adapters. I2I also results in better cross-task
knowledge transfer than the state-of-the-art AdapterFusion without incurring
the associated parametric cost.
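The abstract describes I2I only at a high level: before training on a new task, its Adapter is warm-started by distilling from the Adapters of previously learned tasks. Below is a minimal, hedged sketch of that idea for a generic bottleneck Adapter in PyTorch; the module and function names, the averaging of teacher outputs, and the MSE distillation loss are illustrative assumptions, not details taken from the paper or its released code.

```python
# Hedged sketch of distillation-based Adapter initialization in the spirit of I2I.
# All names (BottleneckAdapter, improvise_to_initialize) are illustrative assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F


class BottleneckAdapter(nn.Module):
    """Standard down-project / nonlinearity / up-project adapter with a residual connection."""

    def __init__(self, hidden_dim: int, bottleneck_dim: int = 64):
        super().__init__()
        self.down = nn.Linear(hidden_dim, bottleneck_dim)
        self.up = nn.Linear(bottleneck_dim, hidden_dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x + self.up(F.relu(self.down(x)))


def improvise_to_initialize(prev_adapters, new_adapter, hidden_state_batches,
                            steps: int = 100, lr: float = 1e-3):
    """Warm-start `new_adapter` by distilling the outputs of previously learned
    adapters on hidden states from the incoming task's data."""
    opt = torch.optim.Adam(new_adapter.parameters(), lr=lr)
    for _, hidden in zip(range(steps), hidden_state_batches):
        with torch.no_grad():
            # Teacher signal: average the frozen previous adapters' outputs
            # (averaging is one possible choice, assumed here for illustration).
            teacher = torch.stack([a(hidden) for a in prev_adapters]).mean(dim=0)
        student = new_adapter(hidden)
        loss = F.mse_loss(student, teacher)  # feature-level distillation loss
        opt.zero_grad()
        loss.backward()
        opt.step()
    return new_adapter
```

After this warm start, the new task's Adapter would then be trained on that task's labeled data as usual.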
Related papers
- ATLAS: Adapter-Based Multi-Modal Continual Learning with a Two-Stage Learning Strategy [12.150065431702055]
We propose a multi-modal continual learning scheme that consists of experience-based learning and novel knowledge expansion.
The method is well suited to continual learning: it expands the upstream representation distribution while also minimizing the negative impact of forgetting previous tasks.
arXiv Detail & Related papers (2024-10-14T13:29:42Z) - Auto-selected Knowledge Adapters for Lifelong Person Re-identification [54.42307214981537]
Lifelong Person Re-Identification requires systems to continually learn from non-overlapping datasets across different times and locations.
Existing approaches, either rehearsal-free or rehearsal-based, still suffer from the problem of catastrophic forgetting.
We introduce AdalReID, a novel framework that adopts knowledge adapters and a parameter-free auto-selection mechanism for lifelong learning.
arXiv Detail & Related papers (2024-05-29T11:42:02Z) - Adaptive Retention & Correction for Continual Learning [114.5656325514408]
A common problem in continual learning is the classification layer's bias towards the most recent task.
We name our approach Adaptive Retention & Correction (ARC).
ARC achieves average performance increases of 2.7% and 2.6% on the CIFAR-100 and Imagenet-R datasets, respectively.
arXiv Detail & Related papers (2024-05-23T08:43:09Z) - Dynamic Transformer Architecture for Continual Learning of Multimodal
Tasks [27.59758964060561]
Transformer neural networks are increasingly replacing prior architectures in a wide range of applications in different data modalities.
Continual learning (CL) emerges as a solution by facilitating the transfer of knowledge across tasks that arrive sequentially for an autonomously learning agent.
We propose a transformer-based CL framework focusing on learning tasks that involve both vision and language.
arXiv Detail & Related papers (2024-01-27T03:03:30Z) - AdapterDistillation: Non-Destructive Task Composition with Knowledge
Distillation [12.648208238878468]
We propose a two-stage knowledge distillation algorithm called AdapterDistillation.
In the first stage, we extract task-specific knowledge by using local data to train a student adapter.
In the second stage, we distill the knowledge from the existing teacher adapters into the student adapter to help its inference (see the sketch after this list).
arXiv Detail & Related papers (2023-12-26T07:01:00Z) - MerA: Merging Pretrained Adapters For Few-Shot Learning [71.44422347502409]
We propose Merging Pretrained Adapters (MerA), which efficiently incorporates pretrained adapters into a single model through model fusion.
Experiments on two PLMs demonstrate that MerA achieves substantial improvements compared to both single adapters and AdapterFusion.
arXiv Detail & Related papers (2023-08-30T12:10:17Z) - E2-AEN: End-to-End Incremental Learning with Adaptively Expandable
Network [57.87240860624937]
We propose an end-to-end trainable adaptively expandable network named E2-AEN.
It dynamically generates lightweight structures for new tasks without any accuracy drop in previous tasks.
E2-AEN reduces cost and can be built upon any feed-forward architecture in an end-to-end manner.
arXiv Detail & Related papers (2022-07-14T09:04:51Z) - Fully Online Meta-Learning Without Task Boundaries [80.09124768759564]
We study how meta-learning can be applied to tackle online problems of this nature.
We propose a Fully Online Meta-Learning (FOML) algorithm, which does not require any ground truth knowledge about the task boundaries.
Our experiments show that FOML learns new tasks faster than state-of-the-art online learning methods.
arXiv Detail & Related papers (2022-02-01T07:51:24Z) - AdapterFusion: Non-Destructive Task Composition for Transfer Learning [104.9639614787314]
Sequential fine-tuning and multi-task learning are methods aiming to incorporate knowledge from multiple tasks.
We propose AdapterFusion, a new two-stage learning algorithm that leverages knowledge from multiple tasks.
We show that our approach outperforms traditional strategies such as full fine-tuning as well as multi-task learning.
arXiv Detail & Related papers (2020-05-01T07:03:42Z)
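The AdapterDistillation entry above outlines its two stages only briefly. The following is a hypothetical sketch of such a two-stage procedure: the function name, the mean over teacher features, the MSE distillation term, and the weighting `alpha` are assumptions made for illustration and are not taken from that paper.

```python
# Hedged sketch of a two-stage adapter training procedure in the spirit of
# AdapterDistillation: stage 1 trains a student adapter on local task data,
# stage 2 adds a distillation term toward frozen teacher adapters.
import torch
import torch.nn.functional as F


def train_adapter_distillation(student, teachers, loader, task_head,
                               alpha: float = 0.5, stage1_epochs: int = 1,
                               stage2_epochs: int = 1, lr: float = 1e-3):
    opt = torch.optim.Adam(list(student.parameters()) + list(task_head.parameters()), lr=lr)

    # Stage 1: learn task-specific knowledge from local labeled data only.
    for _ in range(stage1_epochs):
        for hidden, labels in loader:
            logits = task_head(student(hidden))
            loss = F.cross_entropy(logits, labels)
            opt.zero_grad()
            loss.backward()
            opt.step()

    # Stage 2: keep the task loss, but also pull the student's features toward
    # the frozen teacher adapters' features on the same inputs.
    for _ in range(stage2_epochs):
        for hidden, labels in loader:
            student_feat = student(hidden)
            with torch.no_grad():
                teacher_feat = torch.stack([t(hidden) for t in teachers]).mean(dim=0)
            task_loss = F.cross_entropy(task_head(student_feat), labels)
            distill_loss = F.mse_loss(student_feat, teacher_feat)
            loss = task_loss + alpha * distill_loss
            opt.zero_grad()
            loss.backward()
            opt.step()
    return student, task_head
```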