Mitigating Interference in the Knowledge Continuum through Attention-Guided Incremental Learning
- URL: http://arxiv.org/abs/2405.13978v1
- Date: Wed, 22 May 2024 20:29:15 GMT
- Title: Mitigating Interference in the Knowledge Continuum through Attention-Guided Incremental Learning
- Authors: Prashant Bhat, Bharath Renjith, Elahe Arani, Bahram Zonooz
- Abstract summary: 'Attention-Guided Incremental Learning' (AGILE) is a rehearsal-based CL approach that incorporates compact task attention to effectively reduce interference between tasks.
AGILE significantly improves generalization performance by mitigating task interference and outperforms rehearsal-based approaches in several CL scenarios.
- Score: 17.236861687708096
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Continual learning (CL) remains a significant challenge for deep neural networks, as they are prone to forgetting previously acquired knowledge. Several approaches have been proposed in the literature, such as experience rehearsal, regularization, and parameter isolation, to address this problem. Although almost zero forgetting can be achieved in task-incremental learning, class-incremental learning remains highly challenging due to the problem of inter-task class separation. Limited access to previous task data makes it difficult to discriminate between classes of current and previous tasks. To address this issue, we propose 'Attention-Guided Incremental Learning' (AGILE), a novel rehearsal-based CL approach that incorporates compact task attention to effectively reduce interference between tasks. AGILE utilizes lightweight, learnable task projection vectors to transform the latent representations of a shared task attention module toward the task distribution. Through extensive empirical evaluation, we show that AGILE significantly improves generalization performance by mitigating task interference and outperforms rehearsal-based approaches in several CL scenarios. Furthermore, AGILE can scale well to a large number of tasks with minimal overhead while remaining well-calibrated with reduced task-recency bias.
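The abstract describes task projection vectors that steer the latent representation of a shared attention module. The sketch below is one plausible reading of that idea, not the authors' implementation: the class name `SharedTaskAttention`, the bottleneck encoder/decoder layout, the elementwise multiplication by the task vector, and the sigmoid mask are all assumptions made for illustration. It only shows the forward pass; training and rehearsal are omitted.

```python
import math
import random


def matvec(matrix, vector):
    """Multiply a row-major matrix by a vector."""
    return [sum(m * x for m, x in zip(row, vector)) for row in matrix]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class SharedTaskAttention:
    """Hypothetical shared attention module with one small vector per task."""

    def __init__(self, feat_dim, latent_dim, num_tasks, seed=0):
        rng = random.Random(seed)

        def rand_matrix(rows, cols):
            return [[rng.uniform(-0.5, 0.5) for _ in range(cols)]
                    for _ in range(rows)]

        # Shared across all tasks (assumed bottleneck: feat_dim -> latent_dim -> feat_dim).
        self.encoder = rand_matrix(latent_dim, feat_dim)
        self.decoder = rand_matrix(feat_dim, latent_dim)
        # Per-task cost is a single latent_dim-sized vector (assumed form).
        self.task_vectors = [
            [rng.uniform(0.5, 1.5) for _ in range(latent_dim)]
            for _ in range(num_tasks)
        ]

    def attend(self, features, task_id):
        # 1) Shared encoder maps features to a compact latent code.
        latent = matvec(self.encoder, features)
        # 2) The task vector rescales the latent code toward that task.
        steered = [h * p for h, p in zip(latent, self.task_vectors[task_id])]
        # 3) Shared decoder turns the steered code into a (0, 1) mask.
        mask = [sigmoid(a) for a in matvec(self.decoder, steered)]
        # 4) The mask gates the original features.
        return [m * f for m, f in zip(mask, features)]


module = SharedTaskAttention(feat_dim=4, latent_dim=2, num_tasks=3)
features = [1.0, -2.0, 0.5, 3.0]
task0_view = module.attend(features, task_id=0)
task1_view = module.attend(features, task_id=1)
```

Under these assumptions, adding a task costs only `latent_dim` extra parameters, which is one way to read the abstract's claim of scaling to many tasks with minimal overhead.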
Related papers
- Overcoming Domain Drift in Online Continual Learning [24.86094018430407]
Online Continual Learning (OCL) empowers machine learning models to acquire new knowledge online across a sequence of tasks.
OCL faces a significant challenge: catastrophic forgetting, wherein the model learned in previous tasks is substantially overwritten upon encountering new tasks.
We propose a novel rehearsal strategy, Drift-Reducing Rehearsal (DRR), to anchor the domain of old tasks and reduce the negative transfer effects.
arXiv Detail & Related papers (2024-05-15T06:57:18Z)
- Data-CUBE: Data Curriculum for Instruction-based Sentence Representation Learning [85.66907881270785]
We propose a data curriculum method, namely Data-CUBE, that arranges the orders of all the multi-task data for training.
At the task level, we aim to find the optimal task order to minimize the total cross-task interference risk.
At the instance level, we measure the difficulty of all instances per task, then divide them into easy-to-difficult mini-batches for training.
arXiv Detail & Related papers (2024-01-07T18:12:20Z)
- Efficient Rehearsal Free Zero Forgetting Continual Learning using Adaptive Weight Modulation [3.6683171094134805]
Continual learning involves acquiring knowledge of multiple tasks over an extended period.
Most approaches to this problem seek a balance between maximizing performance on the new tasks and minimizing the forgetting of previous tasks.
Our approach attempts to maximize the performance of the new task, while ensuring zero forgetting.
arXiv Detail & Related papers (2023-11-26T12:36:05Z)
- Active Continual Learning: On Balancing Knowledge Retention and Learnability [43.6658577908349]
Acquiring new knowledge without forgetting what has been learned in a sequence of tasks is the central focus of continual learning (CL).
This paper considers the under-explored problem of active continual learning (ACL) for a sequence of active learning (AL) tasks.
We investigate the effectiveness and interplay between several AL and CL algorithms in the domain, class and task-incremental scenarios.
arXiv Detail & Related papers (2023-05-06T04:11:03Z)
- Dense Network Expansion for Class Incremental Learning [61.00081795200547]
State-of-the-art approaches use a dynamic architecture based on network expansion (NE), in which a task expert is added per task.
A new NE method, dense network expansion (DNE), is proposed to achieve a better trade-off between accuracy and model complexity.
It outperforms the previous SOTA methods by a margin of 4% in terms of accuracy, with similar or even smaller model scale.
arXiv Detail & Related papers (2023-03-22T16:42:26Z)
- ForkMerge: Mitigating Negative Transfer in Auxiliary-Task Learning [59.08197876733052]
Auxiliary-Task Learning (ATL) aims to improve the performance of the target task by leveraging the knowledge obtained from related tasks.
Sometimes, learning multiple tasks simultaneously results in lower accuracy than learning only the target task, known as negative transfer.
ForkMerge is a novel approach that periodically forks the model into multiple branches and automatically searches the varying task weights.
arXiv Detail & Related papers (2023-01-30T02:27:02Z)
- Task Agnostic Representation Consolidation: a Self-supervised based Continual Learning Approach [14.674494335647841]
We propose a two-stage training paradigm for CL that intertwines task-agnostic and task-specific learning.
We show that our training paradigm can be easily added to memory- or regularization-based approaches.
arXiv Detail & Related papers (2022-07-13T15:16:51Z)
- Variational Multi-Task Learning with Gumbel-Softmax Priors [105.22406384964144]
Multi-task learning aims to explore task relatedness to improve individual tasks.
We propose variational multi-task learning (VMTL), a general probabilistic inference framework for learning multiple related tasks.
arXiv Detail & Related papers (2021-11-09T18:49:45Z)
- Reparameterizing Convolutions for Incremental Multi-Task Learning without Task Interference [75.95287293847697]
Two common challenges in developing multi-task models are often overlooked in the literature.
The first is enabling the model to be inherently incremental, continuously incorporating information from new tasks without forgetting the previously learned ones (incremental learning).
The second is eliminating adverse interactions amongst tasks, which have been shown to significantly degrade single-task performance in a multi-task setup (task interference).
arXiv Detail & Related papers (2020-07-24T14:44:46Z)
- Gradient Surgery for Multi-Task Learning [119.675492088251]
Multi-task learning has emerged as a promising approach for sharing structure across multiple tasks.
The reasons why multi-task learning is so challenging compared to single-task learning are not fully understood.
We propose a form of gradient surgery that projects a task's gradient onto the normal plane of the gradient of any other task that has a conflicting gradient.
arXiv Detail & Related papers (2020-01-19T06:33:47Z)
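The gradient-surgery entry above describes projecting a task's gradient onto the normal plane of a conflicting gradient. A minimal sketch of that projection follows, using plain Python lists as gradients. Two gradients are treated as conflicting when their dot product is negative, and the projection is g_i - (g_i . g_j / ||g_j||^2) g_j. The function names, the shuffled order of the other tasks, and the final summation are assumptions about details the summary does not state.

```python
import random


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def project_conflicting(g_i, g_j):
    """Remove from g_i its component along g_j, but only if they conflict."""
    inner = dot(g_i, g_j)
    norm_sq = dot(g_j, g_j)
    if inner >= 0 or norm_sq == 0:
        return list(g_i)  # no conflict (or zero gradient): leave unchanged
    scale = inner / norm_sq
    return [x - scale * y for x, y in zip(g_i, g_j)]


def gradient_surgery(task_grads, seed=0):
    """Project each task gradient against the others, then sum the results."""
    rng = random.Random(seed)
    projected = []
    for i, grad in enumerate(task_grads):
        adjusted = list(grad)
        others = [j for j in range(len(task_grads)) if j != i]
        rng.shuffle(others)  # assumed: other tasks visited in random order
        for j in others:
            # Always project against the ORIGINAL gradient of task j.
            adjusted = project_conflicting(adjusted, task_grads[j])
        projected.append(adjusted)
    return [sum(column) for column in zip(*projected)]


g1 = [1.0, 0.0]
g2 = [-1.0, 1.0]  # dot(g1, g2) = -1, so the two gradients conflict
g1_fixed = project_conflicting(g1, g2)
combined = gradient_surgery([g1, g2])
```

After the projection, `g1_fixed` is orthogonal to `g2`, so following it no longer directly opposes the second task's update direction.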