Efficient Expansion and Gradient Based Task Inference for Replay Free
Incremental Learning
- URL: http://arxiv.org/abs/2312.01188v1
- Date: Sat, 2 Dec 2023 17:28:52 GMT
- Title: Efficient Expansion and Gradient Based Task Inference for Replay Free
Incremental Learning
- Authors: Soumya Roy, Vinay K Verma and Deepak Gupta
- Abstract summary: Recent expansion-based models show promising results for task incremental learning (TIL).
For class incremental learning (CIL), prediction of the task id is a crucial challenge.
We propose a robust task prediction method that leverages entropy-weighted data augmentations and the model's gradient with pseudo labels.
- Score: 5.760774528950479
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This paper proposes a simple but highly efficient expansion-based model for
continual learning. The recent feature transformation, masking and
factorization-based methods are efficient, but they grow the model only over
the global or shared parameter. Therefore, these approaches do not fully
utilize the previously learned information because the same task-specific
parameter forgets the earlier knowledge. Thus, these approaches show limited
transfer learning ability. Moreover, most of these models have constant
parameter growth for all tasks, irrespective of the task complexity. Our work
proposes a simple filter- and channel-expansion-based method that grows the
model over the previous task parameters and not just over the global parameter.
Therefore, it fully utilizes all the previously learned information without
forgetting, which results in better knowledge transfer. The growth rate in our
proposed model is a function of task complexity; therefore, a simple task incurs
smaller parameter growth, while a complex task requires more parameters to adapt
to the current task. Recent expansion-based models show promising results for
task incremental learning (TIL). However, for class incremental learning (CIL),
prediction of the task id is a crucial challenge; hence, their results degrade
rapidly as the number of tasks increases. In this work, we propose a robust task
prediction method that leverages entropy-weighted data augmentations and the
model's gradient with pseudo labels. We
evaluate our model on various datasets and architectures in the TIL, CIL and
generative continual learning settings. The proposed approach shows
state-of-the-art results in all these settings. Our extensive ablation studies
show the efficacy of the proposed components.
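To make the expansion idea concrete, below is a minimal, hypothetical PyTorch sketch (not the authors' code) of a convolutional layer that grows by appending new output filters for each task while freezing everything learned before, so the current task computes its features on top of, and reuses, all earlier task parameters. The class name `ExpandableConv2d` and the per-task filter count are illustrative assumptions; the paper's actual filter and channel expansion and complexity-driven growth rule are more involved.

```python
# Hypothetical sketch (not the authors' code): a conv layer that appends new
# output filters per task and freezes previously learned filters, so later
# tasks build on top of earlier task parameters without overwriting them.
import torch
import torch.nn as nn


class ExpandableConv2d(nn.Module):
    """Conv layer whose filter bank grows per task; old filters stay frozen."""

    def __init__(self, in_channels, base_out_channels, kernel_size=3, padding=1):
        super().__init__()
        self.in_channels = in_channels
        self.kernel_size = kernel_size
        self.padding = padding
        # Filters for the first task; frozen once the next task arrives.
        self.task_filters = nn.ParameterList([
            nn.Parameter(torch.randn(base_out_channels, in_channels,
                                     kernel_size, kernel_size) * 0.01)
        ])

    def expand(self, extra_filters):
        """Add `extra_filters` new output filters for the incoming task.

        The number of new filters can be chosen per task (e.g. larger for
        harder tasks), so parameter growth tracks task complexity.
        """
        for p in self.task_filters:          # freeze everything learned so far
            p.requires_grad_(False)
        self.task_filters.append(
            nn.Parameter(torch.randn(extra_filters, self.in_channels,
                                     self.kernel_size, self.kernel_size) * 0.01)
        )

    def forward(self, x):
        # Old and new filters are concatenated, so the current task's features
        # are computed with (and reuse) all previously learned filters.
        weight = torch.cat(list(self.task_filters), dim=0)
        return nn.functional.conv2d(x, weight, padding=self.padding)


if __name__ == "__main__":
    layer = ExpandableConv2d(in_channels=3, base_out_channels=8)
    layer.expand(extra_filters=4)            # a harder second task gets more filters
    out = layer(torch.randn(2, 3, 32, 32))
    print(out.shape)                          # torch.Size([2, 12, 32, 32])
```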
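The task-inference step can likewise be illustrated with a short sketch. The following is one plausible reading of the abstract, not the paper's exact procedure: each task-specific model scores a test sample by (i) averaging its predictions over several augmentations, with confident (low-entropy) views weighted more, (ii) taking the argmax of that average as a pseudo label, and (iii) measuring the gradient magnitude of the resulting loss; the task whose model needs the smallest correction is predicted. The helper names `task_score` and `predict_task`, and the exact scoring rule, are assumptions for illustration.

```python
# Hypothetical sketch of gradient-based task inference with entropy-weighted
# augmentations and pseudo labels (illustrative only; not the paper's exact rule).
import torch
import torch.nn.functional as F


def entropy(p, eps=1e-8):
    return -(p * (p + eps).log()).sum(dim=-1)


def task_score(model, image, augmentations):
    """Lower score = this task's model explains the sample better."""
    probs, weights = [], []
    with torch.no_grad():
        for aug in augmentations:
            p = F.softmax(model(aug(image)), dim=-1)
            probs.append(p)
            weights.append(torch.exp(-entropy(p)))    # confident views count more
    probs = torch.stack(probs)                         # (A, 1, C)
    weights = torch.stack(weights).unsqueeze(-1)       # (A, 1, 1)
    avg = (weights * probs).sum(0) / weights.sum(0)    # entropy-weighted average
    pseudo_label = avg.argmax(dim=-1)                  # pseudo label for the sample

    model.zero_grad()
    loss = F.cross_entropy(model(image), pseudo_label)
    loss.backward()
    grad_norm = sum(p.grad.norm() for p in model.parameters() if p.grad is not None)
    return grad_norm.item()


def predict_task(task_models, image, augmentations):
    scores = [task_score(m, image, augmentations) for m in task_models]
    return int(torch.tensor(scores).argmin())          # task id with smallest gradient
```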
Related papers
- Reducing catastrophic forgetting of incremental learning in the absence of rehearsal memory with task-specific token [0.6144680854063939]
Deep learning models display catastrophic forgetting when learning new data continuously.
We present a novel method that preserves previous knowledge without storing previous data.
This method is inspired by the architecture of a vision transformer and employs a unique token capable of encapsulating the compressed knowledge of each task.
arXiv Detail & Related papers (2024-11-06T16:13:50Z) - TaE: Task-aware Expandable Representation for Long Tail Class Incremental Learning [42.630413950957795]
We introduce a novel Task-aware Expandable (TaE) framework to learn diverse representations from each incremental task.
TaE achieves state-of-the-art performance.
arXiv Detail & Related papers (2024-02-08T16:37:04Z) - Continual Learning via Bit-Level Information Preserving [88.32450740325005]
We study the continual learning process through the lens of information theory.
We propose Bit-Level Information Preserving (BLIP) that preserves the information gain on model parameters.
BLIP achieves close to zero forgetting while only requiring constant memory overheads throughout continual learning.
arXiv Detail & Related papers (2021-05-10T15:09:01Z) - Efficient Feature Transformations for Discriminative and Generative
Continual Learning [98.10425163678082]
We propose a simple task-specific feature map transformation strategy for continual learning.
These provide powerful flexibility for learning new tasks, achieved with minimal parameters added to the base architecture.
We demonstrate the efficacy and efficiency of our method with an extensive set of experiments in discriminative (CIFAR-100 and ImageNet-1K) and generative sequences of tasks.
arXiv Detail & Related papers (2021-03-25T01:48:14Z) - Lifelong Learning Without a Task Oracle [13.331659934508764]
Supervised deep neural networks are known to undergo a sharp decline in the accuracy of older tasks when new tasks are learned.
We propose and compare several candidate task-assigning mappers which require very little memory overhead.
Best-performing variants only impose an average cost of 1.7% parameter memory increase.
arXiv Detail & Related papers (2020-11-09T21:30:31Z) - Goal-Aware Prediction: Learning to Model What Matters [105.43098326577434]
One of the fundamental challenges in using a learned forward dynamics model is the mismatch between the objective of the learned model and that of the downstream planner or policy.
We propose to direct prediction towards task relevant information, enabling the model to be aware of the current task and encouraging it to only model relevant quantities of the state space.
We find that our method more effectively models the relevant parts of the scene conditioned on the goal, and as a result outperforms standard task-agnostic dynamics models and model-free reinforcement learning.
arXiv Detail & Related papers (2020-07-14T16:42:59Z) - Task-Feature Collaborative Learning with Application to Personalized
Attribute Prediction [166.87111665908333]
We propose a novel multi-task learning method called Task-Feature Collaborative Learning (TFCL).
Specifically, we first propose a base model with a heterogeneous block-diagonal structure regularizer to leverage the collaborative grouping of features and tasks.
As a practical extension, we extend the base model by allowing overlapping features and differentiating the hard tasks.
arXiv Detail & Related papers (2020-04-29T02:32:04Z) - iTAML: An Incremental Task-Agnostic Meta-learning Approach [123.10294801296926]
Humans can continuously learn new knowledge as their experience grows.
Previous learning in deep neural networks can quickly fade out when they are trained on a new task.
We introduce a novel meta-learning approach that seeks to maintain an equilibrium between all encountered tasks.
arXiv Detail & Related papers (2020-03-25T21:42:48Z) - Lifelong Learning with Searchable Extension Units [21.17631355880764]
We propose a new lifelong learning framework named Searchable Extension Units (SEU).
It breaks down the need for a predefined original model and searches for specific extension units for different tasks.
Our approach can obtain a much more compact model without catastrophic forgetting.
arXiv Detail & Related papers (2020-03-19T03:45:51Z) - Parameter-Efficient Transfer from Sequential Behaviors for User Modeling
and Recommendation [111.44445634272235]
In this paper, we develop a parameter-efficient transfer learning architecture, termed PeterRec.
PeterRec allows the pre-trained parameters to remain unaltered during fine-tuning by injecting a series of re-learned neural networks.
We perform extensive experimental ablation to show the effectiveness of the learned user representation in five downstream tasks.
arXiv Detail & Related papers (2020-01-13T14:09:54Z)