Defeating Catastrophic Forgetting via Enhanced Orthogonal Weights
Modification
- URL: http://arxiv.org/abs/2111.10078v1
- Date: Fri, 19 Nov 2021 07:40:48 GMT
- Title: Defeating Catastrophic Forgetting via Enhanced Orthogonal Weights
Modification
- Authors: Yanni Li and Bing Liu and Kaicheng Yao and Xiaoli Kou and Pengfan Lv
and Yueshen Xu and Jiangtao Cui
- Abstract summary: We show that the weight gradient of a new learning task is determined by both the input space of the new task and the weight space of the previously learned tasks.
We propose EOWM, a new efficient and effective continual learning method based on enhanced OWM.
- Score: 8.091211518374598
- License: http://creativecommons.org/publicdomain/zero/1.0/
- Abstract: The ability of neural networks (NNs) to learn and remember multiple tasks
sequentially faces tough challenges on the road to general artificial
intelligence because of their catastrophic forgetting (CF) issue. Fortunately,
the latest OWM (Orthogonal Weights Modification) and several other continual
learning (CL) methods suggest promising ways to overcome the CF issue. However,
none of the existing CL methods explores the following crucial questions for
effectively overcoming CF: What knowledge contributes to the effective weight
modification of the NN during sequential task learning? When the data
distribution of a new learning task shifts relative to the previously learned
tasks, should a uniform or a task-specific weight modification strategy be
adopted? What is the upper bound on the number of tasks a given CL method can
learn sequentially? To address these questions, we first reveal that the
weight gradient of a new learning task is determined by both the input space
of the new task and the weight space of the previously learned tasks. Based on
this observation and the recursive least squares (RLS) optimization method, we
propose EOWM, a new efficient and effective continual learning method built on
enhanced OWM. We also theoretically derive the upper bound on the number of
tasks our EOWM can learn sequentially. Extensive experiments on standard
benchmarks demonstrate that EOWM is effective and outperforms all of the
state-of-the-art CL baselines.
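To illustrate the kind of orthogonal-projection update described above, here is a minimal NumPy sketch of the original OWM-style projector that EOWM enhances: a per-layer projection matrix is shrunk with a recursive-least-squares style update as inputs from each task arrive, and backpropagated gradients are multiplied by it so that new weight updates barely disturb directions used by earlier tasks. The class name, the `alpha` regularizer, and the toy usage are illustrative assumptions, not the paper's exact EOWM algorithm.

```python
import numpy as np

class OWMProjector:
    """Minimal sketch of an OWM-style orthogonal projector for one layer.

    P starts as the identity and is shrunk, via a recursive-least-squares
    style update, so that it projects gradients onto the subspace orthogonal
    to the inputs already seen on previous tasks. `alpha` is an illustrative
    regularization constant, not the schedule used in the paper.
    """

    def __init__(self, input_dim: int, alpha: float = 1e-3):
        self.P = np.eye(input_dim)
        self.alpha = alpha

    def update(self, x: np.ndarray) -> None:
        """Fold one (mean) input vector x of the current task into P."""
        x = x.reshape(-1, 1)                  # column vector
        Px = self.P @ x
        k = Px / (self.alpha + x.T @ Px)      # RLS-style gain
        self.P -= k @ Px.T                    # rank-1 shrinkage of P

    def project(self, grad_W: np.ndarray) -> np.ndarray:
        """Project a backprop weight gradient (out_dim x in_dim) so the
        resulting update barely disturbs directions used by earlier tasks."""
        return grad_W @ self.P.T


# Toy usage: register inputs from task 1, then orthogonalize a task-2 gradient.
rng = np.random.default_rng(0)
proj = OWMProjector(input_dim=8)

for _ in range(50):                # inputs from task 1
    proj.update(rng.normal(size=8))

grad = rng.normal(size=(4, 8))     # raw backprop gradient from task 2
safe_grad = proj.project(grad)     # projected update applied for task 2
```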
Related papers
- Robust Analysis of Multi-Task Learning Efficiency: New Benchmarks on Light-Weighed Backbones and Effective Measurement of Multi-Task Learning Challenges by Feature Disentanglement [69.51496713076253]
In this paper, we focus on the aforementioned efficiency aspects of existing MTL methods.
We first carry out large-scale experiments of the methods with smaller backbones and on the MetaGraspNet dataset as a new test ground.
We also propose the Feature Disentanglement measure as a novel and efficient identifier of the challenges in MTL.
arXiv Detail & Related papers (2024-02-05T22:15:55Z) - Clustering-based Domain-Incremental Learning [4.835091081509403]
A key challenge in continual learning is the so-called "catastrophic forgetting" problem.
We propose an online clustering-based approach on a dynamically updated finite pool of samples or gradients.
We demonstrate the effectiveness of the proposed strategy and its promising performance compared to state-of-the-art methods.
arXiv Detail & Related papers (2023-09-21T13:49:05Z) - Self-paced Weight Consolidation for Continual Learning [39.27729549041708]
Continual learning algorithms are popular for preventing catastrophic forgetting in sequential task learning settings.
We propose a self-paced Weight Consolidation (spWC) framework to attain continual learning.
arXiv Detail & Related papers (2023-07-20T13:07:41Z) - Online Continual Learning via the Knowledge Invariant and Spread-out
Properties [4.109784267309124]
A key challenge in continual learning is catastrophic forgetting.
We propose a new method, named Online Continual Learning via the Knowledge Invariant and Spread-out Properties (OCLKISP).
We empirically evaluate our proposed method on four popular benchmarks for continual learning: Split CIFAR 100, Split SVHN, Split CUB200 and Split Tiny-Image-Net.
arXiv Detail & Related papers (2023-02-02T04:03:38Z) - Exclusive Supermask Subnetwork Training for Continual Learning [95.5186263127864]
Continual Learning (CL) methods focus on accumulating knowledge over time while avoiding forgetting.
We propose ExSSNeT (Exclusive Supermask SubNEtwork Training), which performs exclusive and non-overlapping subnetwork weight training.
We demonstrate that ExSSNeT outperforms strong previous methods on both NLP and Vision domains while preventing forgetting.
arXiv Detail & Related papers (2022-10-18T23:27:07Z) - On the Effectiveness of Fine-tuning Versus Meta-reinforcement Learning [71.55412580325743]
We show that multi-task pretraining with fine-tuning on new tasks performs as well as, or better than, meta-pretraining with meta test-time adaptation.
This is encouraging for future research, as multi-task pretraining tends to be simpler and computationally cheaper than meta-RL.
arXiv Detail & Related papers (2022-06-07T13:24:00Z) - Efficient Few-Shot Object Detection via Knowledge Inheritance [62.36414544915032]
Few-shot object detection (FSOD) aims at learning a generic detector that can adapt to unseen tasks with scarce training samples.
We present an efficient pretrain-transfer framework (PTF) baseline with no computational increment.
We also propose an adaptive length re-scaling (ALR) strategy to alleviate the vector length inconsistency between the predicted novel weights and the pretrained base weights.
arXiv Detail & Related papers (2022-03-23T06:24:31Z) - Learning Bayesian Sparse Networks with Full Experience Replay for
Continual Learning [54.7584721943286]
Continual Learning (CL) methods aim to enable machine learning models to learn new tasks without catastrophic forgetting of those that have been previously mastered.
Existing CL approaches often keep a buffer of previously-seen samples, perform knowledge distillation, or use regularization techniques towards this goal.
We propose to activate and select only sparse neurons for learning current and past tasks at any stage.
arXiv Detail & Related papers (2022-02-21T13:25:03Z) - DIODE: Dilatable Incremental Object Detection [15.59425584971872]
Conventional deep learning models lack the capability of preserving previously learned knowledge.
We propose a dilatable incremental object detector (DIODE) for multi-step incremental detection tasks.
Our method achieves up to 6.4% performance improvement by increasing the number of parameters by just 1.2% for each newly learned task.
arXiv Detail & Related papers (2021-08-12T09:45:57Z) - Rectification-based Knowledge Retention for Continual Learning [49.1447478254131]
Deep learning models suffer from catastrophic forgetting when trained in an incremental learning setting.
We propose a novel approach to address the task incremental learning problem, which involves training a model on new tasks that arrive in an incremental manner.
Our approach can be used in both the zero-shot and non-zero-shot task incremental learning settings.
arXiv Detail & Related papers (2021-03-30T18:11:30Z)