Dissecting Continual Learning: a Structural and Data Analysis
- URL: http://arxiv.org/abs/2301.01033v1
- Date: Tue, 3 Jan 2023 10:37:11 GMT
- Title: Dissecting Continual Learning: a Structural and Data Analysis
- Authors: Francesco Pelosin
- Abstract summary: Continual Learning is a field dedicated to devising algorithms able to achieve lifelong learning.
Deep learning methods can attain impressive results when the modeled data do not undergo a considerable distributional shift in subsequent learning sessions.
When such systems are exposed to this incremental setting, performance drops very quickly.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Continual Learning (CL) is a field dedicated to devising algorithms able to
achieve lifelong learning. Overcoming the disruption of previously acquired
knowledge, a drawback of deep learning models known as catastrophic
forgetting, is a hard challenge. Currently, deep learning methods can attain
impressive results when the modeled data do not undergo a considerable
distributional shift across subsequent learning sessions, but whenever such
systems are exposed to this incremental setting, performance drops very
quickly. Overcoming this limitation is fundamental, as it would allow us to
build truly intelligent systems that exhibit both stability and plasticity.
It would also spare us the onerous cost of retraining these architectures
from scratch on newly updated data. In this thesis, we tackle the problem
from multiple directions. In a first study, we show that in rehearsal-based
techniques (systems that use a memory buffer), the quantity of data stored in
the rehearsal buffer is a more important factor than the quality of that
data. Secondly, we present one of the early works on incremental learning
with ViT architectures, comparing functional, weight, and attention
regularization approaches, and propose an effective novel asymmetric loss.
Finally, we study pretraining and how it affects performance in Continual
Learning, raising some questions about the effective progression of the
field. We close with future directions and concluding remarks.
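The first study concerns rehearsal-based methods, where a fixed-size memory buffer of past examples is replayed alongside new data. The sketch below is a minimal reservoir-sampling rehearsal buffer for illustration only; it is not the implementation used in the thesis, and the class and method names are assumptions.

```python
import random

class RehearsalBuffer:
    """Minimal reservoir-sampling replay buffer (illustrative sketch)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.samples = []      # stored (x, y) pairs from past tasks
        self.num_seen = 0      # total examples observed so far

    def add(self, x, y):
        """Reservoir sampling: every observed example ends up in the buffer
        with equal probability capacity / num_seen."""
        self.num_seen += 1
        if len(self.samples) < self.capacity:
            self.samples.append((x, y))
        else:
            j = random.randrange(self.num_seen)
            if j < self.capacity:
                self.samples[j] = (x, y)

    def sample(self, batch_size):
        """Draw a replay mini-batch to interleave with current-task data."""
        k = min(batch_size, len(self.samples))
        return random.sample(self.samples, k)
```

In such a setup, each current-task mini-batch is mixed with a batch drawn from the buffer. The thesis finding suggests that increasing the buffer capacity tends to matter more than how carefully the stored examples are selected.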
Related papers
- Temporal-Difference Variational Continual Learning [89.32940051152782]
A crucial capability of Machine Learning models in real-world applications is the ability to continuously learn new tasks.
In Continual Learning settings, models often struggle to balance learning new tasks with retaining previous knowledge.
We propose new learning objectives that integrate the regularization effects of multiple previous posterior estimations.
arXiv Detail & Related papers (2024-10-10T10:58:41Z) - Primal Dual Continual Learning: Balancing Stability and Plasticity through Adaptive Memory Allocation [86.8475564814154]
We show that it is both possible and beneficial to undertake the constrained optimization problem directly.
We focus on memory-based methods, where a small subset of samples from previous tasks can be stored in a replay buffer.
We show that dual variables indicate the sensitivity of the optimal value of the continual learning problem with respect to constraint perturbations.
arXiv Detail & Related papers (2023-09-29T21:23:27Z) - Federated Unlearning via Active Forgetting [24.060724751342047]
We propose a novel federated unlearning framework based on incremental learning.
Our framework differs from existing federated unlearning methods that rely on approximate retraining or data influence estimation.
arXiv Detail & Related papers (2023-07-07T03:07:26Z) - PIVOT: Prompting for Video Continual Learning [50.80141083993668]
We introduce PIVOT, a novel method that leverages extensive knowledge in pre-trained models from the image domain.
Our experiments show that PIVOT improves state-of-the-art methods by a significant 27% on the 20-task ActivityNet setup.
arXiv Detail & Related papers (2022-12-09T13:22:27Z) - Learning Bayesian Sparse Networks with Full Experience Replay for
Continual Learning [54.7584721943286]
Continual Learning (CL) methods aim to enable machine learning models to learn new tasks without catastrophic forgetting of those that have been previously mastered.
Existing CL approaches often keep a buffer of previously-seen samples, perform knowledge distillation, or use regularization techniques towards this goal.
We propose to only activate and select sparse neurons for learning current and past tasks at any stage.
arXiv Detail & Related papers (2022-02-21T13:25:03Z) - An Empirical Investigation of the Role of Pre-training in Lifelong
Learning [21.995593026269578]
We show that generic pre-training implicitly alleviates the effects of catastrophic forgetting when learning multiple tasks sequentially.
We study this phenomenon by analyzing the loss landscape, finding that pre-trained weights appear to ease forgetting by leading to wider minima.
arXiv Detail & Related papers (2021-12-16T19:00:55Z) - Online Continual Learning with Natural Distribution Shifts: An Empirical
Study with Visual Data [101.6195176510611]
"Online" continual learning enables evaluating both information retention and online learning efficacy.
In online continual learning, each incoming small batch of data is first used for testing and then added to the training set, making the problem truly online (a minimal sketch of this test-then-train loop appears after this list).
We introduce a new benchmark for online continual visual learning that exhibits large scale and natural distribution shifts.
arXiv Detail & Related papers (2021-08-20T06:17:20Z) - Continual Learning via Bit-Level Information Preserving [88.32450740325005]
We study the continual learning process through the lens of information theory.
We propose Bit-Level Information Preserving (BLIP) that preserves the information gain on model parameters.
BLIP achieves close to zero forgetting while only requiring constant memory overheads throughout continual learning.
arXiv Detail & Related papers (2021-05-10T15:09:01Z) - A Mathematical Analysis of Learning Loss for Active Learning in
Regression [2.792030485253753]
This paper develops a foundation for Learning Loss which enables us to propose a novel modification we call LearningLoss++.
We show that gradients are crucial in interpreting how Learning Loss works, with rigorous analysis and comparison of the gradients between Learning Loss and LearningLoss++.
We also propose a convolutional architecture that combines features at different scales to predict the loss.
We show that LearningLoss++ is better at identifying scenarios where the model is likely to perform poorly, which, when used for model refinement, translates into reliable performance in the open world.
arXiv Detail & Related papers (2021-04-19T13:54:20Z) - Understanding Catastrophic Forgetting and Remembering in Continual
Learning with Optimal Relevance Mapping [10.970706194360451]
Catastrophic forgetting in neural networks is a significant problem for continual learning.
We introduce Relevance Mapping Networks (RMNs) which are inspired by the Optimal Overlap Hypothesis.
We show that RMNs learn an optimized representational overlap that overcomes the twin problem of catastrophic forgetting and remembering.
arXiv Detail & Related papers (2021-02-22T20:34:00Z) - Generative Feature Replay with Orthogonal Weight Modification for
Continual Learning [20.8966035274874]
Generative replay is a promising strategy that generates and replays pseudo data for previous tasks to alleviate catastrophic forgetting.
We propose to 1) replay penultimate-layer features with a generative model and 2) leverage a self-supervised auxiliary task to further enhance feature stability.
Empirical results on several datasets show that our method consistently achieves substantial improvements over the powerful OWM baseline.
arXiv Detail & Related papers (2020-05-07T13:56:22Z)
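The test-then-train protocol mentioned in the online continual learning entry above can be summarized in a few lines. This is a generic sketch of online evaluation, not that benchmark's actual code; the `evaluate` and `train_step` callables are placeholders.

```python
def online_continual_loop(model, stream, evaluate, train_step):
    """Test-then-train protocol: each incoming batch is evaluated first,
    then used for training, so accuracy is always measured on unseen data."""
    online_accuracy = []
    for batch_x, batch_y in stream:        # non-stationary data stream
        online_accuracy.append(evaluate(model, batch_x, batch_y))  # test first
        train_step(model, batch_x, batch_y)                        # then train
    return sum(online_accuracy) / len(online_accuracy)
```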