Provable Continual Learning via Sketched Jacobian Approximations
- URL: http://arxiv.org/abs/2112.05095v1
- Date: Thu, 9 Dec 2021 18:36:20 GMT
- Title: Provable Continual Learning via Sketched Jacobian Approximations
- Authors: Reinhard Heckel
- Abstract summary: A popular approach to overcome forgetting is to regularize the loss function by penalizing models that perform poorly on previous tasks.
We show that elastic weight consolidation (EWC), even under otherwise ideal conditions, can provably suffer catastrophic forgetting if its diagonal matrix is a poor approximation of the Hessian matrix of previous tasks.
We propose a simple approach to overcome this: Regularizing training of a new task with sketches of the Jacobian matrix of past data.
- Score: 17.381658875470638
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: An important problem in machine learning is the ability to learn tasks in a
sequential manner. If trained with standard first-order methods, most models
forget previously learned tasks when trained on a new task, which is often
referred to as catastrophic forgetting. A popular approach to overcome
forgetting is to regularize the loss function by penalizing models that perform
poorly on previous tasks. For example, elastic weight consolidation (EWC)
regularizes with a quadratic form involving a diagonal matrix built from
past data. While EWC works very well for some setups, we show that, even under
otherwise ideal conditions, it can provably suffer catastrophic forgetting if
the diagonal matrix is a poor approximation of the Hessian matrix of previous
tasks. We propose a simple approach to overcome this: Regularizing training of
a new task with sketches of the Jacobian matrix of past data. This provably
enables overcoming catastrophic forgetting for linear models and for wide
neural networks, at the cost of memory. The overarching goal of this paper is
to provide insights into when regularization-based continual learning algorithms
work, and at what memory cost.
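To make the contrast concrete, below is a minimal NumPy sketch for a linear least-squares model, where the Jacobian of the past task is simply the past data matrix. It compares an EWC-style quadratic penalty built from the diagonal of the past-task Hessian against a penalty built from a Gaussian sketch of the past Jacobian, in the spirit of the abstract. The problem sizes, the Gaussian sketch, and names such as w_old, H_diag, and A_sk are illustrative assumptions, not details from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
n_old, n_new, d, k = 60, 30, 100, 40   # k = sketch size (memory: k*d numbers instead of n_old*d)
lam = 10.0                             # regularization strength

# Past task: overparameterized linear regression.  For linear least squares the
# Jacobian of the model outputs w.r.t. the weights is just the data matrix X_old,
# and the Hessian of the past loss is X_old.T @ X_old.
X_old = rng.normal(size=(n_old, d))
y_old = X_old @ rng.normal(size=d)
w_old = np.linalg.lstsq(X_old, y_old, rcond=None)[0]   # weights after task 1

# New task data.
X_new = rng.normal(size=(n_new, d))
y_new = X_new @ rng.normal(size=d)

def fit(A_reg, b_reg):
    """Min-norm solution of ||X_new w - y_new||^2 + ||A_reg w - b_reg||^2."""
    A = np.vstack([X_new, A_reg])
    b = np.concatenate([y_new, b_reg])
    return np.linalg.lstsq(A, b, rcond=None)[0]

# (a) No regularization: fit the new task only.
w_plain = fit(np.zeros((0, d)), np.zeros(0))

# (b) EWC-style penalty: quadratic form with a *diagonal* approximation of the
#     past-task Hessian, H_diag = diag(X_old.T @ X_old).
H_diag = np.sum(X_old ** 2, axis=0)
A_ewc = np.sqrt(lam) * np.diag(np.sqrt(H_diag))
w_ewc = fit(A_ewc, A_ewc @ w_old)

# (c) Sketched-Jacobian penalty: store only the k x d matrix S @ X_old and
#     penalize ||S X_old (w - w_old)||^2.  A Gaussian sketch is one simple choice.
S = rng.normal(size=(k, n_old)) / np.sqrt(k)
A_sk = np.sqrt(lam) * (S @ X_old)
w_sk = fit(A_sk, A_sk @ w_old)

for name, w in [("plain", w_plain), ("ewc-diag", w_ewc), ("sketched-J", w_sk)]:
    print(f"{name:10s} old-task MSE = {np.mean((X_old @ w - y_old) ** 2):.4f}")
```

In this sketch the memory cost of the sketched penalty is the k x d matrix S @ X_old plus w_old, which is the memory trade-off the abstract alludes to.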
Related papers
- Attribute-to-Delete: Machine Unlearning via Datamodel Matching [65.13151619119782]
Machine unlearning -- efficiently removing the influence of a small "forget set" of training data from a pre-trained machine learning model -- has recently attracted interest.
Recent research shows that machine unlearning techniques do not hold up in such a challenging setting.
arXiv Detail & Related papers (2024-10-30T17:20:10Z) - Efficient and Generalizable Certified Unlearning: A Hessian-free Recollection Approach [8.875278412741695]
Machine unlearning strives to uphold the data owners' right to be forgotten by enabling models to selectively forget specific data.
We develop an algorithm that achieves near-instantaneous unlearning as it only requires a vector addition operation.
arXiv Detail & Related papers (2024-04-02T07:54:18Z) - Learn to Unlearn for Deep Neural Networks: Minimizing Unlearning
Interference with Gradient Projection [56.292071534857946]
Recent data-privacy laws have sparked interest in machine unlearning.
The challenge is to discard information about the "forget" data without altering knowledge about the remaining dataset.
We adopt a projected-gradient-based learning method, named Projected-Gradient Unlearning (PGU); a generic sketch of gradient projection appears after this list.
We provide empirical evidence that our unlearning method produces models that behave similarly to models retrained from scratch across various metrics, even when the training dataset is no longer accessible.
arXiv Detail & Related papers (2023-12-07T07:17:24Z) - Prior-Free Continual Learning with Unlabeled Data in the Wild [24.14279172551939]
We propose a Prior-Free Continual Learning (PFCL) method to incrementally update a trained model on new tasks.
PFCL learns new tasks without knowing the task identity or any previous data.
Experiments show that PFCL significantly mitigates forgetting in all three learning scenarios.
arXiv Detail & Related papers (2023-10-16T13:59:56Z) - Towards Robust Continual Learning with Bayesian Adaptive Moment Regularization [51.34904967046097]
Continual learning seeks to overcome the challenge of catastrophic forgetting, where a model forgets previously learnt information.
We introduce BAdam, a novel prior-based method that better constrains parameter growth, reducing catastrophic forgetting.
Results show that BAdam achieves state-of-the-art performance for prior-based methods on challenging single-headed class-incremental experiments.
arXiv Detail & Related papers (2023-09-15T17:10:51Z) - Complementary Learning Subnetworks for Parameter-Efficient
Class-Incremental Learning [40.13416912075668]
We propose a rehearsal-free CIL approach that learns continually via the synergy between two Complementary Learning Subnetworks.
Our method achieves results competitive with state-of-the-art methods, especially in accuracy gain, memory cost, training efficiency, and task-order robustness.
arXiv Detail & Related papers (2023-06-21T01:43:25Z) - A Memory Transformer Network for Incremental Learning [64.0410375349852]
We study class-incremental learning, a training setup in which new classes of data are observed over time for the model to learn from.
Despite the straightforward problem formulation, the naive application of classification models to class-incremental learning results in the "catastrophic forgetting" of previously seen classes.
One of the most successful existing methods has been the use of a memory of exemplars, which overcomes the issue of catastrophic forgetting by saving a subset of past data into a memory bank and utilizing it to prevent forgetting when training future tasks.
arXiv Detail & Related papers (2022-10-10T08:27:28Z) - Continual Deep Learning by Functional Regularisation of Memorable Past [95.97578574330934]
Continually learning new skills is important for intelligent systems, yet standard deep learning methods suffer from catastrophic forgetting of the past.
We propose a new functional-regularisation approach that utilises a few memorable past examples crucial to avoid forgetting.
Our method achieves state-of-the-art performance on standard benchmarks and opens a new direction for life-long learning where regularisation and memory-based methods are naturally combined.
arXiv Detail & Related papers (2020-04-29T10:47:54Z) - Meta Cyclical Annealing Schedule: A Simple Approach to Avoiding
Meta-Amortization Error [50.83356836818667]
We develop a novel meta-regularization objective using a cyclical annealing schedule and the maximum mean discrepancy (MMD) criterion.
The experimental results show that our approach substantially outperforms standard meta-learning algorithms.
arXiv Detail & Related papers (2020-03-04T04:43:16Z) - Adversarial Incremental Learning [0.0]
Deep learning models can forget previously learned information when learning new tasks for which previous data is not available.
We propose an adversarial discriminator based method that does not make use of old data at all while training on new tasks.
We are able to outperform other state-of-the-art methods on CIFAR-100, SVHN, and MNIST datasets.
arXiv Detail & Related papers (2020-01-30T02:25:35Z)
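The gradient-projection idea behind methods such as PGU can be illustrated with a generic linear-model sketch (an illustration under stated assumptions, not the algorithm from that paper): updates are projected onto the orthogonal complement of the span of the data whose behaviour must be preserved, so those outputs stay fixed while an objective on the "forget" data is optimized.

```python
import numpy as np

rng = np.random.default_rng(1)
n_retain, n_forget, d = 50, 10, 100
X_retain = rng.normal(size=(n_retain, d))   # data whose behaviour must be preserved
X_forget = rng.normal(size=(n_forget, d))   # data to be "unlearned"
w = rng.normal(size=d)                      # current weights of a linear model

# Orthonormal basis of the span of the retained data (the protected subspace).
Q, _ = np.linalg.qr(X_retain.T)             # Q: d x n_retain, orthonormal columns

def project_out(g):
    """Remove the component of an update that would change outputs on X_retain."""
    return g - Q @ (Q.T @ g)

y_before = X_retain @ w
for _ in range(200):
    # Illustrative "forgetting" objective: drive outputs on X_forget toward zero.
    g = X_forget.T @ (X_forget @ w)         # gradient of 0.5 * ||X_forget @ w||^2
    w -= 0.005 * project_out(g)             # step only orthogonally to X_retain

print("max change in outputs on retained data:", np.abs(X_retain @ w - y_before).max())
```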
This list is automatically generated from the titles and abstracts of the papers on this site.