Generative Kernel Continual Learning
- URL: http://arxiv.org/abs/2112.13410v1
- Date: Sun, 26 Dec 2021 16:02:10 GMT
- Title: Generative Kernel Continual Learning
- Authors: Mohammad Mahdi Derakhshani and Xiantong Zhen and Ling Shao and Cees G. M. Snoek
- Abstract summary: We introduce generative kernel continual learning, which exploits the synergies between generative models and kernels for continual learning.
The generative model is able to produce representative samples for kernel learning, which removes the dependence on memory in kernel continual learning.
We conduct extensive experiments on three widely-used continual learning benchmarks that demonstrate the abilities and benefits of our contributions.
- Score: 117.79080100313722
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Kernel continual learning by \citet{derakhshani2021kernel} has recently
emerged as a strong continual learner due to its non-parametric ability to
tackle task interference and catastrophic forgetting. Unfortunately, its success
comes at the expense of an explicit memory to store samples from past tasks,
which hampers scalability to continual learning settings with a large number of
tasks. In this paper, we introduce generative kernel continual learning, which
explores and exploits the synergies between generative models and kernels for
continual learning. The generative model is able to produce representative
samples for kernel learning, which removes the dependence on memory in kernel
continual learning. Moreover, as we replay only on the generative model, we
avoid task interference while being computationally more efficient compared to
previous methods that need replay on the entire model. We further introduce a
supervised contrastive regularization, which enables our model to generate even
more discriminative samples for better kernel-based classification performance.
We conduct extensive experiments on three widely-used continual learning
benchmarks that demonstrate the abilities and benefits of our contributions.
Most notably, on the challenging SplitCIFAR100 benchmark, with just a simple
linear kernel we obtain the same accuracy as kernel continual learning with
variational random features for one tenth of the memory, or a 10.1\% accuracy
gain for the same memory budget.
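To make this concrete, below is a minimal sketch of the two ingredients the abstract combines: a linear-kernel ridge classifier fitted on generated samples and a supervised contrastive regularizer. It assumes PyTorch; the function names, shapes, hyperparameters and the generator.sample interface are illustrative assumptions, not the authors' implementation.
```python
import torch
import torch.nn.functional as F

def fit_kernel_ridge(feats, targets_onehot, lam=1e-3):
    """Dual-form kernel ridge regression with a linear kernel.
    feats: (n, d) features of generated samples for one task.
    targets_onehot: (n, C) one-hot labels as floats.
    Returns dual coefficients alpha of shape (n, C)."""
    gram = feats @ feats.t()                                  # linear kernel Gram matrix
    n = gram.shape[0]
    return torch.linalg.solve(gram + lam * torch.eye(n), targets_onehot)

def kernel_predict(query_feats, support_feats, alpha):
    """Classify queries with the task-specific kernel classifier."""
    return (query_feats @ support_feats.t() @ alpha).argmax(dim=1)

def supervised_contrastive_loss(z, labels, temperature=0.1):
    """Supervised contrastive regularizer: pulls same-class features together
    so generated samples stay discriminative for the kernel classifier."""
    z = F.normalize(z, dim=1)
    sim = z @ z.t() / temperature                             # pairwise similarities
    self_mask = torch.eye(len(z), dtype=torch.bool)
    pos_mask = (labels[:, None] == labels[None, :]) & ~self_mask
    sim = sim.masked_fill(self_mask, -1e9)                    # exclude self-pairs
    log_prob = sim - torch.logsumexp(sim, dim=1, keepdim=True)
    pos_counts = pos_mask.sum(dim=1).clamp(min=1)
    return -(log_prob * pos_mask).sum(dim=1).div(pos_counts).mean()

# Illustrative test-time use for task t (generator.sample is hypothetical):
# feats, y = generator.sample(task_id=t, n=256)               # replayed, not stored
# alpha = fit_kernel_ridge(feats, F.one_hot(y, num_classes).float())
# preds = kernel_predict(query_feats, feats, alpha)
```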
Related papers
- Class incremental learning with probability dampening and cascaded gated classifier [4.285597067389559]
We propose a novel incremental regularisation approach called Margin Dampening and Cascaded Scaling.
The first combines a soft constraint with a knowledge distillation approach to preserve past knowledge while still allowing new patterns to be learned.
We empirically show that our approach performs well on multiple benchmarks against well-established baselines.
arXiv Detail & Related papers (2024-02-02T09:33:07Z) - Learning Bayesian Sparse Networks with Full Experience Replay for Continual Learning [54.7584721943286]
Continual Learning (CL) methods aim to enable machine learning models to learn new tasks without catastrophic forgetting of those that have been previously mastered.
Existing CL approaches often keep a buffer of previously-seen samples, perform knowledge distillation, or use regularization techniques towards this goal.
We propose to only activate and select sparse neurons for learning current and past tasks at any stage.
arXiv Detail & Related papers (2022-02-21T13:25:03Z) - Learning Curves for Sequential Training of Neural Networks: Self-Knowledge Transfer and Forgetting [9.734033555407406]
We consider neural networks in the neural tangent kernel regime that continually learn target functions from task to task.
We investigate a variant of continual learning where the model learns the same target function in multiple tasks.
Even for the same target, the trained model shows some transfer and forgetting depending on the sample size of each task.
arXiv Detail & Related papers (2021-12-03T00:25:01Z) - Kernel Continual Learning [117.79080100313722]
Kernel continual learning is a simple but effective variant of continual learning that tackles catastrophic forgetting.
An episodic memory unit stores a subset of samples for each task, from which task-specific classifiers are learned with kernel ridge regression.
Variational random features are used to learn a data-driven kernel for each task (a minimal random Fourier feature map is sketched after this list).
arXiv Detail & Related papers (2021-07-12T22:09:30Z) - MetaKernel: Learning Variational Random Features with Limited Labels [120.90737681252594]
Few-shot learning deals with the fundamental and challenging problem of learning from a few annotated samples, while being able to generalize well on new tasks.
We propose to meta-learn kernels with random Fourier features for few-shot learning, which we call MetaKernel.
arXiv Detail & Related papers (2021-05-08T21:24:09Z) - Contrastive learning of strong-mixing continuous-time stochastic processes [53.82893653745542]
Contrastive learning is a family of self-supervised methods where a model is trained to solve a classification task constructed from unlabeled data.
We show that a properly constructed contrastive learning task can be used to estimate the transition kernel for small-to-mid-range intervals in the diffusion case.
arXiv Detail & Related papers (2021-03-03T23:06:47Z) - Automatic Recall Machines: Internal Replay, Continual Learning and the Brain [104.38824285741248]
Replay in neural networks involves training on sequential data with memorized samples, which counteracts forgetting of previous behavior caused by non-stationarity.
We present a method where these auxiliary samples are generated on the fly, given only the model that is being trained for the assessed objective.
Instead, the implicit memory of learned samples within the assessed model itself is exploited.
arXiv Detail & Related papers (2020-06-22T15:07:06Z)
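Both Kernel Continual Learning and MetaKernel above build their task-specific kernels from random Fourier features. The following is a minimal sketch of that building block, assuming PyTorch, with frequencies drawn from a fixed Gaussian prior where those papers would infer a variational, data-driven posterior; the dimensions and bandwidth are placeholders.
```python
import torch

def random_fourier_features(x, omega, b):
    """Map inputs so that phi(x) @ phi(y).T approximates an RBF kernel k(x, y).
    omega: (d, D) random frequencies, b: (D,) random phases."""
    return torch.cos(x @ omega + b) * (2.0 / omega.shape[1]) ** 0.5

# Frequencies drawn from a fixed Gaussian prior; a variational posterior over
# omega (as in variational random features) would replace this draw per task.
d, D = 64, 256                                   # input dim, number of random features
omega = torch.randn(d, D)                        # bandwidth 1.0, assumed
b = 2 * torch.pi * torch.rand(D)
x = torch.randn(8, d)
phi = random_fourier_features(x, omega, b)       # (8, D); phi @ phi.T approximates the RBF Gram matrix
```
The resulting feature map can be plugged into the same kernel ridge regression classifier sketched after the abstract above.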
This list is automatically generated from the titles and abstracts of the papers in this site.