Kernel Continual Learning
- URL: http://arxiv.org/abs/2107.05757v2
- Date: Wed, 14 Jul 2021 23:49:54 GMT
- Title: Kernel Continual Learning
- Authors: Mohammad Mahdi Derakhshani, Xiantong Zhen, Ling Shao, Cees G. M. Snoek
- Abstract summary: Kernel continual learning is a simple but effective variant of continual learning that tackles catastrophic forgetting.
An episodic memory unit stores a subset of samples for each task to learn task-specific classifiers based on kernel ridge regression.
Variational random features learn a data-driven kernel for each task.
- Score: 117.79080100313722
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: This paper introduces kernel continual learning, a simple but effective
variant of continual learning that leverages the non-parametric nature of
kernel methods to tackle catastrophic forgetting. We deploy an episodic memory
unit that stores a subset of samples for each task to learn task-specific
classifiers based on kernel ridge regression. This does not require memory
replay and systematically avoids task interference in the classifiers. We
further introduce variational random features to learn a data-driven kernel for
each task. To do so, we formulate kernel continual learning as a variational
inference problem, where a random Fourier basis is incorporated as the latent
variable. The variational posterior distribution over the random Fourier basis
is inferred from the coreset of each task. In this way, we are able to generate
more informative kernels specific to each task, and, more importantly, the
coreset size can be reduced to achieve more compact memory, resulting in more
efficient continual learning based on episodic memory. Extensive evaluation on
four benchmarks demonstrates the effectiveness and promise of kernels for
continual learning.
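The abstract describes a concrete recipe: store a small coreset per task and fit a task-specific kernel ridge regression classifier on it. Below is a minimal NumPy sketch of that recipe, not the authors' implementation; it uses fixed random Fourier features in place of the learned variational ones, and every function name and hyperparameter (random_fourier_features, fit_task_classifier, lam, the coreset size) is illustrative.
```python
import numpy as np

def random_fourier_features(X, W, b):
    """Approximate an RBF-style kernel: phi(x) = sqrt(2/D) * cos(W x + b)."""
    D = W.shape[0]
    return np.sqrt(2.0 / D) * np.cos(X @ W.T + b)

def fit_task_classifier(coreset_X, coreset_y, W, b, lam=0.1, n_classes=10):
    """Kernel ridge regression (dual form) on the episodic-memory coreset of one task."""
    Phi = random_fourier_features(coreset_X, W, b)        # (m, D)
    K = Phi @ Phi.T                                        # (m, m) approximate kernel matrix
    Y = np.eye(n_classes)[coreset_y]                       # one-hot targets, (m, C)
    alpha = np.linalg.solve(K + lam * np.eye(len(K)), Y)   # dual coefficients
    return Phi, alpha

def predict(X, coreset_Phi, alpha, W, b):
    """Classify queries through their kernel similarity to the stored coreset."""
    Phi_q = random_fourier_features(X, W, b)
    scores = Phi_q @ coreset_Phi.T @ alpha                 # (n, C)
    return scores.argmax(axis=1)

# Toy usage: one classifier per task, so earlier tasks are never overwritten.
rng = np.random.default_rng(0)
d, D, n_classes = 16, 256, 4
memory, classifiers = {}, {}
for task in range(3):
    X = rng.normal(size=(200, d))
    y = rng.integers(0, n_classes, size=200)
    idx = rng.choice(len(X), size=40, replace=False)       # coreset kept in episodic memory
    W = rng.normal(size=(D, d))                            # fixed random bases here; the paper
    b = rng.uniform(0, 2 * np.pi, size=D)                  # instead infers them per task (VRF)
    memory[task] = (X[idx], W, b)
    classifiers[task] = fit_task_classifier(X[idx], y[idx], W, b, n_classes=n_classes)

X_test = rng.normal(size=(5, d))
_, W, b = memory[0]
coreset_Phi, alpha = classifiers[0]
print(predict(X_test, coreset_Phi, alpha, W, b))
```
Because each task owns its coreset and dual coefficients, training on a new task never touches the classifiers of earlier tasks, which is how interference is avoided without memory replay. A second sketch of the variational random features step, where the Fourier bases are inferred from the coreset rather than fixed, follows the related-papers list below.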
Related papers
- Amortized Inference for Gaussian Process Hyperparameters of Structured Kernels [5.1672267755831705]
Amortizing parameter inference over different datasets is a promising approach to dramatically reduce training time.
We propose amortizing kernel parameter inference over a complete kernel-structure-family rather than a fixed kernel structure.
We show drastically reduced inference time combined with competitive test performance for a large set of kernels and datasets.
arXiv Detail & Related papers (2023-06-16T13:02:57Z) - RFFNet: Large-Scale Interpretable Kernel Methods via Random Fourier Features [3.0079490585515347]
We introduce RFFNet, a scalable method that learns the kernel relevances on the fly via first-order optimization.
We show that our approach has a small memory footprint and run-time, low prediction error, and effectively identifies relevant features.
We supply users with an efficient, PyTorch-based library, that adheres to the scikit-learn standard API and code for fully reproducing our results.
arXiv Detail & Related papers (2022-11-11T18:50:34Z) - Generative Kernel Continual Learning [117.79080100313722]
We introduce generative kernel continual learning, which exploits the synergies between generative models and kernels for continual learning.
The generative model is able to produce representative samples for kernel learning, which removes the dependence on memory in kernel continual learning.
We conduct extensive experiments on three widely-used continual learning benchmarks that demonstrate the abilities and benefits of our contributions.
arXiv Detail & Related papers (2021-12-26T16:02:10Z) - MetaKernel: Learning Variational Random Features with Limited Labels [120.90737681252594]
Few-shot learning deals with the fundamental and challenging problem of learning from a few annotated samples, while being able to generalize well on new tasks.
We propose meta-learning kernels with random Fourier features for few-shot learning, which we call MetaKernel.
arXiv Detail & Related papers (2021-05-08T21:24:09Z) - Contrastive learning of strong-mixing continuous-time stochastic processes [53.82893653745542]
Contrastive learning is a family of self-supervised methods where a model is trained to solve a classification task constructed from unlabeled data.
We show that a properly constructed contrastive learning task can be used to estimate the transition kernel for small-to-mid-range intervals in the diffusion case.
arXiv Detail & Related papers (2021-03-03T23:06:47Z) - Federated Doubly Stochastic Kernel Learning for Vertically Partitioned Data [93.76907759950608]
We propose a doubly stochastic kernel learning algorithm (FDSKL) for vertically partitioned data.
We show that FDSKL is significantly faster than state-of-the-art federated learning methods when dealing with kernels.
arXiv Detail & Related papers (2020-08-14T05:46:56Z) - Optimal Rates of Distributed Regression with Imperfect Kernels [0.0]
We study distributed kernel regression via the divide-and-conquer approach.
We show that kernel ridge regression can achieve rates faster than $N^{-1}$ in the noise-free setting.
arXiv Detail & Related papers (2020-06-30T13:00:16Z) - Learning to Learn Kernels with Variational Random Features [118.09565227041844]
We introduce kernels with random Fourier features in the meta-learning framework to leverage their strong few-shot learning ability.
We formulate the optimization of MetaVRF as a variational inference problem.
We show that MetaVRF delivers much better, or at least competitive, performance compared to existing meta-learning alternatives.
arXiv Detail & Related papers (2020-06-11T18:05:29Z)
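Several entries above (MetaKernel, Learning to Learn Kernels with Variational Random Features) and the main paper share the variational random features idea: treat the random Fourier basis as a latent variable and infer its posterior from the coreset. The sketch below is an illustrative PyTorch rendering under simplifying assumptions, not any of the papers' implementations: a mean-pooling set encoder stands in for the inference network, the prior over frequencies is a standard Gaussian, and names such as FrequencyPosterior, elbo_step, and beta are hypothetical.
```python
import torch
import torch.nn as nn

class FrequencyPosterior(nn.Module):
    """Amortized q(W | coreset): mean-pool the coreset, then predict a Gaussian
    over D random Fourier frequencies of dimension d (a simplified stand-in for
    the inference network used by variational random features)."""
    def __init__(self, d, D, hidden=64):
        super().__init__()
        self.D, self.d = D, d
        self.net = nn.Sequential(nn.Linear(d, hidden), nn.ReLU(),
                                 nn.Linear(hidden, 2 * D * d))

    def forward(self, coreset_X):
        h = self.net(coreset_X.mean(dim=0))              # permutation-invariant summary
        mu, log_var = h.chunk(2)
        return mu.view(self.D, self.d), log_var.view(self.D, self.d)

def rff(X, W, b):
    D = W.shape[0]
    return (2.0 / D) ** 0.5 * torch.cos(X @ W.T + b)

def elbo_step(encoder, coreset_X, coreset_y, query_X, query_y,
              n_classes, lam=0.1, beta=1e-3):
    """ELBO-style objective: kernel ridge regression built from sampled frequencies
    should predict the query batch well, while q(W | coreset) stays near N(0, I)."""
    mu, log_var = encoder(coreset_X)
    W = mu + torch.randn_like(mu) * (0.5 * log_var).exp()   # reparameterized sample
    b = torch.rand(W.shape[0]) * 2 * torch.pi
    Phi_c, Phi_q = rff(coreset_X, W, b), rff(query_X, W, b)
    Y = torch.eye(n_classes)[coreset_y]
    K = Phi_c @ Phi_c.T
    alpha = torch.linalg.solve(K + lam * torch.eye(len(K)), Y)
    logits = Phi_q @ Phi_c.T @ alpha
    nll = nn.functional.cross_entropy(logits, query_y)
    kl = 0.5 * (mu.pow(2) + log_var.exp() - log_var - 1).sum()
    return nll + beta * kl

# Toy usage for a single task.
torch.manual_seed(0)
d, D, n_classes = 16, 64, 4
encoder = FrequencyPosterior(d, D)
opt = torch.optim.Adam(encoder.parameters(), lr=1e-3)
coreset_X, coreset_y = torch.randn(40, d), torch.randint(0, n_classes, (40,))
query_X, query_y = torch.randn(64, d), torch.randint(0, n_classes, (64,))
for _ in range(100):
    opt.zero_grad()
    loss = elbo_step(encoder, coreset_X, coreset_y, query_X, query_y, n_classes)
    loss.backward()
    opt.step()
print(float(loss))
```
The point of the construction is that q(W | coreset) conditions the kernel on each task's own data, so the kernel becomes task-specific while the stored coreset can stay small.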
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the accuracy of the information and is not responsible for any consequences arising from its use.