TKIL: Tangent Kernel Approach for Class Balanced Incremental Learning
- URL: http://arxiv.org/abs/2206.08492v1
- Date: Fri, 17 Jun 2022 00:20:54 GMT
- Title: TKIL: Tangent Kernel Approach for Class Balanced Incremental Learning
- Authors: Jinlin Xiang and Eli Shlizerman
- Abstract summary: Class incremental learning methods aim to keep a memory of a few exemplars from previously learned tasks and to distill knowledge from them.
Existing methods struggle to balance the performance across classes since they typically overfit the model to the latest task.
We introduce Tangent Kernel for Incremental Learning (TKIL), a novel methodology that achieves class-balanced performance.
- Score: 4.822598110892847
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: When learning new tasks in a sequential manner, deep neural networks tend to
forget tasks that they previously learned, a phenomenon called catastrophic
forgetting. Class incremental learning methods aim to address this problem by
keeping a memory of a few exemplars from previously learned tasks, and
distilling knowledge from them. However, existing methods struggle to balance
the performance across classes since they typically overfit the model to the
latest task. In our work, we propose to address these challenges with the
introduction of a novel methodology of Tangent Kernel for Incremental Learning
(TKIL) that achieves class-balanced performance. The approach preserves the
representations across classes and balances the accuracy for each class, and as
such achieves better overall accuracy and lower variance. The TKIL approach is based on the
Neural Tangent Kernel (NTK), which describes the convergence behavior of neural
networks as a kernel function in the limit of infinite width. In TKIL, the
gradients between feature layers are treated as the distance between the
representations of these layers; this distance defines a Gradients Tangent Kernel
loss (GTK loss) that is minimized jointly with averaging of weights. This
allows TKIL to automatically identify the task and to quickly adapt to it
during inference. Experiments on CIFAR-100 and ImageNet datasets with various
incremental learning settings show that these strategies allow TKIL to
outperform existing state-of-the-art methods.
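The abstract does not spell out how the GTK loss and the weight averaging are computed, so the sketch below is only a rough illustration of the general idea in PyTorch, under assumptions: per-layer parameter gradients play the role of tangent-kernel representations of the feature layers (the NTK between two inputs is the inner product of such parameter gradients), the squared distance between the gradient representations of a previous-task model and the current model is penalized, and the weights of the two models are averaged. The names FeatureNet, layer_gradients, gtk_loss, and average_weights are hypothetical and are not taken from the paper.
```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class FeatureNet(nn.Module):
    """Toy network with two feature layers whose gradient representations we compare."""
    def __init__(self, in_dim=32, hidden=64, out_dim=10):
        super().__init__()
        self.feat1 = nn.Linear(in_dim, hidden)
        self.feat2 = nn.Linear(hidden, hidden)
        self.head = nn.Linear(hidden, out_dim)

    def forward(self, x):
        h1 = torch.relu(self.feat1(x))
        h2 = torch.relu(self.feat2(h1))
        return self.head(h2)

def layer_gradients(model, x, create_graph=False):
    """Gradients of the summed output w.r.t. the feature-layer weights."""
    out = model(x)
    grads = torch.autograd.grad(out.sum(),
                                [model.feat1.weight, model.feat2.weight],
                                create_graph=create_graph)
    return [g.flatten() for g in grads]

def gtk_loss(model_old, model_new, x):
    """Squared distance between the gradient (tangent-kernel) representations
    of a previous-task model and the current model."""
    g_old = layer_gradients(model_old, x)                      # frozen reference
    g_new = layer_gradients(model_new, x, create_graph=True)   # differentiable
    return sum(F.mse_loss(a, b) for a, b in zip(g_new, g_old))

def average_weights(model_old, model_new, alpha=0.5):
    """Average previous-task and current-task weights into the current model."""
    with torch.no_grad():
        for p_old, p_new in zip(model_old.parameters(), model_new.parameters()):
            p_new.copy_(alpha * p_old + (1.0 - alpha) * p_new)

# Usage sketch: penalize drift of the gradient representation while learning a new task.
old_model, new_model = FeatureNet(), FeatureNet()
old_model.load_state_dict(new_model.state_dict())    # stand-in for the previous-task model
x = torch.randn(8, 32)
loss = gtk_loss(old_model, new_model, x)              # would be added to the new-task loss
loss.backward()
average_weights(old_model, new_model)
```
In an actual training loop this penalty would be added to the usual classification loss on the new task; the exact form of the GTK loss and of the weight averaging used by TKIL may differ from this sketch.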
Related papers
- How Feature Learning Can Improve Neural Scaling Laws [86.9540615081759]
We develop a solvable model of neural scaling laws beyond the kernel limit.
We show how performance scales with model size, training time, and the total amount of available data.
arXiv Detail & Related papers (2024-09-26T14:05:32Z)
- Efficient kernel surrogates for neural network-based regression [0.8030359871216615]
We study the performance of the Conjugate Kernel (CK), an efficient approximation to the Neural Tangent Kernel (NTK).
We show that the CK performance is only marginally worse than that of the NTK and, in certain cases, is shown to be superior.
In addition to providing a theoretical grounding for using CKs instead of NTKs, our framework suggests a recipe for improving DNN accuracy inexpensively.
arXiv Detail & Related papers (2023-10-28T06:41:47Z)
- Clustering-based Domain-Incremental Learning [4.835091081509403]
A key challenge in continual learning is the so-called "catastrophic forgetting" problem.
We propose an online clustering-based approach on a dynamically updated finite pool of samples or gradients.
We demonstrate the effectiveness of the proposed strategy and its promising performance compared to state-of-the-art methods.
arXiv Detail & Related papers (2023-09-21T13:49:05Z)
- Complementary Learning Subnetworks for Parameter-Efficient Class-Incremental Learning [40.13416912075668]
We propose a rehearsal-free CIL approach that learns continually via the synergy between two Complementary Learning Subnetworks.
Our method achieves competitive results against state-of-the-art methods, especially in accuracy gain, memory cost, training efficiency, and robustness to task order.
arXiv Detail & Related papers (2023-06-21T01:43:25Z)
- TCT: Convexifying Federated Learning using Bootstrapped Neural Tangent Kernels [141.29156234353133]
State-of-the-art federated learning methods can perform far worse than their centralized counterparts when clients have dissimilar data distributions.
We show this disparity can largely be attributed to challenges presented by non-convexity.
We propose a Train-Convexify neural network (TCT) procedure to sidestep this issue.
arXiv Detail & Related papers (2022-07-13T16:58:22Z)
- FOSTER: Feature Boosting and Compression for Class-Incremental Learning [52.603520403933985]
Deep neural networks suffer from catastrophic forgetting when learning new categories.
We propose a novel two-stage learning paradigm FOSTER, empowering the model to learn new categories adaptively.
arXiv Detail & Related papers (2022-04-10T11:38:33Z)
- Learning Bayesian Sparse Networks with Full Experience Replay for Continual Learning [54.7584721943286]
Continual Learning (CL) methods aim to enable machine learning models to learn new tasks without catastrophic forgetting of those that have been previously mastered.
Existing CL approaches often keep a buffer of previously-seen samples, perform knowledge distillation, or use regularization techniques towards this goal.
We propose to only activate and select sparse neurons for learning current and past tasks at any stage.
arXiv Detail & Related papers (2022-02-21T13:25:03Z)
- Center Loss Regularization for Continual Learning [0.0]
In general, neural networks lack the ability to learn different tasks sequentially.
Our approach remembers old tasks by projecting the representations of new tasks close to that of old tasks.
We demonstrate that our approach is scalable, effective, and gives competitive performance compared to state-of-the-art continual learning methods.
arXiv Detail & Related papers (2021-10-21T17:46:44Z)
- Scaling Neural Tangent Kernels via Sketching and Random Features [53.57615759435126]
Recent works report that NTK regression can outperform finitely-wide neural networks trained on small-scale datasets.
We design a near input-sparsity time approximation algorithm for NTK, by sketching the expansions of arc-cosine kernels.
We show that a linear regressor trained on our CNTK features matches the accuracy of exact CNTK on CIFAR-10 dataset while achieving 150x speedup.
arXiv Detail & Related papers (2021-06-15T04:44:52Z)
- Incremental Embedding Learning via Zero-Shot Translation [65.94349068508863]
Current state-of-the-art incremental learning methods tackle the catastrophic forgetting problem in traditional classification networks.
We propose a novel class-incremental method for embedding networks, named the zero-shot translation class-incremental method (ZSTCI).
In addition, ZSTCI can easily be combined with existing regularization-based incremental learning methods to further improve performance of embedding networks.
arXiv Detail & Related papers (2020-12-31T08:21:37Z)