Subspace Distillation for Continual Learning
- URL: http://arxiv.org/abs/2307.16419v2
- Date: Tue, 1 Aug 2023 06:45:22 GMT
- Title: Subspace Distillation for Continual Learning
- Authors: Kaushik Roy, Christian Simon, Peyman Moghadam, Mehrtash Harandi
- Abstract summary: We propose a knowledge distillation technique that takes into account the manifold structure of a neural network's latent space when learning novel tasks.
We demonstrate that the modeling with subspaces provides several intriguing properties, including robustness to noise.
Empirically, we observe that our proposed method outperforms various continual learning methods on several challenging datasets.
- Score: 27.22147868163214
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: An ultimate objective in continual learning is to preserve knowledge learned
in preceding tasks while learning new tasks. To mitigate forgetting prior
knowledge, we propose a novel knowledge distillation technique that takes into
account the manifold structure of the latent/output space of a neural network
in learning novel tasks. To achieve this, we propose to approximate the data
manifold up to its first order, hence benefiting from linear subspaces to
model the structure and maintain the knowledge of a neural network while
learning novel concepts. We demonstrate that modeling with subspaces provides
several intriguing properties, including robustness to noise, and is therefore
effective for mitigating catastrophic forgetting in continual learning. We also
discuss and show how our proposed method can be adopted to address both
classification and segmentation problems. Empirically, we observe that our
proposed method outperforms various continual learning methods on several
challenging datasets, including Pascal VOC and Tiny-ImageNet.
Furthermore, we show how the proposed method can be seamlessly combined with
existing learning approaches to improve their performance. The code for this
article will be available at https://github.com/csiro-robotics/SDCL.
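To make the idea concrete, below is a minimal sketch of a subspace-based distillation term, assuming the subspace of a feature batch is obtained by truncated SVD and compared with the frozen previous model's subspace via the projection (chordal) metric. Function and variable names (e.g., subspace_distillation_loss, k, lambda_sd) are illustrative and are not taken from the authors' SDCL repository.

```python
import torch


def batch_subspace(features: torch.Tensor, k: int) -> torch.Tensor:
    """Orthonormal basis (d x k) of the top-k subspace spanned by a
    feature batch (n x d), obtained via truncated SVD.
    Assumes k <= min(n, d)."""
    # Right singular vectors give the dominant directions in feature space.
    _, _, vh = torch.linalg.svd(features, full_matrices=False)
    return vh[:k].T  # columns are orthonormal


def subspace_distillation_loss(feat_new: torch.Tensor,
                               feat_old: torch.Tensor,
                               k: int = 8) -> torch.Tensor:
    """Projection-metric distance between the subspace of the current
    model's features and that of the frozen previous model's features.
    Equals the sum of squared sines of the principal angles; it is zero
    iff the two subspaces coincide."""
    u_new = batch_subspace(feat_new, k)
    u_old = batch_subspace(feat_old.detach(), k)  # previous model is frozen
    # (k, k) matrix; its singular values are the cosines of the principal angles.
    overlap = u_old.T @ u_new
    return k - (overlap ** 2).sum()


# Illustrative use inside a continual-learning training step:
#   feats_new = model.backbone(x)                 # (batch, d)
#   with torch.no_grad():
#       feats_old = previous_model.backbone(x)    # (batch, d)
#   loss = ce_loss + lambda_sd * subspace_distillation_loss(feats_new, feats_old)
```

Because the top-k singular subspace of a feature batch is stable under small perturbations of the features, matching subspaces rather than raw activations is one way to obtain the robustness to noise highlighted in the abstract.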
Related papers
- Deep Learning Through A Telescoping Lens: A Simple Model Provides Empirical Insights On Grokking, Gradient Boosting & Beyond [61.18736646013446]
In pursuit of a deeper understanding of deep learning's surprising behaviors, we investigate the utility of a simple yet accurate model of a trained neural network.
Across three case studies, we illustrate how it can be applied to derive new empirical insights on a diverse range of prominent phenomena.
arXiv Detail & Related papers (2024-10-31T22:54:34Z) - Exploiting the Semantic Knowledge of Pre-trained Text-Encoders for Continual Learning [70.64617500380287]
Continual learning allows models to learn from new data while retaining previously learned knowledge.
The label information of the images offers semantic knowledge that can be related to previously acquired knowledge of semantic classes.
We propose integrating semantic guidance within and across tasks by capturing semantic similarity using text embeddings.
arXiv Detail & Related papers (2024-08-02T07:51:44Z) - A Unified and General Framework for Continual Learning [58.72671755989431]
Continual Learning (CL) focuses on learning from dynamic and changing data distributions while retaining previously acquired knowledge.
Various methods have been developed to address the challenge of catastrophic forgetting, including regularization-based, Bayesian-based, and memory-replay-based techniques.
This research aims to bridge the gap between these methods by introducing a comprehensive and overarching framework that encompasses and reconciles them.
arXiv Detail & Related papers (2024-03-20T02:21:44Z) - Function-space Parameterization of Neural Networks for Sequential Learning [22.095632118886225]
Sequential learning paradigms pose challenges for gradient-based deep learning due to difficulties incorporating new data and retaining prior knowledge.
We introduce a technique that converts neural networks from weight space to function space, through a dual parameterization.
Our experiments demonstrate that we can retain knowledge in continual learning and incorporate new data efficiently.
arXiv Detail & Related papers (2024-03-16T14:00:04Z) - Negotiated Representations to Prevent Forgetting in Machine Learning
Applications [0.0]
Catastrophic forgetting is a significant challenge in the field of machine learning.
We propose a novel method for preventing catastrophic forgetting in machine learning applications.
arXiv Detail & Related papers (2023-11-30T22:43:50Z) - Complementary Learning Subnetworks for Parameter-Efficient
Class-Incremental Learning [40.13416912075668]
We propose a rehearsal-free CIL approach that learns continually via the synergy between two Complementary Learning Subnetworks.
Our method achieves competitive results against state-of-the-art methods, especially in accuracy gain, memory cost, training efficiency, and task-order robustness.
arXiv Detail & Related papers (2023-06-21T01:43:25Z) - Hierarchically Structured Task-Agnostic Continual Learning [0.0]
We take a task-agnostic view of continual learning and develop a hierarchical information-theoretic optimality principle.
We propose a neural network layer, called the Mixture-of-Variational-Experts layer, that alleviates forgetting by creating a set of information processing paths.
Our approach can operate in a task-agnostic way, i.e., it does not require task-specific knowledge, unlike many existing continual learning algorithms.
arXiv Detail & Related papers (2022-11-14T19:53:15Z) - On Generalizing Beyond Domains in Cross-Domain Continual Learning [91.56748415975683]
Deep neural networks often suffer from catastrophic forgetting of previously learned knowledge after learning a new task.
Our proposed approach learns new tasks under domain shift with accuracy boosts up to 10% on challenging datasets such as DomainNet and OfficeHome.
arXiv Detail & Related papers (2022-03-08T09:57:48Z) - Learning Bayesian Sparse Networks with Full Experience Replay for
Continual Learning [54.7584721943286]
Continual Learning (CL) methods aim to enable machine learning models to learn new tasks without catastrophic forgetting of those that have been previously mastered.
Existing CL approaches often keep a buffer of previously-seen samples, perform knowledge distillation, or use regularization techniques towards this goal.
We propose to only activate and select sparse neurons for learning current and past tasks at any stage.
arXiv Detail & Related papers (2022-02-21T13:25:03Z) - Semantic-aware Knowledge Distillation for Few-Shot Class-Incremental
Learning [32.52270964066876]
Few-shot class incremental learning (FSCIL) describes the problem of learning new concepts gradually.
We introduce a distillation algorithm to address the problem of FSCIL and propose to make use of semantic information during training.
arXiv Detail & Related papers (2021-03-06T08:07:26Z) - Incremental Embedding Learning via Zero-Shot Translation [65.94349068508863]
Current state-of-the-art incremental learning methods tackle the catastrophic forgetting problem in traditional classification networks.
We propose a novel class-incremental method for embedding networks, named the zero-shot translation class-incremental method (ZSTCI).
In addition, ZSTCI can easily be combined with existing regularization-based incremental learning methods to further improve performance of embedding networks.
arXiv Detail & Related papers (2020-12-31T08:21:37Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.