Multiband VAE: Latent Space Partitioning for Knowledge Consolidation in
Continual Learning
- URL: http://arxiv.org/abs/2106.12196v1
- Date: Wed, 23 Jun 2021 06:58:40 GMT
- Title: Multiband VAE: Latent Space Partitioning for Knowledge Consolidation in
Continual Learning
- Authors: Kamil Deja, Paweł Wawrzyński, Daniel Marczak, Wojciech Masarczyk,
Tomasz Trzciński
- Abstract summary: Acquiring knowledge about new data samples without forgetting previous ones is a critical problem of continual learning.
We propose a new method for unsupervised continual knowledge consolidation in generative models that relies on the partitioning of Variational Autoencoder's latent space.
On top of the standard continual learning evaluation benchmarks, we evaluate our method on a new knowledge consolidation scenario and show that the proposed approach outperforms state-of-the-art by up to twofold.
- Score: 14.226973149346883
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We propose a new method for unsupervised continual knowledge consolidation in
generative models that relies on the partitioning of Variational Autoencoder's
latent space. Acquiring knowledge about new data samples without forgetting
previous ones is a critical problem of continual learning. Currently proposed
methods achieve this goal by extending the existing model while constraining
its behavior not to degrade on the past data, which does not exploit the full
potential of relations within the entire training dataset. In this work, we
identify this limitation and posit the goal of continual learning as a
knowledge accumulation task. We solve it by continuously re-aligning latent
space partitions that we call bands which are representations of samples seen
in different tasks, driven by the similarity of the information they contain.
In addition, we introduce a simple yet effective method for controlled
forgetting of past data that improves the quality of reconstructions encoded in
latent bands and a latent space disentanglement technique that improves
knowledge consolidation. On top of the standard continual learning evaluation
benchmarks, we evaluate our method on a new knowledge consolidation scenario
and show that the proposed approach outperforms state-of-the-art by up to
twofold across all testing scenarios.
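A rough way to picture the "bands" mentioned in the abstract is sketched below. It is an illustrative toy under loud assumptions, not the authors' algorithm: the `LatentBands` class, its methods, and the choice of a per-task diagonal Gaussian are all hypothetical. Only the KL term is the standard closed form used in VAE objectives. The sketch stores one region of a shared latent space per task, samples from a region for replay, and compares two regions by the distance between their means as a crude stand-in for "similarity of the information they contain".

```python
import math
import random
import statistics


def kl_to_standard_normal(mu, log_var):
    """Closed-form KL(N(mu, diag(exp(log_var))) || N(0, I)) for one latent vector."""
    return 0.5 * sum(math.exp(lv) + m * m - 1.0 - lv for m, lv in zip(mu, log_var))


class LatentBands:
    """Hypothetical bookkeeping: one diagonal-Gaussian 'band' per task."""

    def __init__(self):
        # task_id -> (per-dimension means, per-dimension standard deviations)
        self.bands = {}

    def register(self, task_id, latents):
        """Summarise the latent codes of one task as a band."""
        columns = list(zip(*latents))
        means = [statistics.fmean(c) for c in columns]
        stds = [statistics.pstdev(c) for c in columns]
        self.bands[task_id] = (means, stds)

    def sample(self, task_id, n, rng):
        """Draw n latent codes from a band, e.g. to feed a decoder for replay."""
        means, stds = self.bands[task_id]
        return [[rng.gauss(m, s) for m, s in zip(means, stds)] for _ in range(n)]

    def distance(self, task_a, task_b):
        """Euclidean distance between band means (assumed similarity proxy)."""
        return math.dist(self.bands[task_a][0], self.bands[task_b][0])


# Toy usage with synthetic latent codes standing in for encoder outputs.
rng = random.Random(0)
bands = LatentBands()
bands.register("task_0", [[rng.gauss(0.0, 1.0) for _ in range(4)] for _ in range(100)])
bands.register("task_1", [[rng.gauss(3.0, 1.0) for _ in range(4)] for _ in range(100)])
replayed = bands.sample("task_0", 5, rng)
```

In this toy, bands with a small mean distance would be candidates for sharing latent regions, while distant bands would stay separate. How the actual method aligns bands is described in the paper itself, not here.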
Related papers
- Multi-Stage Knowledge Integration of Vision-Language Models for Continual Learning [79.46570165281084]
We propose a Multi-Stage Knowledge Integration network (MulKI) to emulate the human learning process in distillation methods.
MulKI achieves this through four stages, including Eliciting Ideas, Adding New Ideas, Distinguishing Ideas, and Making Connections.
Our method demonstrates significant improvements in maintaining zero-shot capabilities while supporting continual learning across diverse downstream tasks.
arXiv Detail & Related papers (2024-11-11T07:36:19Z)
- Temporal-Difference Variational Continual Learning [89.32940051152782]
A crucial capability of Machine Learning models in real-world applications is the ability to continuously learn new tasks.
In Continual Learning settings, models often struggle to balance learning new tasks with retaining previous knowledge.
We propose new learning objectives that integrate the regularization effects of multiple previous posterior estimations.
arXiv Detail & Related papers (2024-10-10T10:58:41Z)
- Mind the Interference: Retaining Pre-trained Knowledge in Parameter Efficient Continual Learning of Vision-Language Models [79.28821338925947]
Domain-Class Incremental Learning is a realistic but challenging continual learning scenario.
To handle these diverse tasks, pre-trained Vision-Language Models (VLMs) are introduced for their strong generalizability.
This incurs a new problem: the knowledge encoded in the pre-trained VLMs may be disturbed when adapting to new tasks, compromising their inherent zero-shot ability.
Existing methods tackle it by tuning VLMs with knowledge distillation on extra datasets, which demands heavy overhead.
We propose the Distribution-aware Interference-free Knowledge Integration (DIKI) framework, retaining pre-trained knowledge of
arXiv Detail & Related papers (2024-07-07T12:19:37Z)
- A Unified and General Framework for Continual Learning [58.72671755989431]
Continual Learning (CL) focuses on learning from dynamic and changing data distributions while retaining previously acquired knowledge.
Various methods have been developed to address the challenge of catastrophic forgetting, including regularization-based, Bayesian-based, and memory-replay-based techniques.
This research aims to bridge this gap by introducing a comprehensive and overarching framework that encompasses and reconciles these existing methodologies.
arXiv Detail & Related papers (2024-03-20T02:21:44Z)
- Learning to Retain while Acquiring: Combating Distribution-Shift in Adversarial Data-Free Knowledge Distillation [31.294947552032088]
Data-free Knowledge Distillation (DFKD) has gained popularity recently, with the fundamental idea of carrying out knowledge transfer from a Teacher to a Student neural network in the absence of training data.
We propose a meta-learning inspired framework by treating the task of Knowledge-Acquisition (learning from newly generated samples) and Knowledge-Retention (retaining knowledge on previously met samples) as meta-train and meta-test.
arXiv Detail & Related papers (2023-02-28T03:50:56Z)
- Online Continual Learning via the Meta-learning Update with Multi-scale Knowledge Distillation and Data Augmentation [4.109784267309124]
Continual learning aims to rapidly and continually learn the current task from a sequence of tasks.
One common limitation of this method is the data imbalance between the previous and current tasks.
We propose a novel framework called Meta-learning update via Multi-scale Knowledge Distillation and Data Augmentation.
arXiv Detail & Related papers (2022-09-12T10:03:53Z)
- On Generalizing Beyond Domains in Cross-Domain Continual Learning [91.56748415975683]
Deep neural networks often suffer from catastrophic forgetting of previously learned knowledge after learning a new task.
Our proposed approach learns new tasks under domain shift with accuracy boosts up to 10% on challenging datasets such as DomainNet and OfficeHome.
arXiv Detail & Related papers (2022-03-08T09:57:48Z)
- Continual Few-shot Relation Learning via Embedding Space Regularization and Data Augmentation [4.111899441919165]
It is necessary for the model to learn novel relational patterns with very few labeled data while avoiding catastrophic forgetting of previous task knowledge.
We propose a novel method based on embedding space regularization and data augmentation.
Our method generalizes to new few-shot tasks and avoids catastrophic forgetting of previous tasks by enforcing extra constraints on the relational embeddings and by adding extra relevant data in a self-supervised manner.
arXiv Detail & Related papers (2022-03-04T05:19:09Z)
- Relational Experience Replay: Continual Learning by Adaptively Tuning Task-wise Relationship [54.73817402934303]
We propose Relational Experience Replay (RER), a bi-level learning framework that adaptively tunes task-wise relationships to achieve a better stability-plasticity tradeoff.
RER can consistently improve the performance of all baselines and surpass current state-of-the-art methods.
arXiv Detail & Related papers (2021-12-31T12:05:22Z)
- Continually Learning Self-Supervised Representations with Projected Functional Regularization [39.92600544186844]
Recent self-supervised learning methods are able to learn high-quality image representations and are closing the gap with supervised methods.
These methods are unable to acquire new knowledge incrementally -- they are, in fact, mostly used only as a pre-training phase with IID data.
To prevent forgetting of previous knowledge, we propose the usage of functional regularization.
arXiv Detail & Related papers (2021-12-30T11:59:23Z)
- Continual Learning From Unlabeled Data Via Deep Clustering [7.704949298975352]
Continual learning aims to learn new tasks incrementally, using less computation and memory than retraining the model from scratch whenever a new task arrives.
We introduce a new framework that makes continual learning feasible in unsupervised mode by using pseudo-labels obtained from cluster assignments to update the model.
arXiv Detail & Related papers (2021-04-14T23:46:17Z)
This list is automatically generated from the titles and abstracts of the papers in this site.