Shared and Private VAEs with Generative Replay for Continual Learning
- URL: http://arxiv.org/abs/2105.07627v1
- Date: Mon, 17 May 2021 06:18:36 GMT
- Title: Shared and Private VAEs with Generative Replay for Continual Learning
- Authors: Subhankar Ghosh
- Abstract summary: Continual learning tries to learn new tasks without forgetting previously learned ones.
Most existing artificial neural network (ANN) models fail at this, while humans manage it by remembering previous work throughout their lives.
We show that our hybrid model effectively avoids forgetting and achieves state-of-the-art results on visual continual learning benchmarks such as MNIST, Permuted MNIST (QMNIST), CIFAR100, and miniImageNet.
- Score: 1.90365714903665
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Continual learning tries to learn new tasks without forgetting previously
learned ones. In reality, most existing artificial neural network (ANN)
models fail at this, while humans manage it by remembering previous work throughout
their lives. Although simply storing all past data can alleviate the problem, it
requires large memory and is often infeasible in real-world applications where
access to past data is limited. We hypothesize that a model that learns to solve each
task continually has some task-specific properties and some task-invariant
characteristics. To address these issues, we propose a hybrid continual learning
model, more suitable for real-world scenarios, that has a task-invariant
shared variational autoencoder and T task-specific variational autoencoders.
Our model combines generative replay and architectural growth to prevent
catastrophic forgetting. We show that our hybrid model effectively avoids forgetting
and achieves state-of-the-art results on visual continual learning benchmarks
such as MNIST, Permuted MNIST (QMNIST), CIFAR100, and miniImageNet. We
also discuss results on a few more datasets, such as SVHN, Fashion-MNIST, EMNIST,
and CIFAR10.
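The combination of architectural growth and generative replay described in the abstract can be illustrated with a toy sketch. This is not the paper's implementation: `ToyGenerator` is a hypothetical stand-in that fits a per-feature Gaussian instead of training a VAE, and the names `HybridContinualLearner`, `learn_task`, and `n_replay`, as well as the choice to draw replay samples from the private modules, are assumptions made for illustration only.

```python
import random
import statistics


class ToyGenerator:
    """Hypothetical stand-in for a VAE: fits one Gaussian per feature."""

    def __init__(self):
        self.means = []
        self.stdevs = []

    def fit(self, rows):
        # Treat each feature column independently.
        columns = list(zip(*rows))
        self.means = [statistics.mean(c) for c in columns]
        self.stdevs = [statistics.pstdev(c) for c in columns]

    def sample(self, n, rng):
        # Draw n pseudo-samples from the fitted Gaussians.
        return [[rng.gauss(m, s) for m, s in zip(self.means, self.stdevs)]
                for _ in range(n)]


class HybridContinualLearner:
    """One shared module plus one private module per task (toy version)."""

    def __init__(self, seed=0):
        self.shared = ToyGenerator()   # task-invariant, refit on every task
        self.private = {}              # task_id -> task-specific module
        self.rng = random.Random(seed)

    def replay(self, n_per_task):
        # Generative replay: synthesize data for old tasks, store none of it.
        replayed = []
        for generator in self.private.values():
            replayed.extend(generator.sample(n_per_task, self.rng))
        return replayed

    def learn_task(self, task_id, rows, n_replay=50):
        replayed = self.replay(n_replay)
        # Architectural growth: a new private module for the new task.
        private = ToyGenerator()
        private.fit(rows)
        self.private[task_id] = private
        # The shared module sees the new data mixed with replayed samples.
        self.shared.fit(rows + replayed)
        return len(replayed)


learner = HybridContinualLearner(seed=0)
task_a = [[0.0, 0.1], [0.2, -0.1], [-0.2, 0.0]]
task_b = [[5.0, 5.1], [5.2, 4.9], [4.8, 5.0]]
replayed_for_a = learner.learn_task("A", task_a)
replayed_for_b = learner.learn_task("B", task_b)
```

After the second task, the shared module has been fit on task B data mixed with synthetic task A samples, so its statistics reflect both tasks even though no task A data was stored. The number of private modules grows by one per task, which is the cost of the growth strategy.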
Related papers
- Data-efficient Large Vision Models through Sequential Autoregression [58.26179273091461]
We develop an efficient, autoregression-based vision model on a limited dataset.
We demonstrate how this model achieves proficiency in a spectrum of visual tasks spanning both high-level and low-level semantic understanding.
Our empirical evaluations underscore the model's agility in adapting to various tasks, heralding a significant reduction in the parameter footprint.
arXiv Detail & Related papers (2024-02-07T13:41:53Z)
- Exploring intra-task relations to improve meta-learning algorithms [1.223779595809275]
We aim to exploit external knowledge of task relations to improve training stability via effective mini-batching of tasks.
We hypothesize that selecting a diverse set of tasks in a mini-batch will lead to a better estimate of the full gradient and hence will lead to a reduction of noise in training.
arXiv Detail & Related papers (2023-12-27T15:33:52Z)
- Task-Distributionally Robust Data-Free Meta-Learning [99.56612787882334]
Data-Free Meta-Learning (DFML) aims to efficiently learn new tasks by leveraging multiple pre-trained models without requiring their original training data.
For the first time, we reveal two major challenges hindering their practical deployment: Task-Distribution Shift (TDS) and Task-Distribution Corruption (TDC).
arXiv Detail & Related papers (2023-11-23T15:46:54Z)
- An Efficient General-Purpose Modular Vision Model via Multi-Task Heterogeneous Training [79.78201886156513]
We present a model that can perform multiple vision tasks and can be adapted to other downstream tasks efficiently.
Our approach achieves comparable results to single-task state-of-the-art models and demonstrates strong generalization on downstream tasks.
arXiv Detail & Related papers (2023-06-29T17:59:57Z)
- Preventing Catastrophic Forgetting in Continual Learning of New Natural Language Tasks [17.879087904904935]
Multi-Task Learning (MTL) is widely accepted in Natural Language Processing as a standard technique for learning multiple related tasks in one model.
As systems usually evolve over time, adding a new task to an existing MTL model usually requires retraining the model from scratch on all the tasks.
In this paper, we approach the problem of incrementally expanding MTL models' capability to solve new tasks over time by distilling the knowledge of an already trained model on n tasks into a new one for solving n+1 tasks.
arXiv Detail & Related papers (2023-02-22T00:18:25Z)
- Voting from Nearest Tasks: Meta-Vote Pruning of Pre-trained Models for Downstream Tasks [55.431048995662714]
We create a small model for a new task from the pruned models of similar tasks.
We show that a few fine-tuning steps on this model suffice to produce a promising pruned model for the new task.
We develop a simple but effective "Meta-Vote Pruning (MVP)" method that significantly reduces the pruning iterations for a new task.
arXiv Detail & Related papers (2023-01-27T06:49:47Z)
- Task-agnostic Continual Learning with Hybrid Probabilistic Models [75.01205414507243]
We propose HCL, a Hybrid generative-discriminative approach to Continual Learning for classification.
The flow is used to learn the data distribution, perform classification, identify task changes, and avoid forgetting.
We demonstrate the strong performance of HCL on a range of continual learning benchmarks such as split-MNIST, split-CIFAR, and SVHN-MNIST.
arXiv Detail & Related papers (2021-06-24T05:19:26Z)
- Adversarial Training of Variational Auto-encoders for Continual Zero-shot Learning [1.90365714903665]
We present a hybrid network that consists of a shared VAE module to hold information of all tasks and task-specific private VAE modules for each task.
The model's size grows with each task to prevent catastrophic forgetting of task-specific skills.
We show that our method is superior in class-sequential learning with ZSL (Zero-Shot Learning) and GZSL (Generalized Zero-Shot Learning).
arXiv Detail & Related papers (2021-02-07T11:21:24Z)
- Continual Learning Using Multi-view Task Conditional Neural Networks [6.27221711890162]
Conventional deep learning models have limited capacity in learning multiple tasks sequentially.
We propose Multi-view Task Conditional Neural Networks (Mv-TCNN), which do not require the reoccurring tasks to be known in advance.
arXiv Detail & Related papers (2020-05-08T01:03:30Z)
- iTAML: An Incremental Task-Agnostic Meta-learning Approach [123.10294801296926]
Humans can continuously learn new knowledge as their experience grows.
Previous learning in deep neural networks can quickly fade out when they are trained on a new task.
We introduce a novel meta-learning approach that seeks to maintain an equilibrium between all encountered tasks.
arXiv Detail & Related papers (2020-03-25T21:42:48Z)
- Lifelong Learning with Searchable Extension Units [21.17631355880764]
We propose a new lifelong learning framework named Searchable Extension Units (SEU).
It breaks down the need for a predefined original model and searches for specific extension units for different tasks.
Our approach can obtain a much more compact model without catastrophic forgetting.
arXiv Detail & Related papers (2020-03-19T03:45:51Z)
This list is automatically generated from the titles and abstracts of the papers on this site.