OvA-INN: Continual Learning with Invertible Neural Networks
- URL: http://arxiv.org/abs/2006.13772v1
- Date: Wed, 24 Jun 2020 14:40:05 GMT
- Title: OvA-INN: Continual Learning with Invertible Neural Networks
- Authors: G. Hocquet, O. Bichler, D. Querlioz
- Abstract summary: OvA-INN is able to learn one class at a time without storing any of the previous data.
We show that we can take advantage of pretrained models by stacking an Invertible Network on top of a feature extractor.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In the field of Continual Learning, the objective is to learn several tasks one after the other without access to the data from previous tasks.
Several solutions have been proposed to tackle this problem, but they usually assume that the user knows which task to perform at test time on a particular sample, or they rely on small samples of previous data; most of them suffer from a substantial drop in accuracy when updated with batches of only one class at a time.
In this article, we propose a new method, OvA-INN, which is able to learn one class at a time without storing any of the previous data.
To achieve this, we train, for each class, a specific Invertible Neural Network to extract the features relevant to computing the likelihood of that class.
At test time, we predict the class of a sample by identifying the network that outputs the highest likelihood.
With this method, we show that we can take advantage of pretrained models by stacking an Invertible Network on top of a feature extractor.
In this way, we outperform state-of-the-art approaches that rely on feature learning for Continual Learning on the MNIST and CIFAR-100 datasets.
In our experiments, we reach 72% accuracy on CIFAR-100 after training our model one class at a time.
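The abstract specifies the prediction rule but not the architecture. Below is a minimal, hypothetical PyTorch sketch (not the authors' code) of one-vs-all inference with per-class invertible networks stacked on a frozen feature extractor. The RealNVP-style coupling layers, feature dimension, and all names are illustrative assumptions; only the scoring rule, picking the class whose network assigns the highest likelihood under a standard-normal prior via the change-of-variables formula, is taken from the abstract.

```python
# Illustrative sketch only -- not the authors' implementation.
import math

import torch
import torch.nn as nn


class AffineCoupling(nn.Module):
    """RealNVP-style affine coupling layer: invertible, with a cheap log-det-Jacobian."""

    def __init__(self, dim: int, hidden: int = 128):
        super().__init__()
        self.half = dim // 2
        self.net = nn.Sequential(
            nn.Linear(self.half, hidden), nn.ReLU(),
            nn.Linear(hidden, 2 * (dim - self.half)),
        )

    def forward(self, x):
        x1, x2 = x[:, :self.half], x[:, self.half:]
        log_s, t = self.net(x1).chunk(2, dim=1)
        log_s = torch.tanh(log_s)              # bound the scales for numerical stability
        z2 = x2 * torch.exp(log_s) + t
        return torch.cat([x1, z2], dim=1), log_s.sum(dim=1)


class ClassINN(nn.Module):
    """One invertible network, trained only on the features of a single class."""

    def __init__(self, dim: int, n_blocks: int = 4):
        super().__init__()
        self.blocks = nn.ModuleList([AffineCoupling(dim) for _ in range(n_blocks)])

    def log_likelihood(self, feats):
        z, log_det = feats, feats.new_zeros(feats.size(0))
        for block in self.blocks:
            z, ld = block(z)
            z = z.flip(dims=[1])               # alternate which half gets transformed
            log_det = log_det + ld
        # Change of variables: log p(x) = log N(f(x); 0, I) + log |det J_f(x)|
        log_prior = -0.5 * (z.pow(2).sum(dim=1) + z.size(1) * math.log(2 * math.pi))
        return log_prior + log_det


@torch.no_grad()
def predict(feature_extractor, class_inns, x):
    """One-vs-All prediction: the class whose INN assigns the highest likelihood wins."""
    feats = feature_extractor(x)               # frozen pretrained backbone, e.g. a ResNet
    scores = torch.stack([inn.log_likelihood(feats) for inn in class_inns], dim=1)
    return scores.argmax(dim=1)
```

Under this setup, each ClassINN would be trained by minimizing the negative of log_likelihood on the features of its own class only, which is what allows adding classes one at a time without storing or replaying previous data.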
Related papers
- Learning Prompt with Distribution-Based Feature Replay for Few-Shot Class-Incremental Learning [56.29097276129473]
We propose a simple yet effective framework, named Learning Prompt with Distribution-based Feature Replay (LP-DiF)
To prevent the learnable prompt from forgetting old knowledge in the new session, we propose a pseudo-feature replay approach.
When progressing to a new session, pseudo-features are sampled from old-class distributions combined with training images of the current session to optimize the prompt.
arXiv Detail & Related papers (2024-01-03T07:59:17Z)
- Prior-Free Continual Learning with Unlabeled Data in the Wild [24.14279172551939]
We propose a Prior-Free Continual Learning (PFCL) method to incrementally update a trained model on new tasks.
PFCL learns new tasks without knowing the task identity or any previous data.
Our experiments show that our PFCL method significantly mitigates forgetting in all three learning scenarios.
arXiv Detail & Related papers (2023-10-16T13:59:56Z)
- Provable Multi-Task Representation Learning by Two-Layer ReLU Neural Networks [69.38572074372392]
We present the first results proving that feature learning occurs during training with a nonlinear model on multiple tasks.
Our key insight is that multi-task pretraining induces a pseudo-contrastive loss that favors representations that align points that typically have the same label across tasks.
arXiv Detail & Related papers (2023-07-13T16:39:08Z)
- ASPEST: Bridging the Gap Between Active Learning and Selective Prediction [56.001808843574395]
Selective prediction aims to learn a reliable model that abstains from making predictions when uncertain.
Active learning aims to lower the overall labeling effort, and hence human dependence, by querying the most informative examples.
In this work, we introduce a new learning paradigm, active selective prediction, which aims to query more informative samples from the shifted target domain.
arXiv Detail & Related papers (2023-04-07T23:51:07Z)
- A Simple Baseline that Questions the Use of Pretrained-Models in Continual Learning [30.023047201419825]
Some methods design continual learning mechanisms on top of pre-trained representations and allow only minimal updates, or even no updates, of the backbone model during continual learning.
We argue that the pretrained feature extractor itself can be strong enough to achieve a competitive or even better continual learning performance on Split-CIFAR100 and CoRe 50 benchmarks.
This baseline achieved 88.53% on 10-Split-CIFAR-100, surpassing most state-of-the-art continual learning methods that are all using the same pretrained transformer model.
arXiv Detail & Related papers (2022-10-10T04:19:53Z)
- A Multi-Head Model for Continual Learning via Out-of-Distribution Replay [16.189891444511755]
Many approaches have been proposed to deal with catastrophic forgetting (CF) in continual learning (CL)
This paper proposes an entirely different approach that builds a separate classifier (head) for each task (called a multi-head model) using a transformer network, called MORE.
arXiv Detail & Related papers (2022-08-20T19:17:12Z)
- Transformers Can Do Bayesian Inference [56.99390658880008]
We present Prior-Data Fitted Networks (PFNs)
PFNs leverage in-context learning in large-scale machine learning techniques to approximate a large set of posteriors.
We demonstrate that PFNs can near-perfectly mimic Gaussian processes and also enable efficient Bayesian inference for intractable problems.
arXiv Detail & Related papers (2021-12-20T13:07:39Z)
- Class-incremental Learning using a Sequence of Partial Implicitly Regularized Classifiers [0.0]
In class-incremental learning, the objective is to learn a number of classes sequentially without having access to the whole training data.
Our experiments on CIFAR100 dataset show that the proposed method improves the performance over SOTA by a large margin.
arXiv Detail & Related papers (2021-04-04T10:02:45Z)
- Deep Active Learning via Open Set Recognition [0.0]
In many applications, data is easy to acquire but expensive and time-consuming to label.
We formulate active learning as an open-set recognition problem.
Unlike current active learning methods, our algorithm can learn tasks without the need for task labels.
arXiv Detail & Related papers (2020-07-04T22:09:17Z)
- Omni-supervised Facial Expression Recognition via Distilled Data [120.11782405714234]
We propose omni-supervised learning to exploit reliable samples in a large amount of unlabeled data for network training.
Because the created dataset can be very large, we apply a dataset distillation strategy to compress it into several informative class-wise images.
We experimentally verify that the distilled dataset can significantly improve the ability of the learned FER model.
arXiv Detail & Related papers (2020-05-18T09:36:51Z)
- iTAML: An Incremental Task-Agnostic Meta-learning Approach [123.10294801296926]
Humans can continuously learn new knowledge as their experience grows.
In deep neural networks, by contrast, previously learned knowledge can quickly fade when the network is trained on a new task.
We introduce a novel meta-learning approach that seeks to maintain an equilibrium between all encountered tasks.
arXiv Detail & Related papers (2020-03-25T21:42:48Z)
This list is automatically generated from the titles and abstracts of the papers on this site.