Logarithmic Continual Learning
- URL: http://arxiv.org/abs/2201.06534v1
- Date: Mon, 17 Jan 2022 17:29:16 GMT
- Title: Logarithmic Continual Learning
- Authors: Wojciech Masarczyk, Paweł Wawrzyński, Daniel Marczak, Kamil Deja, Tomasz Trzciński
- Abstract summary: We introduce a neural network architecture that logarithmically reduces the number of self-rehearsal steps in the generative rehearsal of continually learned models.
In continual learning (CL), training samples come in subsequent tasks, and the trained model can access only a single task at a time.
- Score: 11.367079056418957
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We introduce a neural network architecture that logarithmically reduces the
number of self-rehearsal steps in the generative rehearsal of continually
learned models. In continual learning (CL), training samples come in subsequent
tasks, and the trained model can access only a single task at a time. To replay
previous samples, contemporary CL methods bootstrap generative models and train
them recursively with a combination of current and regenerated past data. This
recurrence leads to superfluous computations as the same past samples are
regenerated after each task, and the reconstruction quality successively
degrades. In this work, we address these limitations and propose a new
generative rehearsal architecture that requires at most a logarithmic number of
retrainings for each sample. Our approach allocates past data across a set of
generative models such that most of them do not require retraining after a
task. The experimental evaluation of our logarithmic continual learning
approach shows the superiority of our method over state-of-the-art generative
rehearsal methods.
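To make the logarithmic bound concrete, the sketch below simulates a binary-counter-style allocation of tasks to a set of generators: the generator at level i holds 2^i past tasks and is retrained only when a "carry" of merged tasks reaches it. This is a hedged illustration of the allocation principle under that assumption, not the paper's exact architecture.

```python
import math

def simulate(num_tasks):
    levels = {}                                        # level -> set of task ids held by that generator
    retrain_count = {t: 0 for t in range(num_tasks)}   # how many times each task's data is re-learned

    for t in range(num_tasks):
        carry, level = {t}, 0                          # a new task enters at level 0
        while level in levels:                         # binary-counter carry: merge occupied levels upward
            carry |= levels.pop(level)
            level += 1
        for task in carry:                             # every task in the merged set is re-learned once
            retrain_count[task] += 1
        levels[level] = carry                          # the level-`level` generator now holds the merged tasks

    return retrain_count

counts = simulate(100)
bound = math.floor(math.log2(100)) + 1                 # a task can climb at most log2(T) levels
print(max(counts.values()), "retrainings at most, bound:", bound)
assert max(counts.values()) <= bound
```

Under this scheme a task climbs at most floor(log2 T) levels, so it is re-learned at most floor(log2 T) + 1 times, which is where a logarithmic retraining bound comes from.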
Related papers
- Joint Diffusion models in Continual Learning [4.013156524547073]
We introduce JDCL - a new method for continual learning with generative rehearsal based on joint diffusion models.
Generative-replay-based continual learning methods try to mitigate catastrophic forgetting by retraining a model on a combination of new data and rehearsal data sampled from a generative model.
We show that such a shared parametrization, combined with knowledge distillation, allows for stable adaptation to new tasks without catastrophic forgetting.
arXiv Detail & Related papers (2024-11-12T22:35:44Z)
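For context, the generative-replay recurrence described above (recursive retraining on new plus regenerated data, which both JDCL and logarithmic CL aim to improve) can be sketched as follows; a toy diagonal Gaussian stands in for the actual generative model, so this is only an illustration of the loop, not JDCL's joint diffusion model.

```python
import numpy as np

rng = np.random.default_rng(0)

class ToyGenerator:
    """A diagonal Gaussian standing in for the generative model."""
    def fit(self, x):
        self.mu, self.sigma = x.mean(axis=0), x.std(axis=0) + 1e-6
        return self
    def sample(self, n):
        return rng.normal(self.mu, self.sigma, size=(n, self.mu.size))

generator = None
for task_id in range(5):
    new_data = rng.normal(loc=float(task_id), scale=1.0, size=(256, 8))  # current task only
    if generator is None:
        train_data = new_data
    else:
        replay = generator.sample(256)               # rehearsal data regenerated by the old model
        train_data = np.concatenate([new_data, replay])
    generator = ToyGenerator().fit(train_data)       # retrain on new + replayed data each task
```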
- One Step Diffusion via Shortcut Models [109.72495454280627]
We introduce shortcut models, a family of generative models that use a single network and training phase to produce high-quality samples.
Shortcut models condition the network on the current noise level and also on the desired step size, allowing the model to skip ahead in the generation process.
Compared to distillation, shortcut models reduce complexity to a single network and training phase and additionally allow varying step budgets at inference time.
arXiv Detail & Related papers (2024-10-16T13:34:40Z)
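A minimal sketch of the sampling loop suggested by the summary above: the network is conditioned on the current sample, the time t, and the desired step size d, so the same model can take one large step or several small ones. The placeholder dummy_net and the update rule x + d * net(x, t, d) are assumptions based on this flow-matching-style description, not code from the paper.

```python
import numpy as np

def dummy_net(x, t, d):
    # placeholder for a trained shortcut network s_theta(x, t, d)
    return -x

def sample(net, x_init, num_steps):
    x, t = x_init, 0.0
    d = 1.0 / num_steps              # the desired step size is an explicit network input
    for _ in range(num_steps):
        x = x + d * net(x, t, d)     # skip ahead by d in a single forward pass
        t += d
    return x

noise = np.random.default_rng(0).normal(size=(4, 2))
print(sample(dummy_net, noise, num_steps=1))   # one-step generation
print(sample(dummy_net, noise, num_steps=8))   # more steps if a larger budget is available
```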
- Rethinking Classifier Re-Training in Long-Tailed Recognition: A Simple Logits Retargeting Approach [102.0769560460338]
We develop a simple logits retargeting approach (LORT) that does not require prior knowledge of the number of samples per class.
Our method achieves state-of-the-art performance on various imbalanced datasets, including CIFAR100-LT, ImageNet-LT, and iNaturalist 2018.
arXiv Detail & Related papers (2024-03-01T03:27:08Z)
- Continual Learning with Strong Experience Replay [32.154995019080594]
We propose a CL method with Strong Experience Replay (SER).
Besides distilling past experience from the memory buffer, SER also mimics future experiences on the current training data.
Experimental results on multiple image classification datasets show that our SER method surpasses the state-of-the-art methods by a noticeable margin.
arXiv Detail & Related papers (2023-05-23T02:42:54Z)
- Solving Inverse Problems with Score-Based Generative Priors learned from Noisy Data [1.7969777786551424]
SURE-Score is an approach for learning score-based generative models using training samples corrupted by additive Gaussian noise.
We demonstrate the generality of SURE-Score by learning priors and applying posterior sampling to ill-posed inverse problems in two practical applications.
arXiv Detail & Related papers (2023-05-02T02:51:01Z)
- Reducing Training Sample Memorization in GANs by Training with Memorization Rejection [80.0916819303573]
We propose memorization rejection, a training scheme that rejects generated samples that are near-duplicates of training samples during training.
Our scheme is simple, generic and can be directly applied to any GAN architecture.
arXiv Detail & Related papers (2022-10-21T20:17:50Z)
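A hedged sketch of the rejection step described above: generated samples whose nearest-neighbour distance to the training set falls below a threshold are treated as memorized and dropped during training. The Euclidean metric and the threshold value are illustrative assumptions.

```python
import numpy as np

def reject_memorized(generated, train_data, threshold=0.5):
    # nearest-neighbour Euclidean distance from each generated sample to the training set
    dists = np.linalg.norm(generated[:, None, :] - train_data[None, :, :], axis=-1)
    nearest = dists.min(axis=1)
    return generated[nearest > threshold]          # keep only sufficiently novel samples

rng = np.random.default_rng(0)
train = rng.normal(size=(100, 16))
fake = np.concatenate([rng.normal(size=(30, 16)), train[:10] + 1e-3])  # 10 near-duplicates
print(reject_memorized(fake, train).shape)          # the near-duplicates are filtered out
```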
- Effective and Efficient Training for Sequential Recommendation using Recency Sampling [91.02268704681124]
We propose a novel Recency-based Sampling of Sequences training objective.
We show that models enhanced with our method can achieve performance exceeding or very close to the state-of-the-art BERT4Rec.
arXiv Detail & Related papers (2022-07-06T13:06:31Z)
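A minimal sketch of recency-biased target sampling as summarized above: when a training example is built from a user's interaction sequence, more recent positions are picked as the target with higher probability. The exponential weighting below is an illustrative assumption rather than the paper's exact distribution.

```python
import numpy as np

def sample_target_position(seq_len, alpha=0.8, rng=np.random.default_rng(0)):
    weights = alpha ** np.arange(seq_len - 1, -1, -1)   # most recent position gets weight alpha**0 = 1
    probs = weights / weights.sum()
    return rng.choice(seq_len, p=probs)

positions = [sample_target_position(20) for _ in range(10)]
print(positions)   # skewed toward indices near 19, i.e. the most recent interactions
```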
- Few-shot Transfer Learning for Holographic Image Reconstruction using a Recurrent Neural Network [0.30586855806896046]
We present a few-shot transfer learning method that helps a holographic image reconstruction deep neural network rapidly generalize to new types of samples using small datasets.
We validated the effectiveness of this approach by successfully generalizing to new types of samples using small holographic datasets for training.
arXiv Detail & Related papers (2022-01-27T05:51:36Z)
- Improving Non-autoregressive Generation with Mixup Training [51.61038444990301]
We present a non-autoregressive generation model based on pre-trained transformer models.
We propose a simple and effective iterative training method called MIx Source and pseudo Target.
Our experiments on three generation benchmarks, including question generation, summarization, and paraphrase generation, show that the proposed framework achieves new state-of-the-art results.
arXiv Detail & Related papers (2021-10-21T13:04:21Z)
- Automatic Recall Machines: Internal Replay, Continual Learning and the Brain [104.38824285741248]
Replay in neural networks involves training on sequential data with memorized samples, which counteracts forgetting of previous behavior caused by non-stationarity.
We present a method where these auxiliary samples are generated on the fly, given only the model that is being trained for the assessed objective.
Instead, the implicit memory of learned samples within the assessed model itself is exploited.
arXiv Detail & Related papers (2020-06-22T15:07:06Z)
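A hedged sketch of on-the-fly internal replay in the spirit of the entry above: auxiliary inputs are synthesized from the model being trained itself (no stored data or separate generator), here by gradient ascent toward high confidence on past classes, and the model's own predictions on them serve as soft replay targets. This is a generic model-inversion variant; the paper's actual recall objective may differ.

```python
import torch
import torch.nn.functional as F

# tiny stand-in classifier; in practice this is the continually trained model
model = torch.nn.Sequential(torch.nn.Linear(16, 32), torch.nn.ReLU(), torch.nn.Linear(32, 4))

def recall_batch(model, past_classes, n=8, steps=50, lr=0.1):
    """Synthesize inputs that the current model associates with past classes."""
    x = torch.randn(n, 16, requires_grad=True)
    targets = torch.tensor(past_classes)[torch.randint(len(past_classes), (n,))]
    opt = torch.optim.Adam([x], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        F.cross_entropy(model(x), targets).backward()   # push inputs toward past-class responses
        opt.step()
    with torch.no_grad():
        soft_targets = F.softmax(model(x), dim=-1)      # current predictions become replay targets
    return x.detach(), soft_targets

x_replay, y_soft = recall_batch(model, past_classes=[0, 1])
print(x_replay.shape, y_soft.shape)   # these would be mixed into the next training batch
```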
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information it provides and is not responsible for any consequences arising from its use.