Adapt & Align: Continual Learning with Generative Models Latent Space
Alignment
- URL: http://arxiv.org/abs/2312.13699v1
- Date: Thu, 21 Dec 2023 10:02:17 GMT
- Title: Adapt & Align: Continual Learning with Generative Models Latent Space
Alignment
- Authors: Kamil Deja, Bartosz Cywi\'nski, Jan Rybarczyk, Tomasz Trzci\'nski
- Abstract summary: We introduce Adapt & Align, a method for continual learning of neural networks by aligning latent representations in generative models.
Neural Networks suffer from abrupt loss in performance when retrained with additional data.
We propose a new method that mitigates those problems by employing generative models and splitting the process of their update into two parts.
- Score: 15.729732755625474
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this work, we introduce Adapt & Align, a method for continual learning of
neural networks by aligning latent representations in generative models. Neural
Networks suffer from abrupt loss in performance when retrained with additional
training data from different distributions. At the same time, training with
additional data without access to the previous examples rarely improves the
model's performance. In this work, we propose a new method that mitigates those
problems by employing generative models and splitting the process of their
update into two parts. In the first one, we train a local generative model
using only data from a new task. In the second phase, we consolidate latent
representations from the local model with a global one that encodes knowledge
of all past experiences. We introduce our approach with Variational
Auteoncoders and Generative Adversarial Networks. Moreover, we show how we can
use those generative models as a general method for continual knowledge
consolidation that can be used in downstream tasks such as classification.
Related papers
- BEND: Bagging Deep Learning Training Based on Efficient Neural Network Diffusion [56.9358325168226]
We propose a Bagging deep learning training algorithm based on Efficient Neural network Diffusion (BEND)
Our approach is simple but effective, first using multiple trained model weights and biases as inputs to train autoencoder and latent diffusion model.
Our proposed BEND algorithm can consistently outperform the mean and median accuracies of both the original trained model and the diffused model.
arXiv Detail & Related papers (2024-03-23T08:40:38Z) - Diffusion-based Neural Network Weights Generation [85.6725307453325]
We propose an efficient and adaptive transfer learning scheme through dataset-conditioned pretrained weights sampling.
Specifically, we use a latent diffusion model with a variational autoencoder that can reconstruct the neural network weights.
arXiv Detail & Related papers (2024-02-28T08:34:23Z) - Fantastic Gains and Where to Find Them: On the Existence and Prospect of
General Knowledge Transfer between Any Pretrained Model [74.62272538148245]
We show that for arbitrary pairings of pretrained models, one model extracts significant data context unavailable in the other.
We investigate if it is possible to transfer such "complementary" knowledge from one model to another without performance degradation.
arXiv Detail & Related papers (2023-10-26T17:59:46Z) - Learning to Jump: Thinning and Thickening Latent Counts for Generative
Modeling [69.60713300418467]
Learning to jump is a general recipe for generative modeling of various types of data.
We demonstrate when learning to jump is expected to perform comparably to learning to denoise, and when it is expected to perform better.
arXiv Detail & Related papers (2023-05-28T05:38:28Z) - Adversarial Learning Networks: Source-free Unsupervised Domain
Incremental Learning [0.0]
In a non-stationary environment, updating a DNN model requires parameter re-training or model fine-tuning.
We propose an unsupervised source-free method to update DNN classification models.
Unlike existing methods, our approach can update a DNN model incrementally for non-stationary source and target tasks without storing past training data.
arXiv Detail & Related papers (2023-01-28T02:16:13Z) - Cooperative data-driven modeling [44.99833362998488]
Data-driven modeling in mechanics is evolving rapidly based on recent machine learning advances.
New data and models created by different groups become available, opening possibilities for cooperative modeling.
Artificial neural networks suffer from catastrophic forgetting, i.e. they forget how to perform an old task when trained on a new one.
This hinders cooperation because adapting an existing model for a new task affects the performance on a previous task trained by someone else.
arXiv Detail & Related papers (2022-11-23T14:27:25Z) - Generalization Properties of Retrieval-based Models [50.35325326050263]
Retrieval-based machine learning methods have enjoyed success on a wide range of problems.
Despite growing literature showcasing the promise of these models, the theoretical underpinning for such models remains underexplored.
We present a formal treatment of retrieval-based models to characterize their generalization ability.
arXiv Detail & Related papers (2022-10-06T00:33:01Z) - Transfer Learning via Test-Time Neural Networks Aggregation [11.42582922543676]
It has been demonstrated that deep neural networks outperform traditional machine learning.
Deep networks lack generalisability, that is, they will not perform as good as in a new (testing) set drawn from a different distribution.
arXiv Detail & Related papers (2022-06-27T15:46:05Z) - GAN Cocktail: mixing GANs without dataset access [18.664733153082146]
We tackle the problem of model merging, given two constraints that often come up in the real world.
In the first stage, we transform the weights of all the models to the same parameter space by a technique we term model rooting.
In the second stage, we merge the rooted models by averaging their weights and fine-tuning them for each specific domain, using only data generated by the original trained models.
arXiv Detail & Related papers (2021-06-07T17:59:04Z) - Streaming Graph Neural Networks via Continual Learning [31.810308087441445]
Graph neural networks (GNNs) have achieved strong performance in various applications.
In this paper, we propose a streaming GNN model based on continual learning.
We show that our model can efficiently update model parameters and achieve comparable performance to model retraining.
arXiv Detail & Related papers (2020-09-23T06:52:30Z) - Pre-Trained Models for Heterogeneous Information Networks [57.78194356302626]
We propose a self-supervised pre-training and fine-tuning framework, PF-HIN, to capture the features of a heterogeneous information network.
PF-HIN consistently and significantly outperforms state-of-the-art alternatives on each of these tasks, on four datasets.
arXiv Detail & Related papers (2020-07-07T03:36:28Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.