Frugal Incremental Generative Modeling using Variational Autoencoders
- URL: http://arxiv.org/abs/2505.22408v1
- Date: Wed, 28 May 2025 14:37:57 GMT
- Title: Frugal Incremental Generative Modeling using Variational Autoencoders
- Authors: Victor Enescu, Hichem Sahbi
- Abstract summary: We develop a novel replay-free incremental learning model based on Variational Autoencoders (VAEs). The proposed method considers two variants of these VAEs: static and dynamic, with no (or at most a controlled) growth in the number of parameters.
- Score: 4.7881638074901955
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Continual or incremental learning holds tremendous potential in deep learning, with various challenges including catastrophic forgetting. The advent of powerful foundation and generative models has propelled this paradigm even further, making it one of the most viable solutions for training these models. However, one persisting issue lies in the increasing volume of data, particularly with replay-based methods. This growth poses a scalability challenge, since continuously expanding data becomes increasingly demanding as the number of tasks grows. In this paper, we attenuate this issue by devising a novel replay-free incremental learning model based on Variational Autoencoders (VAEs). The main contributions of this work include (i) a novel incremental generative model built upon a well-designed multi-modal latent space, and (ii) an orthogonality criterion that mitigates catastrophic forgetting in the learned VAEs. The proposed method considers two variants of these VAEs: static and dynamic, with no (or at most a controlled) growth in the number of parameters. Extensive experiments show that our method is (at least) an order of magnitude more "memory-frugal" than closely related works while achieving state-of-the-art accuracy scores.
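To make the abstract's ingredients more concrete, below is a minimal sketch (in PyTorch) of a class-conditional, multi-modal latent prior combined with an orthogonality penalty between the current encoder and a frozen copy trained on earlier tasks. The architecture, the form of the penalty, and the weight 0.1 are illustrative assumptions, not the authors' implementation.

```python
# Hedged sketch of a replay-free, VAE-based incremental learner.
# Assumptions (not from the paper): a Gaussian-mixture latent prior with one
# learnable mean per class, and an orthogonality penalty that discourages the
# current encoder weights from overlapping with a frozen previous-task encoder.
import torch
import torch.nn as nn
import torch.nn.functional as F

class IncrementalVAE(nn.Module):
    def __init__(self, x_dim=784, z_dim=32, n_classes=10):
        super().__init__()
        self.enc = nn.Linear(x_dim, 2 * z_dim)   # outputs mean and log-variance
        self.dec = nn.Linear(z_dim, x_dim)
        self.prior_means = nn.Parameter(torch.zeros(n_classes, z_dim))  # multi-modal prior

    def forward(self, x, y):
        mu, logvar = self.enc(x).chunk(2, dim=-1)
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()   # reparameterization
        x_hat = self.dec(z)
        # KL to the class-conditional prior component N(prior_means[y], I)
        kl = 0.5 * ((mu - self.prior_means[y]) ** 2 + logvar.exp() - logvar - 1).sum(-1)
        rec = F.binary_cross_entropy_with_logits(x_hat, x, reduction="none").sum(-1)
        return (rec + kl).mean()

def orthogonality_penalty(current: nn.Linear, frozen_prev: nn.Linear) -> torch.Tensor:
    """Penalize overlap between current and previously learned encoder rows."""
    return (current.weight @ frozen_prev.weight.detach().t()).pow(2).mean()

# Usage sketch for one incremental task: minimize ELBO + lambda * orthogonality.
model = IncrementalVAE()
prev_enc = nn.Linear(784, 64)        # stand-in for the frozen previous-task encoder
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
x, y = torch.rand(16, 784), torch.randint(0, 10, (16,))
loss = model(x, y) + 0.1 * orthogonality_penalty(model.enc, prev_enc)
opt.zero_grad(); loss.backward(); opt.step()
```

The penalty pushes the rows of the new encoder weights toward directions orthogonal to the previously learned ones, which is one simple way to limit interference across tasks; the paper's actual criterion and its static/dynamic variants may differ.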
Related papers
- Parameter-Efficient Continual Fine-Tuning: A Survey [5.59258786465086]
We believe the next breakthrough in AI lies in enabling efficient adaptation to evolving environments. One alternative to efficiently adapt these large-scale models is known as Parameter-Efficient Fine-Tuning (PEFT).
arXiv Detail & Related papers (2025-04-18T17:51:51Z) - ToMoE: Converting Dense Large Language Models to Mixture-of-Experts through Dynamic Structural Pruning [24.8038863056542]
Large Language Models (LLMs) have demonstrated remarkable abilities in tackling a wide range of complex tasks. Their huge computational and memory costs raise significant challenges in deploying these models on resource-constrained devices. We introduce a different dynamic pruning method that pushes dense models to maintain a fixed number of active parameters.
arXiv Detail & Related papers (2025-01-25T20:01:42Z) - Towards Scalable and Deep Graph Neural Networks via Noise Masking [59.058558158296265]
Graph Neural Networks (GNNs) have achieved remarkable success in many graph mining tasks. However, scaling them to large graphs is challenging due to the high computational and storage costs. We present random walk with noise masking (RMask), a plug-and-play module compatible with existing model-simplification works.
arXiv Detail & Related papers (2024-12-19T07:48:14Z) - Data Augmentation with Variational Autoencoder for Imbalanced Dataset [1.2289361708127877]
Learning from an imbalanced distribution presents a major challenge in predictive modeling. We develop a novel approach for generating data, combining a VAE with a smoothed bootstrap, specifically designed to address the challenges of IR.
arXiv Detail & Related papers (2024-12-09T22:59:03Z) - Boosting Alignment for Post-Unlearning Text-to-Image Generative Models [55.82190434534429]
Large-scale generative models have shown impressive image-generation capabilities, propelled by massive data. This often inadvertently leads to the generation of harmful or inappropriate content and raises copyright concerns. We propose a framework that seeks an optimal model update at each unlearning iteration, ensuring monotonic improvement on both objectives.
arXiv Detail & Related papers (2024-12-09T21:36:10Z) - SMILE: Zero-Shot Sparse Mixture of Low-Rank Experts Construction From Pre-Trained Foundation Models [85.67096251281191]
We present an innovative approach to model fusion called zero-shot Sparse MIxture of Low-rank Experts (SMILE) construction.
SMILE allows for the upscaling of source models into an MoE model without extra data or further training.
We conduct extensive experiments across diverse scenarios, such as image classification and text generation tasks, using full fine-tuning and LoRA fine-tuning.
arXiv Detail & Related papers (2024-08-19T17:32:15Z) - Diffusion-Based Neural Network Weights Generation [80.89706112736353]
D2NWG is a diffusion-based neural network weights generation technique that efficiently produces high-performing weights for transfer learning.
Our method extends generative hyper-representation learning to recast the latent diffusion paradigm for neural network weights generation.
Our approach is scalable to large architectures such as large language models (LLMs), overcoming the limitations of current parameter generation techniques.
arXiv Detail & Related papers (2024-02-28T08:34:23Z) - A Multi-step Loss Function for Robust Learning of the Dynamics in Model-based Reinforcement Learning [10.940666275830052]
In model-based reinforcement learning, most algorithms rely on simulating trajectories from one-step models of the dynamics learned on data.
We tackle this issue by using a multi-step objective to train one-step models.
We find that this new loss is particularly useful when the data is noisy, which is often the case in real-life environments.
arXiv Detail & Related papers (2024-02-05T16:13:00Z) - FOSTER: Feature Boosting and Compression for Class-Incremental Learning [52.603520403933985]
Deep neural networks suffer from catastrophic forgetting when learning new categories.
We propose a novel two-stage learning paradigm FOSTER, empowering the model to learn new categories adaptively.
arXiv Detail & Related papers (2022-04-10T11:38:33Z) - Bayesian Active Learning for Discrete Latent Variable Models [19.852463786440122]
Active learning seeks to reduce the amount of data required to fit the parameters of a model.
Latent variable models play a vital role in neuroscience, psychology, and a variety of other engineering and scientific disciplines.
arXiv Detail & Related papers (2022-02-27T19:07:12Z) - Discrete Auto-regressive Variational Attention Models for Text Modeling [53.38382932162732]
Variational autoencoders (VAEs) have been widely applied for text modeling.
They are troubled by two challenges: information underrepresentation and posterior collapse.
We propose Discrete Auto-regressive Variational Attention Model (DAVAM) to address the challenges.
arXiv Detail & Related papers (2021-06-16T06:36:26Z)