Premonition: Using Generative Models to Preempt Future Data Changes in
Continual Learning
- URL: http://arxiv.org/abs/2403.07356v1
- Date: Tue, 12 Mar 2024 06:29:54 GMT
- Title: Premonition: Using Generative Models to Preempt Future Data Changes in
Continual Learning
- Authors: Mark D. McDonnell, Dong Gong, Ehsan Abbasnejad and Anton van den
Hengel
- Abstract summary: Continual learning requires a model to adapt to ongoing changes in the data distribution.
We show that the combination of a large language model and an image generation model can similarly provide useful premonitions.
We find that the backbone of our pre-trained networks can learn representations useful for the downstream continual learning problem.
- Score: 63.850451635362425
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Continual learning requires a model to adapt to ongoing changes in the data
distribution, and often to the set of tasks to be performed. It is rare,
however, that the data and task changes are completely unpredictable. Given a
description of an overarching goal or data theme, which we call a realm, humans
can often guess what concepts are associated with it. We show here that the
combination of a large language model and an image generation model can
similarly provide useful premonitions as to how a continual learning challenge
might develop over time. We use the large language model to generate text
descriptions of semantically related classes that might potentially appear in
the data stream in future. These descriptions are then rendered using Stable
Diffusion to generate new labelled image samples. The resulting synthetic
dataset is employed for supervised pre-training, but is discarded prior to
commencing continual learning, along with the pre-training classification head.
We find that the backbone of our pre-trained networks can learn representations
useful for the downstream continual learning problem, thus becoming a valuable
input to any existing continual learning method. Although there are
complexities arising from the domain gap between real and synthetic images, we
show that pre-training models in this manner improves multiple Class Incremental
Learning (CIL) methods on fine-grained image classification benchmarks.
Supporting code can be found at https://github.com/cl-premonition/premonition.
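As a concrete illustration of the pipeline described in the abstract, here is a minimal sketch, not taken from the authors' repository: it assumes access to a large language model for proposing class names (represented by a placeholder function) and to Stable Diffusion via the Hugging Face diffusers library. The model identifier, backbone choice and hyperparameters are illustrative, not the authors' configuration.

```python
import torch
from diffusers import StableDiffusionPipeline
from torchvision.models import resnet18


def propose_class_names(realm: str, n: int) -> list[str]:
    """Placeholder for the LLM step: guess n class names related to the realm.

    In practice this would prompt a large language model, e.g. with
    "List n species likely to appear in a dataset about <realm>."
    """
    raise NotImplementedError("hypothetical LLM call")


def build_synthetic_dataset(realm: str, n_classes: int, images_per_class: int):
    """Render the LLM-proposed class names into labelled synthetic images."""
    class_names = propose_class_names(realm, n_classes)
    pipe = StableDiffusionPipeline.from_pretrained(
        "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
    ).to("cuda")
    images, labels = [], []
    for label, name in enumerate(class_names):
        for _ in range(images_per_class):
            images.append(pipe(f"a photo of a {name}").images[0])
            labels.append(label)
    return images, labels


def pretrain_backbone(loader, n_classes: int, epochs: int = 10):
    """Supervised pre-training on the synthetic data; the head is discarded afterwards."""
    model = resnet18(num_classes=n_classes).cuda()
    opt = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9)
    loss_fn = torch.nn.CrossEntropyLoss()
    for _ in range(epochs):
        for x, y in loader:  # loader yields (image_tensor, label) batches
            opt.zero_grad()
            loss_fn(model(x.cuda()), y.cuda()).backward()
            opt.step()
    model.fc = torch.nn.Identity()  # discard the pre-training classification head
    return model  # the backbone is handed to any existing continual learning method
```

The key point, per the abstract, is that the synthetic dataset and the classification head are used only for pre-training; only the backbone is carried into the downstream continual learning method.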
Related papers
- Make Prompts Adaptable: Bayesian Modeling for Vision-Language Prompt
Learning with Data-Dependent Prior [14.232144691524528]
Recent Vision-Language Pretrained models have become the backbone for many downstream tasks.
MLE training can lead the context vector to over-fit dominant image features in the training data.
This paper presents a Bayesian framework for prompt learning that can alleviate overfitting in few-shot learning applications.
arXiv Detail & Related papers (2024-01-09T10:15:59Z)
- DreamTeacher: Pretraining Image Backbones with Deep Generative Models [103.62397699392346]
We introduce a self-supervised feature representation learning framework that utilizes generative networks for pre-training downstream image backbones.
We investigate knowledge distillation of learned generative features onto target image backbones as an alternative to pretraining these backbones on large labeled datasets such as ImageNet (a generic feature-distillation sketch appears after this list).
We empirically find that our DreamTeacher significantly outperforms existing self-supervised representation learning approaches across the board.
arXiv Detail & Related papers (2023-07-14T17:17:17Z)
- Learning to Jump: Thinning and Thickening Latent Counts for Generative Modeling [69.60713300418467]
Learning to jump is a general recipe for generative modeling of various types of data.
We demonstrate when learning to jump is expected to perform comparably to learning to denoise, and when it is expected to perform better.
arXiv Detail & Related papers (2023-05-28T05:38:28Z)
- Generative Negative Text Replay for Continual Vision-Language Pretraining [95.2784858069843]
Vision-language pre-training has attracted increasing attention recently.
Massive data are usually collected in a streaming fashion.
We propose multi-modal knowledge distillation between images and texts to align the instance-wise predictions of old and new models.
arXiv Detail & Related papers (2022-10-31T13:42:21Z)
- DenseCLIP: Language-Guided Dense Prediction with Context-Aware Prompting [91.56988987393483]
We present a new framework for dense prediction by implicitly and explicitly leveraging the pre-trained knowledge from CLIP.
Specifically, we convert the original image-text matching problem in CLIP to a pixel-text matching problem and use the pixel-text score maps to guide the learning of dense prediction models (a minimal score-map sketch appears after this list).
Our method is model-agnostic, which can be applied to arbitrary dense prediction systems and various pre-trained visual backbones.
arXiv Detail & Related papers (2021-12-02T18:59:32Z)
- Growing Representation Learning [2.7231362265267127]
We develop an attention-based Gaussian Mixture, called GMAT, that learns interpretable representations of data with or without labels.
We show that our method is capable of learning new representations of data without labels or assumptions about the distribution of labels.
arXiv Detail & Related papers (2021-10-17T15:55:13Z)
- Adversarially-Trained Deep Nets Transfer Better: Illustration on Image Classification [53.735029033681435]
Transfer learning is a powerful methodology for adapting pre-trained deep neural networks on image recognition tasks to new domains.
In this work, we demonstrate that adversarially-trained models transfer better than non-adversarially-trained models.
arXiv Detail & Related papers (2020-07-11T22:48:42Z)
- A Survey on Self-supervised Pre-training for Sequential Transfer Learning in Neural Networks [1.1802674324027231]
Self-supervised pre-training for transfer learning is becoming an increasingly popular technique to improve state-of-the-art results using unlabeled data.
We provide an overview of the taxonomy for self-supervised learning and transfer learning, and highlight some prominent methods for designing pre-training tasks across different domains.
arXiv Detail & Related papers (2020-07-01T22:55:48Z)
- Conditional Mutual information-based Contrastive Loss for Financial Time Series Forecasting [12.0855096102517]
We present a representation learning framework for financial time series forecasting.
In this paper, we propose to first learn compact representations from time series data, then use the learned representations to train a simpler model for predicting time series movements.
arXiv Detail & Related papers (2020-02-18T15:24:33Z)
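As referenced in the DreamTeacher entry above, the following is a generic sketch of feature distillation from a frozen feature source onto a target image backbone. The frozen encoder below is only a stand-in for features taken from a pre-trained generative model; the dimensions, regressor and loss are illustrative assumptions, not DreamTeacher's actual design.

```python
import torch
import torch.nn as nn
from torchvision.models import resnet18


class FrozenGenerativeEncoder(nn.Module):
    """Stand-in for a frozen feature extractor derived from a generative model."""

    def __init__(self, feat_dim: int = 256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1),
            nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
            nn.Flatten(),
            nn.Linear(32, feat_dim),
        )
        for p in self.parameters():
            p.requires_grad_(False)  # the teacher is never updated

    def forward(self, x):
        return self.net(x)


teacher = FrozenGenerativeEncoder(feat_dim=256).eval()
student = resnet18()
student.fc = nn.Identity()            # expose the 512-d backbone features
proj = nn.Linear(512, 256)            # regressor aligning student and teacher dims

opt = torch.optim.AdamW(list(student.parameters()) + list(proj.parameters()), lr=1e-4)
mse = nn.MSELoss()


def distill_step(images: torch.Tensor) -> float:
    """One unlabeled distillation step: regress student features onto teacher features."""
    with torch.no_grad():
        target = teacher(images)
    loss = mse(proj(student(images)), target)
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()


# usage: loss = distill_step(torch.randn(8, 3, 224, 224))
```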
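And for the DenseCLIP entry, a minimal sketch of the pixel-text score-map idea: per-pixel image features are compared against text embeddings of class names to produce a (batch, classes, H, W) score map. The tensors below are random stand-ins; DenseCLIP itself derives them from CLIP's encoders with context-aware prompting.

```python
import torch
import torch.nn.functional as F


def pixel_text_score_maps(pixel_feats: torch.Tensor,
                          text_embeds: torch.Tensor,
                          temperature: float = 0.07) -> torch.Tensor:
    """pixel_feats: (B, C, H, W); text_embeds: (K, C) -> scores: (B, K, H, W)."""
    pixel_feats = F.normalize(pixel_feats, dim=1)   # L2-normalise the channel dim
    text_embeds = F.normalize(text_embeds, dim=1)
    scores = torch.einsum("bchw,kc->bkhw", pixel_feats, text_embeds)
    return scores / temperature


# usage with stand-in tensors (a real system would take these from CLIP):
scores = pixel_text_score_maps(torch.randn(2, 512, 32, 32), torch.randn(10, 512))
print(scores.shape)  # torch.Size([2, 10, 32, 32])
```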
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.