Piggyback GAN: Efficient Lifelong Learning for Image Conditioned
Generation
- URL: http://arxiv.org/abs/2104.11939v1
- Date: Sat, 24 Apr 2021 12:09:52 GMT
- Title: Piggyback GAN: Efficient Lifelong Learning for Image Conditioned
Generation
- Authors: Mengyao Zhai, Lei Chen, Jiawei He, Megha Nawhal, Frederick Tung, Greg
Mori
- Abstract summary: We propose a parameter efficient framework, Piggyback GAN, which learns the current task by building a set of convolutional and deconvolutional filters.
For the current task, our model achieves high generation quality on par with a standalone model at a lower number of parameters.
We validate Piggyback GAN on various image-conditioned generation tasks across different domains.
- Score: 48.31909975623379
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Humans accumulate knowledge in a lifelong fashion. Modern deep neural
networks, on the other hand, are susceptible to catastrophic forgetting: when
adapted to perform new tasks, they often fail to preserve their performance on
previously learned tasks. Given a sequence of tasks, a naive approach
addressing catastrophic forgetting is to train a separate standalone model for
each task, which scales the total number of parameters drastically without
efficiently utilizing previous models. In contrast, we propose a parameter
efficient framework, Piggyback GAN, which learns the current task by building a
set of convolutional and deconvolutional filters that are factorized into
filters of the models trained on previous tasks. For the current task, our
model achieves high generation quality on par with a standalone model at a
lower number of parameters. For previous tasks, our model can also preserve
generation quality since the filters for previous tasks are not altered. We
validate Piggyback GAN on various image-conditioned generation tasks across
different domains, and provide qualitative and quantitative results to show
that the proposed approach can address catastrophic forgetting effectively and
efficiently.
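As a rough sketch of the filter-factorization idea described in the abstract (the class name, shapes, and initialization below are illustrative assumptions, not the paper's released code), a convolutional layer for the current task can derive most of its filters as learned linear combinations of a frozen bank of filters from earlier tasks, adding only a small number of unconstrained new filters:

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class PiggybackConv2d(nn.Module):
        """Conv layer whose current-task filters are factorized into a frozen
        bank of filters from previous tasks plus a few freely trained filters."""

        def __init__(self, filter_bank, n_new_filters, n_derived_filters,
                     stride=1, padding=1):
            super().__init__()
            # filter_bank: (n_bank, in_ch, k, k), learned on earlier tasks, kept frozen.
            self.register_buffer("filter_bank", filter_bank)
            n_bank, in_ch, k, _ = filter_bank.shape
            # Learned weights that mix bank filters into "derived" filters for this task.
            self.combine = nn.Parameter(torch.randn(n_derived_filters, n_bank) * 0.01)
            # A small set of unconstrained filters trained from scratch for this task.
            self.new_filters = nn.Parameter(torch.randn(n_new_filters, in_ch, k, k) * 0.01)
            self.stride, self.padding = stride, padding

        def forward(self, x):
            bank_flat = self.filter_bank.flatten(1)                 # (n_bank, in_ch*k*k)
            derived = (self.combine @ bank_flat).view(-1, *self.filter_bank.shape[1:])
            weight = torch.cat([derived, self.new_filters], dim=0)
            return F.conv2d(x, weight, stride=self.stride, padding=self.padding)

Only the combination weights and the few new filters are trained per task, so the parameter count grows slowly; because the bank itself is never modified, the filters of earlier tasks, and hence their generation quality, are preserved, and the newly trained unconstrained filters can be appended to the bank before the next task arrives.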
Related papers
- One-Shot Pruning for Fast-adapting Pre-trained Models on Devices [28.696989086706186]
Large-scale pre-trained models have been remarkably successful in resolving downstream tasks.
However, deploying these models on low-capability devices still requires an effective approach, such as model pruning.
We present a scalable one-shot pruning method that leverages pruned knowledge of similar tasks to extract a sub-network from the pre-trained model for a new task.
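As a toy illustration of extracting a task sub-network from a pre-trained model (magnitude pruning here is only a stand-in for the paper's one-shot, similarity-driven procedure, and the function names are hypothetical):

    import torch

    def magnitude_mask(weight: torch.Tensor, keep_ratio: float) -> torch.Tensor:
        """Binary mask that keeps the largest-magnitude entries of a weight tensor."""
        k = max(1, int(weight.numel() * keep_ratio))
        threshold = weight.abs().flatten().kthvalue(weight.numel() - k + 1).values
        return (weight.abs() >= threshold).float()

    def extract_subnetwork(model: torch.nn.Module, keep_ratio: float = 0.3):
        """Zero out pruned entries in every conv/linear weight, returning the masks."""
        masks = {}
        with torch.no_grad():
            for name, param in model.named_parameters():
                if param.dim() > 1:                # skip biases and norm parameters
                    masks[name] = magnitude_mask(param, keep_ratio)
                    param.mul_(masks[name])
        return masks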
arXiv Detail & Related papers (2023-07-10T06:44:47Z)
- Voting from Nearest Tasks: Meta-Vote Pruning of Pre-trained Models for Downstream Tasks [55.431048995662714]
We create a small model for a new task from the pruned models of similar tasks.
We show that a few fine-tuning steps on this model suffice to produce a promising pruned model for the new task.
We develop a simple but effective "Meta-Vote Pruning (MVP)" method that significantly reduces the pruning iterations for a new task.
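A minimal sketch of the voting intuition (illustrative only; MVP's actual mask construction and selection of nearest tasks are more involved): given binary pruning masks from a few similar tasks, keep the weights that a majority of those masks retain, then fine-tune briefly.

    import torch

    def vote_mask(masks, min_votes: int) -> torch.Tensor:
        """Keep a weight if at least `min_votes` of the nearest tasks' masks kept it."""
        votes = torch.stack(masks, dim=0).sum(dim=0)
        return (votes >= min_votes).float()

    # Example: masks from the three nearest tasks, majority vote (>= 2 of 3).
    m1 = torch.tensor([1., 0., 1., 1.])
    m2 = torch.tensor([1., 1., 0., 1.])
    m3 = torch.tensor([0., 0., 1., 1.])
    init_mask = vote_mask([m1, m2, m3], min_votes=2)   # tensor([1., 0., 1., 1.])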
arXiv Detail & Related papers (2023-01-27T06:49:47Z)
- Parameter-Efficient Image-to-Video Transfer Learning [66.82811235484607]
Large pre-trained models have recently emerged with promising performance on various downstream tasks of interest.
Due to the ever-growing model size, the standard full fine-tuning based task adaptation strategy becomes costly in terms of model training and storage.
We propose a new Spatio-Temporal Adapter (ST-Adapter) for parameter-efficient fine-tuning per video task.
arXiv Detail & Related papers (2022-06-27T18:02:29Z)
- muNet: Evolving Pretrained Deep Neural Networks into Scalable Auto-tuning Multitask Systems [4.675744559395732]
Most uses of machine learning today involve training a model from scratch for a particular task, or starting with a model pretrained on a related task and then fine-tuning on a downstream task.
We propose a method that uses the layers of a pretrained deep neural network as building blocks to construct an ML system that can jointly solve an arbitrary number of tasks.
The resulting system can leverage cross-task knowledge transfer while being immune to common drawbacks of multitask approaches such as catastrophic forgetting, gradient interference, and negative transfer.
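A minimal sketch of the building-block idea under illustrative assumptions (the layer pool, sizes, and freezing policy below are not taken from muNet, which evolves such a system with mutation operators):

    import torch.nn as nn

    # A pool of pretrained layers reused as frozen building blocks across tasks.
    shared_layers = nn.ModuleDict({"enc1": nn.Linear(64, 64), "enc2": nn.Linear(64, 64)})
    for p in shared_layers.parameters():
        p.requires_grad_(False)        # reused blocks are not overwritten by new tasks

    def build_task_model(block_names, head_dim):
        """Assemble a per-task model from shared blocks plus a fresh task head."""
        blocks = [shared_layers[n] for n in block_names]
        return nn.Sequential(*blocks, nn.ReLU(), nn.Linear(64, head_dim))

    model_task_a = build_task_model(["enc1"], head_dim=10)
    model_task_b = build_task_model(["enc1", "enc2"], head_dim=5)   # reuses enc1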
arXiv Detail & Related papers (2022-05-22T21:54:33Z)
- Task Adaptive Parameter Sharing for Multi-Task Learning [114.80350786535952]
Task Adaptive Parameter Sharing (TAPS) is a method for tuning a base model to a new task by adaptively modifying a small, task-specific subset of layers.
Compared to other methods, TAPS retains high accuracy on downstream tasks while introducing few task-specific parameters.
We evaluate our method on a suite of fine-tuning tasks and architectures (ResNet, DenseNet, ViT) and show that it achieves state-of-the-art performance while being simple to implement.
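One way to realize this in code, as a hedged sketch (the gate parameterization and penalty below are illustrative, not TAPS's exact formulation): each layer keeps its frozen base weight and learns a task-specific delta that is applied only if a learned gate turns the layer on, with a sparsity penalty keeping the number of modified layers small.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class TaskAdaptiveLinear(nn.Module):
        """Frozen base weight plus a gated, task-specific delta."""

        def __init__(self, base: nn.Linear):
            super().__init__()
            self.base = base
            for p in self.base.parameters():
                p.requires_grad_(False)                   # shared weights stay fixed
            self.delta = nn.Parameter(torch.zeros_like(base.weight))
            self.gate_logit = nn.Parameter(torch.tensor(-2.0))  # layer starts mostly "off"

        def forward(self, x):
            gate = torch.sigmoid(self.gate_logit)
            return F.linear(x, self.base.weight + gate * self.delta, self.base.bias)

        def sparsity_penalty(self):
            return torch.sigmoid(self.gate_logit)         # sum over layers, add to the task loss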
arXiv Detail & Related papers (2022-03-30T23:16:07Z)
- Shared and Private VAEs with Generative Replay for Continual Learning [1.90365714903665]
Continual learning tries to learn new tasks without forgetting previously learned ones.
Most existing artificial neural network (ANN) models fail at this, whereas humans manage it by remembering previous tasks throughout their lives.
We show our hybrid model effectively avoids forgetting and achieves state-of-the-art results on visual continual learning benchmarks such as MNIST, Permuted MNIST(QMNIST), CIFAR100, and miniImageNet datasets.
arXiv Detail & Related papers (2021-05-17T06:18:36Z)
- Efficient Continual Adaptation for Generative Adversarial Networks [97.20244383723853]
We present a continual learning approach for generative adversarial networks (GANs).
Our approach is based on learning a set of global and task-specific parameters.
We show that the feature-map transformation based approach outperforms state-of-the-art continual GAN methods.
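A hedged sketch of the global-plus-task-specific split with a feature-map transformation (the scale-and-shift form and names here are an illustrative reading of the summary, not the paper's exact design): the convolution weights are shared across tasks and frozen, while each task learns only a small per-channel scale and shift applied to the resulting feature maps.

    import torch
    import torch.nn as nn

    class TaskFeatureTransform(nn.Module):
        """Shared, frozen conv followed by a per-task scale-and-shift of its feature maps."""

        def __init__(self, conv: nn.Conv2d, num_tasks: int):
            super().__init__()
            self.conv = conv
            for p in self.conv.parameters():
                p.requires_grad_(False)                   # global parameters stay fixed
            c = conv.out_channels
            self.gamma = nn.Parameter(torch.ones(num_tasks, c))   # task-specific scales
            self.beta = nn.Parameter(torch.zeros(num_tasks, c))   # task-specific shifts

        def forward(self, x, task_id: int):
            h = self.conv(x)
            g = self.gamma[task_id].view(1, -1, 1, 1)
            b = self.beta[task_id].view(1, -1, 1, 1)
            return g * h + b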
arXiv Detail & Related papers (2021-03-06T05:09:37Z)
- Conditional Channel Gated Networks for Task-Aware Continual Learning [44.894710899300435]
Convolutional Neural Networks experience catastrophic forgetting when optimized on a sequence of learning problems.
We introduce a novel framework to tackle this problem with conditional computation.
We validate our proposal on four continual learning datasets.
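A simplified sketch of task-conditioned channel gating (the paper's gates are data-dependent and trained with a sparsity objective; this static per-task version only conveys the idea, and all names are illustrative):

    import torch
    import torch.nn as nn

    class TaskChannelGate(nn.Module):
        """Per-task gates that decide which channels of a feature map fire."""

        def __init__(self, num_channels: int, num_tasks: int):
            super().__init__()
            self.gate_logits = nn.Parameter(torch.zeros(num_tasks, num_channels))

        def forward(self, feature_map, task_id: int):
            # feature_map: (batch, channels, H, W); near-zero gates switch channels off
            # for this task, leaving the remaining capacity free for later tasks.
            gates = torch.sigmoid(self.gate_logits[task_id]).view(1, -1, 1, 1)
            return feature_map * gates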
arXiv Detail & Related papers (2020-03-31T19:35:07Z)
- Parameter-Efficient Transfer from Sequential Behaviors for User Modeling and Recommendation [111.44445634272235]
In this paper, we develop a parameter efficient transfer learning architecture, termed as PeterRec.
PeterRec allows the pre-trained parameters to remain unaltered during fine-tuning by injecting a series of re-learned neural networks.
We perform extensive experimental ablation to show the effectiveness of the learned user representation in five downstream tasks.
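A minimal sketch of this freeze-and-inject pattern (a generic residual "patch" module under assumed names and sizes, not PeterRec's actual patch architecture):

    import torch
    import torch.nn as nn

    class PatchedBlock(nn.Module):
        """Frozen pre-trained block with a small injected network added residually;
        only the patch is trained, so the original parameters remain unaltered."""

        def __init__(self, block: nn.Module, dim: int, bottleneck: int = 16):
            super().__init__()
            self.block = block
            for p in self.block.parameters():
                p.requires_grad_(False)
            self.patch = nn.Sequential(
                nn.Linear(dim, bottleneck), nn.ReLU(), nn.Linear(bottleneck, dim)
            )

        def forward(self, x):
            h = self.block(x)
            return h + self.patch(h)

    # During fine-tuning, only patch parameters reach the optimizer, e.g.:
    # opt = torch.optim.Adam(p for p in model.parameters() if p.requires_grad)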
arXiv Detail & Related papers (2020-01-13T14:09:54Z)