Ensembling Off-the-shelf Models for GAN Training
- URL: http://arxiv.org/abs/2112.09130v1
- Date: Thu, 16 Dec 2021 18:59:50 GMT
- Title: Ensembling Off-the-shelf Models for GAN Training
- Authors: Nupur Kumari, Richard Zhang, Eli Shechtman, Jun-Yan Zhu
- Abstract summary: We find that pretrained computer vision models can significantly improve performance when used in an ensemble of discriminators.
We propose an effective selection mechanism, by probing the linear separability between real and fake samples in pretrained model embeddings.
Our method can improve GAN training in both limited data and large-scale settings.
- Score: 55.34705213104182
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The advent of large-scale training has produced a cornucopia of powerful
visual recognition models. However, generative models, such as GANs, have
traditionally been trained from scratch in an unsupervised manner. Can the
collective "knowledge" from a large bank of pretrained vision models be
leveraged to improve GAN training? If so, with so many models to choose from,
which one(s) should be selected, and in what manner are they most effective? We
find that pretrained computer vision models can significantly improve
performance when used in an ensemble of discriminators. Notably, the particular
subset of selected models greatly affects performance. We propose an effective
selection mechanism, by probing the linear separability between real and fake
samples in pretrained model embeddings, choosing the most accurate model, and
progressively adding it to the discriminator ensemble. Interestingly, our
method can improve GAN training in both limited data and large-scale settings.
Given only 10k training samples, our FID on LSUN Cat matches the StyleGAN2
trained on 1.6M images. On the full dataset, our method improves FID by 1.5x to
2x on cat, church, and horse categories of LSUN.
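To make the selection mechanism above concrete, here is a minimal sketch of the linear-separability probe; the `candidate_models` dictionary, the `embed` helper, and the logistic-regression probe are illustrative assumptions rather than the authors' exact implementation.
```python
# Minimal sketch (not the authors' released code) of the probing idea described
# above: rank off-the-shelf feature extractors by how linearly separable real and
# generated samples are in their embedding spaces, then pick the best one to add
# to the discriminator ensemble. `candidate_models` and `embed` are hypothetical
# stand-ins for pretrained backbones (e.g. torchvision/CLIP models) and their
# feature-extraction calls.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split


def linear_probe_accuracy(real_feats: np.ndarray, fake_feats: np.ndarray) -> float:
    """Fit a linear probe separating real from fake embeddings and return its
    held-out accuracy (higher accuracy = embedding carries more useful signal)."""
    X = np.concatenate([real_feats, fake_feats])
    y = np.concatenate([np.ones(len(real_feats)), np.zeros(len(fake_feats))])
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)
    probe = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
    return probe.score(X_te, y_te)


def select_next_model(candidate_models, real_images, fake_images, embed):
    """Probe every candidate backbone and return the name of the most accurate one.
    Repeating this as training proceeds gives the progressive ensemble growth
    described in the abstract."""
    scores = {
        name: linear_probe_accuracy(embed(model, real_images),
                                    embed(model, fake_images))
        for name, model in candidate_models.items()
    }
    best = max(scores, key=scores.get)
    return best, scores
```
Per the abstract, the selected model is then used as an additional discriminator, and the probing step is repeated so further pretrained models can be added progressively as training goes on.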
Related papers
- Diffusion-Based Neural Network Weights Generation [80.89706112736353]
D2NWG is a diffusion-based neural network weights generation technique that efficiently produces high-performing weights for transfer learning.
Our method extends generative hyper-representation learning to recast the latent diffusion paradigm for neural network weights generation.
Our approach is scalable to large architectures such as large language models (LLMs), overcoming the limitations of current parameter generation techniques.
arXiv Detail & Related papers (2024-02-28T08:34:23Z)
- Secrets of RLHF in Large Language Models Part II: Reward Modeling [134.97964938009588]
We introduce a series of novel methods to mitigate the influence of incorrect and ambiguous preferences in the dataset.
We also introduce contrastive learning to enhance the ability of reward models to distinguish between chosen and rejected responses.
arXiv Detail & Related papers (2024-01-11T17:56:59Z)
- Battle of the Backbones: A Large-Scale Comparison of Pretrained Models across Computer Vision Tasks [139.3768582233067]
Battle of the Backbones (BoB) is a benchmarking tool for neural network-based computer vision systems.
We find that vision transformers (ViTs) and self-supervised learning (SSL) are increasingly popular.
In apples-to-apples comparisons on the same architectures and similarly sized pretraining datasets, we find that SSL backbones are highly competitive.
arXiv Detail & Related papers (2023-10-30T18:23:58Z)
- Masked Diffusion Models Are Fast Distribution Learners [32.485235866596064]
Diffusion models are commonly trained to learn all fine-grained visual information from scratch.
We show that it suffices to first pre-train the model to learn some primer distribution; the pre-trained model can then be fine-tuned efficiently for various generation tasks.
arXiv Detail & Related papers (2023-06-20T08:02:59Z)
- TRAK: Attributing Model Behavior at Scale [79.56020040993947]
We present TRAK (Tracing with the Randomly-projected After Kernel), a data attribution method that is both effective and computationally tractable for large-scale, differentiable models.
arXiv Detail & Related papers (2023-03-24T17:56:22Z)
- eP-ALM: Efficient Perceptual Augmentation of Language Models [70.47962271121389]
We propose to direct effort toward efficient adaptation of existing models by augmenting Language Models with perception.
Existing approaches for adapting pretrained models for vision-language tasks still rely on several key components that hinder their efficiency.
We show that by freezing more than 99% of total parameters, training only one linear projection layer, and prepending only one trainable token, our approach (dubbed eP-ALM) significantly outperforms other baselines on VQA and Captioning.
arXiv Detail & Related papers (2023-03-20T19:20:34Z)
- Improving the Generalization of Supervised Models [30.264601433216246]
In this paper, we propose a supervised learning setup that leverages the best of both worlds.
We show that these three improvements lead to a more favorable trade-off between the IN1K training task and 13 transfer tasks.
arXiv Detail & Related papers (2022-06-30T15:43:51Z)
- Effective training-time stacking for ensembling of deep neural networks [1.2667973028134798]
Snapshot ensembling collects ensemble members along a single training path.
Our method improves snapshot ensembling by selecting and weighting ensemble members along the training path.
It relies on training-time likelihoods and, unlike standard stacking methods, does not look at validation-sample errors.
arXiv Detail & Related papers (2022-06-27T17:52:53Z)
- Beyond Self-Supervision: A Simple Yet Effective Network Distillation Alternative to Improve Backbones [40.33419553042038]
We propose to improve existing baseline networks via knowledge distillation from off-the-shelf pre-trained big powerful models.
Our solution performs distillation by only driving the predictions of the student model to be consistent with those of the teacher model.
We empirically find that such a simple distillation setting is extremely effective; for example, the top-1 accuracy of MobileNetV3-large and ResNet50-D on the ImageNet-1k validation set can be significantly improved (a minimal sketch of such a prediction-matching loss appears after this list).
arXiv Detail & Related papers (2021-03-10T09:32:44Z)
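As a rough illustration of the prediction-matching distillation in the "Beyond Self-Supervision" entry above, the sketch below uses a standard softened-softmax KL objective; the temperature and loss form are common conventions assumed here, not necessarily that paper's exact formulation.
```python
# Minimal sketch of prediction-matching distillation (a standard soft-label KL
# formulation assumed for illustration; not necessarily the exact loss used in
# the "Beyond Self-Supervision" paper). The student is trained only to match the
# teacher's softened predictions, with no ground-truth labels required.
import torch
import torch.nn.functional as F


def distillation_loss(student_logits: torch.Tensor,
                      teacher_logits: torch.Tensor,
                      temperature: float = 2.0) -> torch.Tensor:
    """KL divergence between teacher and student class distributions,
    softened by a temperature and rescaled by temperature**2 as is customary."""
    t = temperature
    student_log_probs = F.log_softmax(student_logits / t, dim=1)
    teacher_probs = F.softmax(teacher_logits / t, dim=1)
    return F.kl_div(student_log_probs, teacher_probs, reduction="batchmean") * (t * t)


# Usage: the teacher is a frozen off-the-shelf model, the student is the backbone
# being improved (e.g. MobileNetV3-large or ResNet50-D).
# with torch.no_grad():
#     teacher_logits = teacher(images)
# loss = distillation_loss(student(images), teacher_logits)
```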