Ensembling Off-the-shelf Models for GAN Training
- URL: http://arxiv.org/abs/2112.09130v1
- Date: Thu, 16 Dec 2021 18:59:50 GMT
- Title: Ensembling Off-the-shelf Models for GAN Training
- Authors: Nupur Kumari, Richard Zhang, Eli Shechtman, Jun-Yan Zhu
- Abstract summary: We find that pretrained computer vision models can significantly improve performance when used in an ensemble of discriminators.
We propose an effective selection mechanism, by probing the linear separability between real and fake samples in pretrained model embeddings.
Our method can improve GAN training in both limited data and large-scale settings.
- Score: 55.34705213104182
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The advent of large-scale training has produced a cornucopia of powerful
visual recognition models. However, generative models, such as GANs, have
traditionally been trained from scratch in an unsupervised manner. Can the
collective "knowledge" from a large bank of pretrained vision models be
leveraged to improve GAN training? If so, with so many models to choose from,
which one(s) should be selected, and in what manner are they most effective? We
find that pretrained computer vision models can significantly improve
performance when used in an ensemble of discriminators. Notably, the particular
subset of selected models greatly affects performance. We propose an effective
selection mechanism, by probing the linear separability between real and fake
samples in pretrained model embeddings, choosing the most accurate model, and
progressively adding it to the discriminator ensemble. Interestingly, our
method can improve GAN training in both limited data and large-scale settings.
Given only 10k training samples, our FID on LSUN Cat matches the StyleGAN2
trained on 1.6M images. On the full dataset, our method improves FID by 1.5x to
2x on cat, church, and horse categories of LSUN.
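To make the selection mechanism above concrete, here is a minimal sketch of the linear-separability probe; the `candidate_models` dictionary, the `embed` helper, and the logistic-regression probe are illustrative assumptions rather than the authors' exact implementation.
```python
# Minimal sketch (not the authors' released code) of the probing idea described
# above: rank off-the-shelf feature extractors by how linearly separable real and
# generated samples are in their embedding spaces, then pick the best one to add
# to the discriminator ensemble. `candidate_models` and `embed` are hypothetical
# stand-ins for pretrained backbones (e.g. torchvision/CLIP models) and their
# feature-extraction calls.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split


def linear_probe_accuracy(real_feats: np.ndarray, fake_feats: np.ndarray) -> float:
    """Fit a linear probe separating real from fake embeddings and return its
    held-out accuracy (higher accuracy = embedding carries more useful signal)."""
    X = np.concatenate([real_feats, fake_feats])
    y = np.concatenate([np.ones(len(real_feats)), np.zeros(len(fake_feats))])
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)
    probe = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
    return probe.score(X_te, y_te)


def select_next_model(candidate_models, real_images, fake_images, embed):
    """Probe every candidate backbone and return the name of the most accurate one.
    Repeating this as training proceeds gives the progressive ensemble growth
    described in the abstract."""
    scores = {
        name: linear_probe_accuracy(embed(model, real_images),
                                    embed(model, fake_images))
        for name, model in candidate_models.items()
    }
    best = max(scores, key=scores.get)
    return best, scores
```
Per the abstract, the selected model is then used as an additional discriminator, and the probing step is repeated so further pretrained models can be added progressively as training goes on.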
Related papers
- Diffusion-Based Neural Network Weights Generation [80.89706112736353]
D2NWG is a diffusion-based neural network weights generation technique that efficiently produces high-performing weights for transfer learning.
Our method extends generative hyper-representation learning to recast the latent diffusion paradigm for neural network weights generation.
Our approach is scalable to large architectures such as large language models (LLMs), overcoming the limitations of current parameter generation techniques.
arXiv Detail & Related papers (2024-02-28T08:34:23Z)
- Secrets of RLHF in Large Language Models Part II: Reward Modeling [134.97964938009588]
We introduce a series of novel methods to mitigate the influence of incorrect and ambiguous preferences in the dataset.
We also introduce contrastive learning to enhance the ability of reward models to distinguish between chosen and rejected responses.
arXiv Detail & Related papers (2024-01-11T17:56:59Z)
- Battle of the Backbones: A Large-Scale Comparison of Pretrained Models across Computer Vision Tasks [139.3768582233067]
Battle of the Backbones (BoB) is a benchmarking tool for neural network-based computer vision systems.
We find that vision transformers (ViTs) and self-supervised learning (SSL) are increasingly popular.
In apples-to-apples comparisons on the same architectures and similarly sized pretraining datasets, we find that SSL backbones are highly competitive.
arXiv Detail & Related papers (2023-10-30T18:23:58Z)
- Masked Diffusion Models Are Fast Distribution Learners [32.485235866596064]
Diffusion models are commonly trained to learn all fine-grained visual information from scratch.
We show that it suffices to first pre-train the model to learn some primer distribution; the pre-trained model can then be fine-tuned efficiently for various generation tasks.
arXiv Detail & Related papers (2023-06-20T08:02:59Z)
- TRAK: Attributing Model Behavior at Scale [79.56020040993947]
We present TRAK (Tracing with the Randomly-projected After Kernel), a data attribution method that is both effective and computationally tractable for large-scale, differentiable models.
arXiv Detail & Related papers (2023-03-24T17:56:22Z)
- eP-ALM: Efficient Perceptual Augmentation of Language Models [70.47962271121389]
We propose to direct effort toward efficient adaptation of existing models by augmenting Language Models with perception.
Existing approaches for adapting pretrained models for vision-language tasks still rely on several key components that hinder their efficiency.
We show that by freezing more than 99% of total parameters, training only one linear projection layer, and prepending only one trainable token, our approach (dubbed eP-ALM) significantly outperforms other baselines on VQA and Captioning.
arXiv Detail & Related papers (2023-03-20T19:20:34Z)
- Improving the Generalization of Supervised Models [30.264601433216246]
In this paper, we propose a supervised learning setup that leverages the best of both worlds.
We show that these three improvements lead to a more favorable trade-off between the IN1K training task and 13 transfer tasks.
arXiv Detail & Related papers (2022-06-30T15:43:51Z)
- Effective training-time stacking for ensembling of deep neural networks [1.2667973028134798]
Snapshot ensembling collects ensemble members along a single training path.
Our method improves snapshot ensembling by selecting and weighting ensemble members along the training path.
It relies on training-time likelihoods and, unlike standard stacking methods, does not look at validation-sample errors.
arXiv Detail & Related papers (2022-06-27T17:52:53Z)
- Beyond Self-Supervision: A Simple Yet Effective Network Distillation Alternative to Improve Backbones [40.33419553042038]
We propose to improve existing baseline networks via knowledge distillation from off-the-shelf pre-trained big powerful models.
Our solution performs distillation by only driving the predictions of the student model to be consistent with those of the teacher model.
We empirically find that such a simple distillation setting is extremely effective; for example, the top-1 accuracy of MobileNetV3-large and ResNet50-D on the ImageNet-1k validation set can be significantly improved (a minimal sketch of such a prediction-matching loss appears after this list).
arXiv Detail & Related papers (2021-03-10T09:32:44Z)
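As a rough illustration of the prediction-matching distillation in the "Beyond Self-Supervision" entry above, the sketch below uses a standard softened-softmax KL objective; the temperature and loss form are common conventions assumed here, not necessarily that paper's exact formulation.
```python
# Minimal sketch of prediction-matching distillation (a standard soft-label KL
# formulation assumed for illustration; not necessarily the exact loss used in
# the "Beyond Self-Supervision" paper). The student is trained only to match the
# teacher's softened predictions, with no ground-truth labels required.
import torch
import torch.nn.functional as F


def distillation_loss(student_logits: torch.Tensor,
                      teacher_logits: torch.Tensor,
                      temperature: float = 2.0) -> torch.Tensor:
    """KL divergence between teacher and student class distributions,
    softened by a temperature and rescaled by temperature**2 as is customary."""
    t = temperature
    student_log_probs = F.log_softmax(student_logits / t, dim=1)
    teacher_probs = F.softmax(teacher_logits / t, dim=1)
    return F.kl_div(student_log_probs, teacher_probs, reduction="batchmean") * (t * t)


# Usage: the teacher is a frozen off-the-shelf model, the student is the backbone
# being improved (e.g. MobileNetV3-large or ResNet50-D).
# with torch.no_grad():
#     teacher_logits = teacher(images)
# loss = distillation_loss(student(images), teacher_logits)
```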