Dynamically Grown Generative Adversarial Networks
- URL: http://arxiv.org/abs/2106.08505v1
- Date: Wed, 16 Jun 2021 01:25:51 GMT
- Title: Dynamically Grown Generative Adversarial Networks
- Authors: Lanlan Liu, Yuting Zhang, Jia Deng, Stefano Soatto
- Abstract summary: We propose a method to dynamically grow a GAN during training, optimizing the network architecture and its parameters together with automation.
The method embeds architecture search techniques as an interleaving step with gradient-based training to periodically seek the optimal architecture-growing strategy for the generator and discriminator.
- Score: 111.43128389995341
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Recent work introduced progressive network growing as a promising way to ease
the training for large GANs, but the model design and architecture-growing
strategy still remain under-explored and needs manual design for different
image data. In this paper, we propose a method to dynamically grow a GAN during
training, optimizing the network architecture and its parameters together with
automation. The method embeds architecture search techniques as an interleaving
step with gradient-based training to periodically seek the optimal
architecture-growing strategy for the generator and discriminator. It enjoys
the benefits of both eased training because of progressive growing and improved
performance because of broader architecture design space. Experimental results
demonstrate new state-of-the-art of image generation. Observations in the
search procedure also provide constructive insights into the GAN model design
such as generator-discriminator balance and convolutional layer choices.
Related papers
- Mechanistic Design and Scaling of Hybrid Architectures [114.3129802943915]
We identify and test new hybrid architectures constructed from a variety of computational primitives.
We experimentally validate the resulting architectures via an extensive compute-optimal and a new state-optimal scaling law analysis.
We find MAD synthetics to correlate with compute-optimal perplexity, enabling accurate evaluation of new architectures.
arXiv Detail & Related papers (2024-03-26T16:33:12Z) - Self Expanding Convolutional Neural Networks [1.4330085996657045]
We present a novel method for dynamically expanding Convolutional Neural Networks (CNNs) during training.
We employ a strategy where a single model is dynamically expanded, facilitating the extraction of checkpoints at various complexity levels.
arXiv Detail & Related papers (2024-01-11T06:22:40Z) - Designing Network Design Strategies Through Gradient Path Analysis [12.90962626557934]
This paper proposes a new network design strategy, i.e., to design the network architecture based on gradient path analysis.
We propose the gradient path design strategies for the layer-level, the stage-level, and the network-level.
arXiv Detail & Related papers (2022-11-09T10:51:57Z) - Incremental Learning with Differentiable Architecture and Forgetting
Search [3.6868861317674524]
We show that leveraging NAS for incremental learning results in strong performance gains for classification tasks.
We evaluate our method on both RF signal and image classification tasks, and demonstrate we can achieve up to a 10% performance increase over state-of-the-art methods.
arXiv Detail & Related papers (2022-05-19T21:47:26Z) - RLFlow: Optimising Neural Network Subgraph Transformation with World
Models [0.0]
We propose a model-based agent which learns to optimise the architecture of neural networks by performing a sequence of subgraph transformations to reduce model runtime.
We show our approach can match the performance of state of the art on common convolutional networks and outperform those by up to 5% on transformer-style architectures.
arXiv Detail & Related papers (2022-05-03T11:52:54Z) - Neural Architecture Search for Speech Emotion Recognition [72.1966266171951]
We propose to apply neural architecture search (NAS) techniques to automatically configure the SER models.
We show that NAS can improve SER performance (54.89% to 56.28%) while maintaining model parameter sizes.
arXiv Detail & Related papers (2022-03-31T10:16:10Z) - A Generic Approach for Enhancing GANs by Regularized Latent Optimization [79.00740660219256]
We introduce a generic framework called em generative-model inference that is capable of enhancing pre-trained GANs effectively and seamlessly.
Our basic idea is to efficiently infer the optimal latent distribution for the given requirements using Wasserstein gradient flow techniques.
arXiv Detail & Related papers (2021-12-07T05:22:50Z) - Redefining Neural Architecture Search of Heterogeneous Multi-Network
Models by Characterizing Variation Operators and Model Components [71.03032589756434]
We investigate the effect of different variation operators in a complex domain, that of multi-network heterogeneous neural models.
We characterize both the variation operators, according to their effect on the complexity and performance of the model; and the models, relying on diverse metrics which estimate the quality of the different parts composing it.
arXiv Detail & Related papers (2021-06-16T17:12:26Z) - DeshuffleGAN: A Self-Supervised GAN to Improve Structure Learning [0.0]
We argue that one of the crucial points to improve the GAN performance is to be able to provide the model with a capability to learn the spatial structure in data.
We introduce a deshuffling task that solves a puzzle of randomly shuffled image tiles, which in turn helps the DeshuffleGAN learn to increase its expressive capacity for spatial structure and realistic appearance.
arXiv Detail & Related papers (2020-06-15T19:06:07Z) - Stage-Wise Neural Architecture Search [65.03109178056937]
Modern convolutional networks such as ResNet and NASNet have achieved state-of-the-art results in many computer vision applications.
These networks consist of stages, which are sets of layers that operate on representations in the same resolution.
It has been demonstrated that increasing the number of layers in each stage improves the prediction ability of the network.
However, the resulting architecture becomes computationally expensive in terms of floating point operations, memory requirements and inference time.
arXiv Detail & Related papers (2020-04-23T14:16:39Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.