Related papers: Dynamically Grown Generative Adversarial Networks

Dynamically Grown Generative Adversarial Networks

URL: http://arxiv.org/abs/2106.08505v1
Date: Wed, 16 Jun 2021 01:25:51 GMT
Title: Dynamically Grown Generative Adversarial Networks
Authors: Lanlan Liu, Yuting Zhang, Jia Deng, Stefano Soatto
Abstract summary: We propose a method to dynamically grow a GAN during training, optimizing the network architecture and its parameters together with automation. The method embeds architecture search techniques as an interleaving step with gradient-based training to periodically seek the optimal architecture-growing strategy for the generator and discriminator.
Score: 111.43128389995341
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Recent work introduced progressive network growing as a promising way to ease the training for large GANs, but the model design and architecture-growing strategy still remain under-explored and needs manual design for different image data. In this paper, we propose a method to dynamically grow a GAN during training, optimizing the network architecture and its parameters together with automation. The method embeds architecture search techniques as an interleaving step with gradient-based training to periodically seek the optimal architecture-growing strategy for the generator and discriminator. It enjoys the benefits of both eased training because of progressive growing and improved performance because of broader architecture design space. Experimental results demonstrate new state-of-the-art of image generation. Observations in the search procedure also provide constructive insights into the GAN model design such as generator-discriminator balance and convolutional layer choices.

Related papers

Evolution Meets Diffusion: Efficient Neural Architecture Generation [1.8284471682448833]
Neural Architecture Search (NAS) has gained widespread attention for its transformative potential in deep learning model design. We propose Evolutionary Diffusion-based Neural Architecture Generation (EDNAG), a novel approach that achieves efficient and training-free architecture generation. EDNAG achieves state-of-the-art (SOTA) performance in architecture optimization, with an improvement of up to 10.45%. It eliminates the need for time-consuming training and boosts inference speed by an average of 50 times, showcasing its exceptional efficiency and effectiveness.
arXiv Detail & Related papers (2025-04-24T03:09:04Z)
Instruction-Guided Autoregressive Neural Network Parameter Generation [49.800239140036496]
We propose IGPG, an autoregressive framework that unifies parameter synthesis across diverse tasks and architectures. By autoregressively generating neural network weights' tokens, IGPG ensures inter-layer coherence and enables efficient adaptation across models and datasets. Experiments on multiple datasets demonstrate that IGPG consolidates diverse pretrained models into a single, flexible generative framework.
arXiv Detail & Related papers (2025-04-02T05:50:19Z)
Generalized Factor Neural Network Model for High-dimensional Regression [50.554377879576066]
We tackle the challenges of modeling high-dimensional data sets with latent low-dimensional structures hidden within complex, non-linear, and noisy relationships. Our approach enables a seamless integration of concepts from non-parametric regression, factor models, and neural networks for high-dimensional regression.
arXiv Detail & Related papers (2025-02-16T23:13:55Z)
From Noise to Nuance: Advances in Deep Generative Image Models [8.802499769896192]
Deep learning-based image generation has undergone a paradigm shift since 2021. Recent developments in Stable Diffusion, DALL-E, and consistency models have redefined the capabilities and performance boundaries of image synthesis. We investigate how enhanced multi-modal understanding and zero-shot generation capabilities are reshaping practical applications across industries.
arXiv Detail & Related papers (2024-12-12T02:09:04Z)
STAR: Synthesis of Tailored Architectures [61.080157488857516]
We propose a new approach for the synthesis of tailored architectures (STAR) Our approach combines a novel search space based on the theory of linear input-varying systems, supporting a hierarchical numerical encoding into architecture genomes. STAR genomes are automatically refined and recombined with gradient-free, evolutionary algorithms to optimize for multiple model quality and efficiency metrics. Using STAR, we optimize large populations of new architectures, leveraging diverse computational units and interconnection patterns, improving over highly-optimized Transformers and striped hybrid models on the frontier of quality, parameter size, and inference cache for autoregressive language modeling.
arXiv Detail & Related papers (2024-11-26T18:42:42Z)
Exploring the design space of deep-learning-based weather forecasting systems [56.129148006412855]
This paper systematically analyzes the impact of different design choices on deep-learning-based weather forecasting systems. We study fixed-grid architectures such as UNet, fully convolutional architectures, and transformer-based models. We propose a hybrid system that combines the strong performance of fixed-grid models with the flexibility of grid-invariant architectures.
arXiv Detail & Related papers (2024-10-09T22:25:50Z)
EM-DARTS: Hierarchical Differentiable Architecture Search for Eye Movement Recognition [54.99121380536659]
Eye movement biometrics have received increasing attention thanks to its high secure identification. Deep learning (DL) models have been recently successfully applied for eye movement recognition. DL architecture still is determined by human prior knowledge. We propose EM-DARTS, a hierarchical differentiable architecture search algorithm to automatically design the DL architecture for eye movement recognition.
arXiv Detail & Related papers (2024-09-22T13:11:08Z)
Self Expanding Convolutional Neural Networks [1.4330085996657045]
We present a novel method for dynamically expanding Convolutional Neural Networks (CNNs) during training. We employ a strategy where a single model is dynamically expanded, facilitating the extraction of checkpoints at various complexity levels.
arXiv Detail & Related papers (2024-01-11T06:22:40Z)
Designing Network Design Strategies Through Gradient Path Analysis [12.90962626557934]
This paper proposes a new network design strategy, i.e., to design the network architecture based on gradient path analysis. We propose the gradient path design strategies for the layer-level, the stage-level, and the network-level.
arXiv Detail & Related papers (2022-11-09T10:51:57Z)
Incremental Learning with Differentiable Architecture and Forgetting Search [3.6868861317674524]
We show that leveraging NAS for incremental learning results in strong performance gains for classification tasks. We evaluate our method on both RF signal and image classification tasks, and demonstrate we can achieve up to a 10% performance increase over state-of-the-art methods.
arXiv Detail & Related papers (2022-05-19T21:47:26Z)
RLFlow: Optimising Neural Network Subgraph Transformation with World Models [0.0]
We propose a model-based agent which learns to optimise the architecture of neural networks by performing a sequence of subgraph transformations to reduce model runtime. We show our approach can match the performance of state of the art on common convolutional networks and outperform those by up to 5% on transformer-style architectures.
arXiv Detail & Related papers (2022-05-03T11:52:54Z)
Neural Architecture Search for Speech Emotion Recognition [72.1966266171951]
We propose to apply neural architecture search (NAS) techniques to automatically configure the SER models. We show that NAS can improve SER performance (54.89% to 56.28%) while maintaining model parameter sizes.
arXiv Detail & Related papers (2022-03-31T10:16:10Z)
A Generic Approach for Enhancing GANs by Regularized Latent Optimization [79.00740660219256]
We introduce a generic framework called em generative-model inference that is capable of enhancing pre-trained GANs effectively and seamlessly. Our basic idea is to efficiently infer the optimal latent distribution for the given requirements using Wasserstein gradient flow techniques.
arXiv Detail & Related papers (2021-12-07T05:22:50Z)
Redefining Neural Architecture Search of Heterogeneous Multi-Network Models by Characterizing Variation Operators and Model Components [71.03032589756434]
We investigate the effect of different variation operators in a complex domain, that of multi-network heterogeneous neural models. We characterize both the variation operators, according to their effect on the complexity and performance of the model; and the models, relying on diverse metrics which estimate the quality of the different parts composing it.
arXiv Detail & Related papers (2021-06-16T17:12:26Z)
DeshuffleGAN: A Self-Supervised GAN to Improve Structure Learning [0.0]
We argue that one of the crucial points to improve the GAN performance is to be able to provide the model with a capability to learn the spatial structure in data. We introduce a deshuffling task that solves a puzzle of randomly shuffled image tiles, which in turn helps the DeshuffleGAN learn to increase its expressive capacity for spatial structure and realistic appearance.
arXiv Detail & Related papers (2020-06-15T19:06:07Z)

This list is automatically generated from the titles and abstracts of the papers in this site.