MobileStyleGAN: A Lightweight Convolutional Neural Network for
High-Fidelity Image Synthesis
- URL: http://arxiv.org/abs/2104.04767v1
- Date: Sat, 10 Apr 2021 13:46:49 GMT
- Title: MobileStyleGAN: A Lightweight Convolutional Neural Network for
High-Fidelity Image Synthesis
- Authors: Sergei Belousov
- Abstract summary: We focus on the performance optimization of style-based generative models.
We introduce MobileStyleGAN architecture, which has x3.5 fewer parameters and is x9.5 less computationally complex than StyleGAN2.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In recent years, the use of Generative Adversarial Networks (GANs) has become
very popular in generative image modeling. While style-based GAN architectures
yield state-of-the-art results in high-fidelity image synthesis,
computationally, they are highly complex. In our work, we focus on the
performance optimization of style-based generative models. We analyze the most
computationally hard parts of StyleGAN2, and propose changes in the generator
network to make it possible to deploy style-based generative networks in the
edge devices. We introduce MobileStyleGAN architecture, which has x3.5 fewer
parameters and is x9.5 less computationally complex than StyleGAN2, while
providing comparable quality.
Related papers
- Efficient generative adversarial networks using linear additive-attention Transformers [0.8287206589886879]
We present a novel GAN architecture based on a linear attention Transformer block named Ladaformer.
LadaGAN consistently outperforms existing convolutional and Transformer GANs on benchmark datasets at different resolutions.
LadaGAN shows competitive performance compared to state-of-the-art multi-step generative models.
arXiv Detail & Related papers (2024-01-17T21:08:41Z)
- RAVEN: Rethinking Adversarial Video Generation with Efficient Tri-plane Networks [93.18404922542702]
We present a novel video generative model designed to address long-term spatial and temporal dependencies.
Our approach incorporates a hybrid explicit-implicit tri-plane representation inspired by 3D-aware generative frameworks.
Our model synthesizes high-fidelity video clips at a resolution of $256\times256$ pixels, with durations extending to more than $5$ seconds at a frame rate of 30 fps.
arXiv Detail & Related papers (2024-01-11T16:48:44Z) - Stylized Projected GAN: A Novel Architecture for Fast and Realistic
Image Generation [8.796424252434875]
Projected GANs tackle the training difficulty of GANs by using transfer learning to project the generated and real samples into a pre-trained feature space.
Integrated modules are incorporated within the generator architecture of the Fast GAN to mitigate the problem of artifacts in the generated images.
arXiv Detail & Related papers (2023-07-30T17:05:22Z) - Learning Versatile 3D Shape Generation with Improved AR Models [91.87115744375052]
Auto-regressive (AR) models have achieved impressive results in 2D image generation by modeling joint distributions in the grid space.
We propose the Improved Auto-regressive Model (ImAM) for 3D shape generation, which applies discrete representation learning based on a latent vector instead of volumetric grids.
arXiv Detail & Related papers (2023-03-26T12:03:18Z) - StyleSwap: Style-Based Generator Empowers Robust Face Swapping [90.05775519962303]
We introduce a concise and effective framework named StyleSwap.
Our core idea is to leverage a style-based generator to empower high-fidelity and robust face swapping.
We identify that with only minimal modifications, a StyleGAN2 architecture can successfully handle the desired information from both source and target.
arXiv Detail & Related papers (2022-09-27T16:35:16Z) - StyleGAN-XL: Scaling StyleGAN to Large Diverse Datasets [35.11248114153497]
StyleGAN sets new standards for generative modeling regarding image quality and controllability.
Our final model, StyleGAN-XL, sets a new state-of-the-art on large-scale image synthesis and is the first to generate images at a resolution of $1024^2$ at such a dataset scale.
arXiv Detail & Related papers (2022-02-01T08:22:34Z) - InvGAN: Invertible GANs [88.58338626299837]
InvGAN, short for Invertible GAN, successfully embeds real images to the latent space of a high quality generative model.
This allows us to perform image inpainting, merging, and online data augmentation.
arXiv Detail & Related papers (2021-12-08T21:39:00Z) - Global Filter Networks for Image Classification [90.81352483076323]
We present a conceptually simple yet computationally efficient architecture that learns long-term spatial dependencies in the frequency domain with log-linear complexity.
Our results demonstrate that GFNet can be a very competitive alternative to transformer-style models and CNNs in efficiency, generalization ability and robustness.
arXiv Detail & Related papers (2021-07-01T17:58:16Z) - Dynamically Grown Generative Adversarial Networks [111.43128389995341]
We propose a method to dynamically grow a GAN during training, optimizing the network architecture and its parameters together with automation.
The method embeds architecture search techniques as an interleaving step with gradient-based training to periodically seek the optimal architecture-growing strategy for the generator and discriminator.
arXiv Detail & Related papers (2021-06-16T01:25:51Z) - Styleformer: Transformer based Generative Adversarial Networks with
Style Vector [5.025654873456756]
Styleformer is a style-based generator for GAN architecture, but a convolution-free transformer-based generator.
We show how a transformer can generate high-quality images, overcoming the difficulty that convolution operations have in capturing global features in an image.
arXiv Detail & Related papers (2021-06-13T15:30:39Z) - Improving Augmentation and Evaluation Schemes for Semantic Image
Synthesis [16.097324852253912]
We introduce a novel augmentation scheme designed specifically for generative adversarial networks (GANs).
We propose to randomly warp object shapes in the semantic label maps used as an input to the generator.
The local shape discrepancies between the warped and non-warped label maps and images enable the GAN to learn better the structural and geometric details of the scene.
arXiv Detail & Related papers (2020-11-25T10:55:26Z)
This list is automatically generated from the titles and abstracts of the papers in this site.