StyleGAN-XL: Scaling StyleGAN to Large Diverse Datasets
- URL: http://arxiv.org/abs/2202.00273v1
- Date: Tue, 1 Feb 2022 08:22:34 GMT
- Title: StyleGAN-XL: Scaling StyleGAN to Large Diverse Datasets
- Authors: Axel Sauer, Katja Schwarz, Andreas Geiger
- Abstract summary: StyleGAN sets new standards for generative modeling regarding image quality and controllability.
Our final model, StyleGAN-XL, sets a new state-of-the-art on large-scale image synthesis and is the first to generate images at a resolution of $1024^2$ at such a dataset scale.
- Score: 35.11248114153497
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Computer graphics has experienced a recent surge of data-centric approaches
for photorealistic and controllable content creation. StyleGAN in particular
sets new standards for generative modeling regarding image quality and
controllability. However, StyleGAN's performance severely degrades on large
unstructured datasets such as ImageNet. StyleGAN was designed for
controllability; hence, prior works suspect its restrictive design to be
unsuitable for diverse datasets. In contrast, we find the main limiting factor
to be the current training strategy. Following the recently introduced
Projected GAN paradigm, we leverage powerful neural network priors and a
progressive growing strategy to successfully train the latest StyleGAN3
generator on ImageNet. Our final model, StyleGAN-XL, sets a new
state-of-the-art on large-scale image synthesis and is the first to generate
images at a resolution of $1024^2$ at such a dataset scale. We demonstrate that
this model can invert and edit images beyond the narrow domain of portraits or
specific object classes.
Related papers
- GeoWizard: Unleashing the Diffusion Priors for 3D Geometry Estimation from a Single Image [94.56927147492738]
We introduce GeoWizard, a new generative foundation model designed for estimating geometric attributes from single images.
We show that leveraging diffusion priors can markedly improve generalization, detail preservation, and efficiency in resource usage.
We propose a simple yet effective strategy to segregate the complex data distribution of various scenes into distinct sub-distributions.
arXiv Detail & Related papers (2024-03-18T17:50:41Z)
- Customize StyleGAN with One Hand Sketch [0.0]
We propose a framework to control StyleGAN imagery with a single user sketch.
We learn a conditional distribution in the latent space of a pre-trained StyleGAN model via energy-based learning.
Our model can generate multi-modal images semantically aligned with the input sketch.
arXiv Detail & Related papers (2023-10-29T09:32:33Z)
- Distance Weighted Trans Network for Image Completion [52.318730994423106]
We propose a new architecture that relies on Distance-based Weighted Transformer (DWT) to better understand the relationships between an image's components.
CNNs are used to augment the local texture information of coarse priors.
DWT blocks are used to recover certain coarse textures and coherent visual structures.
arXiv Detail & Related papers (2023-10-11T12:46:11Z)
- High-Resolution GAN Inversion for Degraded Images in Large Diverse Datasets [39.21692649763314]
In this paper, we present a novel GAN inversion framework that utilizes the powerful generative ability of StyleGAN-XL.
To ease the inversion challenge with StyleGAN-XL, Clustering & Regularize Inversion (CRI) is proposed.
We validate our CRI scheme on multiple restoration tasks (i.e., inpainting, colorization, and super-resolution) of complex natural images, and show preferable quantitative and qualitative results.
arXiv Detail & Related papers (2023-02-07T11:24:11Z)
- Federated Domain Generalization for Image Recognition via Cross-Client Style Transfer [60.70102634957392]
Domain generalization (DG) has been a hot topic in image recognition, with a goal to train a general model that can perform well on unseen domains.
In this paper, we propose a novel domain generalization method for image recognition through cross-client style transfer (CCST) without exchanging data samples.
Our method outperforms recent SOTA DG methods on two DG benchmarks (PACS, OfficeHome) and a large-scale medical image dataset (Camelyon17) in the FL setting.
arXiv Detail & Related papers (2022-10-03T13:15:55Z)
- Self-Distilled StyleGAN: Towards Generation from Internet Photos [47.28014076401117]
We show how StyleGAN can be adapted to work on raw uncurated images collected from the Internet.
We propose a StyleGAN-based self-distillation approach, which consists of two main components.
The presented technique enables the generation of high-quality images, while minimizing the loss in diversity of the data.
arXiv Detail & Related papers (2022-02-24T17:16:47Z)
- InvGAN: Invertible GANs [88.58338626299837]
InvGAN, short for Invertible GAN, successfully embeds real images into the latent space of a high-quality generative model.
This allows us to perform image inpainting, merging, and online data augmentation.
arXiv Detail & Related papers (2021-12-08T21:39:00Z)
- MobileStyleGAN: A Lightweight Convolutional Neural Network for High-Fidelity Image Synthesis [0.0]
We focus on the performance optimization of style-based generative models.
We introduce the MobileStyleGAN architecture, which has 3.5x fewer parameters and is 9.5x less computationally complex than StyleGAN2.
arXiv Detail & Related papers (2021-04-10T13:46:49Z)
This list is automatically generated from the titles and abstracts of the papers in this site.