GLEAN: Generative Latent Bank for Image Super-Resolution and Beyond
- URL: http://arxiv.org/abs/2207.14812v1
- Date: Fri, 29 Jul 2022 17:59:01 GMT
- Title: GLEAN: Generative Latent Bank for Image Super-Resolution and Beyond
- Authors: Kelvin C.K. Chan, Xiangyu Xu, Xintao Wang, Jinwei Gu, Chen Change Loy
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We show that pre-trained Generative Adversarial Networks (GANs) such as
StyleGAN and BigGAN can be used as a latent bank to improve the performance of
image super-resolution. While most existing perceptual-oriented approaches
attempt to generate realistic outputs through learning with adversarial loss,
our method, Generative LatEnt bANk (GLEAN), goes beyond existing practices by
directly leveraging rich and diverse priors encapsulated in a pre-trained GAN.
Unlike prevalent GAN inversion methods that require expensive
image-specific optimization at runtime, our approach only needs a single
forward pass for restoration. GLEAN can be easily incorporated in a simple
encoder-bank-decoder architecture with multi-resolution skip connections.
Employing priors from different generative models allows GLEAN to be applied to
diverse categories (e.g., human faces, cats, buildings, and cars). We further
present a lightweight version of GLEAN, named LightGLEAN, which retains only
the critical components in GLEAN. Notably, LightGLEAN consists of only 21% of
parameters and 35% of FLOPs while achieving comparable image quality. We extend
our method to different tasks including image colorization and blind image
restoration, and extensive experiments show that our proposed models perform
favorably in comparison to existing methods. Codes and models are available at
https://github.com/open-mmlab/mmediting.
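The encoder-bank-decoder data flow can be sketched in a few lines. The sketch below is illustrative only: the pre-trained StyleGAN latent bank is replaced by a frozen random stub, convolutions by fixed random channel projections, and all shapes and names are our assumptions, not the paper's implementation. It shows the key structure: the encoder produces multi-resolution features plus a latent code, the frozen bank maps the code to multi-resolution priors, and the decoder fuses the two through skip connections, continuing past the input resolution to produce the super-resolved output.

```python
import numpy as np

rng = np.random.default_rng(0)

def conv_stub(x, out_ch):
    """Stand-in for a conv layer: a fixed random 1x1 channel projection."""
    c = x.shape[0]
    w = np.random.default_rng(c * 31 + out_ch).standard_normal((out_ch, c)) * 0.1
    return np.einsum("oc,chw->ohw", w, x)

def downsample(x):
    return x[:, ::2, ::2]

def upsample(x):
    return x.repeat(2, axis=1).repeat(2, axis=2)

def encoder(lr, levels=3):
    """Multi-resolution features f_0 (finest) ... f_{levels-1} (coarsest),
    plus a latent code summarizing the low-resolution input."""
    feats, x = [], conv_stub(lr, 32)
    for _ in range(levels):
        feats.append(x)
        x = conv_stub(downsample(x), x.shape[0])
    latent = x.mean(axis=(1, 2))  # global pooling -> latent code
    return feats, latent

def latent_bank(latent, base_hw, levels=4):
    """Frozen 'latent bank' stub: maps the latent code to multi-resolution
    prior features, coarse to fine. A real GLEAN would use a pre-trained
    StyleGAN here; one extra level allows 2x super-resolution."""
    priors, h = [], base_hw
    for i in range(levels):
        w = np.random.default_rng(100 + i).standard_normal((32, latent.shape[0])) * 0.1
        priors.append((w @ latent)[:, None, None] * np.ones((32, h, h)))
        h *= 2
    return priors

def decoder(feats, priors):
    """Fuse encoder skip features with bank priors, coarse to fine."""
    x = priors[0] + feats[-1]
    for skip, prior in zip(reversed(feats[:-1]), priors[1:]):
        x = upsample(x) + prior + conv_stub(skip, 32)
    x = upsample(x) + priors[-1]  # go past the input resolution (SR)
    return conv_stub(x, 3)        # back to RGB

lr = rng.standard_normal((3, 16, 16))  # toy low-resolution input
feats, latent = encoder(lr)
priors = latent_bank(latent, base_hw=feats[-1].shape[1])
sr = decoder(feats, priors)
print(sr.shape)  # (3, 32, 32): 2x the 16x16 input, in one forward pass
```

In GLEAN proper, the bank is a pre-trained StyleGAN whose weights stay frozen; only the encoder and decoder are trained, which is how the restoration runs in a single forward pass without per-image optimization.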
Related papers
- A Simple Approach to Unifying Diffusion-based Conditional Generation
We introduce a simple, unified framework to handle diverse conditional generation tasks.
Our approach enables versatile capabilities via different inference-time sampling schemes.
Our model supports additional capabilities like non-spatially aligned and coarse conditioning.
arXiv Detail & Related papers (2024-10-15)
- Towards Generative Class Prompt Learning for Fine-grained Visual Recognition
Generative Class Prompt Learning and Contrastive Multi-class Prompt Learning are presented.
Generative Class Prompt Learning improves visio-linguistic synergy in class embeddings by conditioning on few-shot exemplars with learnable class prompts.
CoMPLe builds on this foundation by introducing a contrastive learning component that encourages inter-class separation.
arXiv Detail & Related papers (2024-09-03)
- ToddlerDiffusion: Interactive Structured Image Generation with Cascaded Schrödinger Bridge
ToddlerDiffusion is a novel approach to decomposing the complex task of RGB image generation into simpler, interpretable stages.
Our method, termed ToddlerDiffusion, cascades modality-specific models, each responsible for generating an intermediate representation.
ToddlerDiffusion consistently outperforms state-of-the-art methods.
arXiv Detail & Related papers (2023-11-24)
- High-Resolution GAN Inversion for Degraded Images in Large Diverse Datasets
In this paper, we present a novel GAN inversion framework that utilizes the powerful generative ability of StyleGAN-XL.
To ease the inversion challenge with StyleGAN-XL, Clustering & Regularize Inversion (CRI) is proposed.
We validate our CRI scheme on multiple restoration tasks (i.e., inpainting, colorization, and super-resolution) of complex natural images, and show preferable quantitative and qualitative results.
arXiv Detail & Related papers (2023-02-07)
- InvGAN: Invertible GANs
InvGAN, short for Invertible GAN, successfully embeds real images to the latent space of a high quality generative model.
This allows us to perform image inpainting, merging, and online data augmentation.
arXiv Detail & Related papers (2021-12-08)
- Drop the GAN: In Defense of Patches Nearest Neighbors as Single Image Generative Models
We show that single-image generation and manipulation tasks can be performed without any training, within several seconds, in a unified, surprisingly simple framework.
We start with an initial coarse guess, and then simply refine the details coarse-to-fine using patch-nearest-neighbor search.
This allows generating random novel images better and much faster than GANs.
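The patch-nearest-neighbor idea is easy to demonstrate at a single scale. The snippet below is a minimal sketch under simplifying assumptions of ours (tiny grayscale arrays, brute-force search, no coarse-to-fine pyramid): every patch of a coarse guess is replaced by its nearest patch from a reference image, and overlapping replacements are averaged. The actual method repeats this across a pyramid of scales with fast approximate search.

```python
import numpy as np

def patches(img, p):
    """All p x p patches of a 2-D image, flattened to rows."""
    h, w = img.shape
    return np.array([img[i:i + p, j:j + p].ravel()
                     for i in range(h - p + 1) for j in range(w - p + 1)])

def pnn_refine(guess, reference, p=3):
    """One level of patch-nearest-neighbor refinement: replace every patch
    of the coarse guess with its closest reference patch, then average
    the overlapping replacements."""
    ref = patches(reference, p)
    h, w = guess.shape
    acc = np.zeros_like(guess, dtype=float)
    cnt = np.zeros_like(guess, dtype=float)
    for i in range(h - p + 1):
        for j in range(w - p + 1):
            q = guess[i:i + p, j:j + p].ravel()
            best = ref[np.argmin(((ref - q) ** 2).sum(axis=1))]
            acc[i:i + p, j:j + p] += best.reshape(p, p)
            cnt[i:i + p, j:j + p] += 1
    return acc / cnt

rng = np.random.default_rng(1)
reference = rng.random((12, 12))
guess = reference + 0.3 * rng.standard_normal((12, 12))  # noisy coarse guess
refined = pnn_refine(guess, reference)
# the output is built entirely from reference patches, pulling the guess
# toward the reference's patch statistics
print(np.abs(guess - reference).mean(), np.abs(refined - reference).mean())
```

Because every output pixel comes from real reference patches, no training is needed; the runtime is dominated by the nearest-neighbor search, which is what the paper accelerates.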
arXiv Detail & Related papers (2021-03-29)
- GLEAN: Generative Latent Bank for Large-Factor Image Super-Resolution
We show that Generative Adversarial Networks (GANs), e.g., StyleGAN, can be used as a latent bank to improve the restoration quality of large-factor image super-resolution (SR).
Our method, Generative LatEnt bANk (GLEAN), goes beyond existing practices by directly leveraging rich and diverse priors encapsulated in a pre-trained GAN.
Images upscaled by GLEAN show clear improvements in terms of fidelity and texture faithfulness in comparison to existing methods.
arXiv Detail & Related papers (2020-12-01)
- InfoMax-GAN: Improved Adversarial Image Generation via Information Maximization and Contrastive Learning
Generative Adversarial Networks (GANs) are fundamental to many generative modelling applications.
We propose a principled framework to simultaneously mitigate two fundamental issues in GANs: catastrophic forgetting of the discriminator and mode collapse of the generator.
Our approach significantly stabilizes GAN training and improves GAN performance for image synthesis across five datasets.
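Contrastive objectives of this kind are typically InfoNCE-style losses: features of matched pairs should score higher than every mismatched pair in the batch. The sketch below computes such a loss on toy feature vectors; the simplified setup (cosine similarity, NumPy, random data) is our assumption for illustration, not the paper's exact formulation.

```python
import numpy as np

def info_nce(anchors, positives, temperature=0.1):
    """InfoNCE loss: each anchor should be most similar (cosine) to its
    own positive among all positives in the batch."""
    a = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
    p = positives / np.linalg.norm(positives, axis=1, keepdims=True)
    logits = a @ p.T / temperature               # (n, n) similarity matrix
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.diag(log_probs).mean()            # matched pairs lie on the diagonal

rng = np.random.default_rng(0)
feats = rng.standard_normal((8, 16))
aligned = info_nce(feats, feats + 0.01 * rng.standard_normal((8, 16)))
shuffled = info_nce(feats, rng.standard_normal((8, 16)))
print(aligned < shuffled)  # True: matched pairs give a much lower loss
```

Minimizing such a loss on discriminator features gives the discriminator a stable auxiliary task, which is the intuition behind countering forgetting and mode collapse.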
arXiv Detail & Related papers (2020-07-09)
- The Power of Triply Complementary Priors for Image Compressive Sensing
We propose a joint low-rank and deep (LRD) image model, which contains a pair of triply complementary priors.
We then propose a novel hybrid plug-and-play framework based on the LRD model for image CS.
To make the optimization tractable, a simple yet effective algorithm is proposed to solve the resulting hybrid plug-and-play image CS problem.
arXiv Detail & Related papers (2020-05-16)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the information and is not responsible for any consequences of its use.