Related papers: Improving generative adversarial network inversion via fine-tuning GAN encoders

URL: http://arxiv.org/abs/2108.10201v4
Date: Thu, 12 Dec 2024 12:28:39 GMT
Title: Improving generative adversarial network inversion via fine-tuning GAN encoders
Authors: Cheng Yu, Wenmin Wang, Roberto Bugiolacchi
Abstract summary: Generative adversarial networks (GANs) can synthesize high-quality (HQ) images. GAN inversion is a technique that discovers how to invert given images back to latent space. We propose a self-supervised method to pre-train and fine-tune GAN encoders.
Score: 16.458842819785822
License: http://creativecommons.org/licenses/by-nc-nd/4.0/
Abstract: Generative adversarial networks (GANs) can synthesize high-quality (HQ) images, and GAN inversion is a technique that discovers how to invert given images back to latent space. While existing methods perform StyleGAN inversion, they have limited performance and do not generalize to different GANs. To address these issues, we propose a self-supervised method to pre-train and fine-tune GAN encoders. First, we design an adaptive block to fit different encoder architectures for inverting diverse GANs. Then we pre-train GAN encoders using synthesized images and emphasize local regions by cropping images. Finally, we fine-tune the pre-trained GAN encoder for inverting real images. Compared with state-of-the-art methods, our method achieves better results, reconstructing high-quality images on mainstream GANs. Our code and pre-trained models are available at: https://github.com/disanda/Deep-GAN-Encoders.
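
The self-supervised pre-training step described in the abstract (train the encoder on images synthesized by the generator, so the true latent code is known) can be illustrated with a toy sketch. This is not the authors' implementation: the generator here is a hypothetical fixed linear map, the encoder is a linear map trained by plain SGD on a latent-reconstruction loss, and the adaptive block, image cropping, and real-image fine-tuning stages are omitted. All names (`matvec`, `pretrain_encoder`, `TOY_GENERATOR`) are assumptions made for illustration.

```python
import random


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def pretrain_encoder(generator, latent_dim, image_dim, steps=2000, lr=0.05, seed=0):
    """Toy self-supervised pre-training of a linear encoder.

    Each step samples a latent code z, synthesizes an "image" x = G z with the
    frozen generator, and updates the encoder E to reduce 0.5 * ||E x - z||^2.
    No real images or labels are needed because z is known by construction.
    """
    rng = random.Random(seed)
    # Encoder weights: latent_dim x image_dim, initialised to zero.
    encoder = [[0.0] * image_dim for _ in range(latent_dim)]
    for _ in range(steps):
        z = [rng.gauss(0.0, 1.0) for _ in range(latent_dim)]
        x = matvec(generator, z)        # synthesized image
        z_hat = matvec(encoder, x)      # encoder's guess at the latent code
        err = [zh - zt for zh, zt in zip(z_hat, z)]
        # Gradient of the loss w.r.t. encoder[i][j] is err[i] * x[j].
        for i in range(latent_dim):
            for j in range(image_dim):
                encoder[i][j] -= lr * err[i] * x[j]
    return encoder


# Hypothetical frozen "generator": maps a 2-D latent code to a 4-D image.
TOY_GENERATOR = [
    [1.0, 0.0],
    [0.0, 1.0],
    [0.5, 0.0],
    [0.0, 0.5],
]

encoder = pretrain_encoder(TOY_GENERATOR, latent_dim=2, image_dim=4)
z_true = [0.3, -0.7]
z_inverted = matvec(encoder, matvec(TOY_GENERATOR, z_true))
print("max inversion error:", max(abs(a - b) for a, b in zip(z_inverted, z_true)))
```

In a realistic setting the generator would be a pre-trained GAN and the loss would typically also compare images, but the structure of the loop (sample, synthesize, encode, compare with the known code) is the part this sketch is meant to show.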

Related papers

In-Domain GAN Inversion for Faithful Reconstruction and Editability [132.68255553099834]
We propose in-domain GAN inversion, which consists of a domain-guided encoder and a domain-regularized optimizer to regularize the inverted code in the native latent space of the pre-trained GAN model. We make comprehensive analyses of the effects of the encoder structure, the starting inversion point, and the inversion parameter space, and observe a trade-off between reconstruction quality and the editing property.
arXiv Detail & Related papers (2023-09-25T08:42:06Z)
Towards Accurate Image Coding: Improved Autoregressive Image Generation with Dynamic Vector Quantization [73.52943587514386]
Existing vector quantization (VQ) based autoregressive models follow a two-stage generation paradigm. We propose a novel two-stage framework: (1) Dynamic-Quantization VAE (DQ-VAE), which encodes image regions into variable-length codes based on their information densities for accurate representation.
arXiv Detail & Related papers (2023-05-19T14:56:05Z)
JoIN: Joint GANs Inversion for Intrinsic Image Decomposition [16.02463667910604]
We propose to solve ill-posed inverse imaging problems using a bank of Generative Adversarial Networks (GANs). Our method builds on the demonstrated success of GANs to capture complex image distributions.
arXiv Detail & Related papers (2023-05-18T22:09:32Z)
LD-GAN: Low-Dimensional Generative Adversarial Network for Spectral Image Generation with Variance Regularization [72.4394510913927]
Deep learning methods are state-of-the-art for spectral image (SI) computational tasks. GANs enable diverse augmentation by learning and sampling from the data distribution. GAN-based SI generation is challenging because the high dimensionality of this kind of data hinders the convergence of GAN training, leading to suboptimal generation. We propose a statistical regularization to control the variance of the low-dimensional representation during autoencoder training and to achieve high diversity of samples generated with the GAN.
arXiv Detail & Related papers (2023-04-29T00:25:02Z)
TriPlaneNet: An Encoder for EG3D Inversion [1.9567015559455132]
NeRF-based GANs have introduced a number of approaches for high-resolution and high-fidelity generative modeling of human heads. Despite the success of universal optimization-based methods for 2D GAN inversion, those applied to 3D GANs may fail to extrapolate the result onto the novel view. We introduce a fast technique that bridges the gap between the two approaches by directly utilizing the tri-plane representation presented for the EG3D generative model.
arXiv Detail & Related papers (2023-03-23T17:56:20Z)
3D-Aware Encoding for Style-based Neural Radiance Fields [50.118687869198716]
We learn an inversion function to project an input image to the latent space of a NeRF generator and then synthesize novel views of the original image based on the latent code. Compared with GAN inversion for 2D generative models, NeRF inversion not only needs to 1) preserve the identity of the input image, but also 2) ensure 3D consistency in generated novel views. We propose a two-stage encoder for style-based NeRF inversion.
arXiv Detail & Related papers (2022-11-12T06:14:12Z)
Feature-Style Encoder for Style-Based GAN Inversion [1.9116784879310027]
We propose a novel architecture for GAN inversion, which we call Feature-Style encoder. Our model achieves accurate inversion of real images from the latent space of a pre-trained style-based GAN model. Thanks to its encoder structure, the model allows fast and accurate image editing.
arXiv Detail & Related papers (2022-02-04T15:19:34Z)
Less is More: Pre-training a Strong Siamese Encoder Using a Weak Decoder [75.84152924972462]
Many real-world applications use Siamese networks to efficiently match text sequences at scale. This paper pre-trains language models dedicated to sequence matching in Siamese architectures.
arXiv Detail & Related papers (2021-02-18T08:08:17Z)
GAN Inversion: A Survey [125.62848237531945]
GAN inversion aims to invert a given image back into the latent space of a pretrained GAN model. GAN inversion plays an essential role in enabling the pretrained GAN models such as StyleGAN and BigGAN to be used for real image editing applications.
arXiv Detail & Related papers (2021-01-14T14:11:00Z)
Guiding GANs: How to control non-conditional pre-trained GANs for conditional image generation [69.10717733870575]
We present a novel method for guiding generic non-conditional GANs to behave as conditional GANs. Our approach adds an encoder network that generates the high-dimensional random inputs fed to the generator network of a non-conditional GAN.
arXiv Detail & Related papers (2021-01-04T14:03:32Z)
GLEAN: Generative Latent Bank for Large-Factor Image Super-Resolution [85.53811497840725]
We show that Generative Adversarial Networks (GANs), e.g., StyleGAN, can be used as a latent bank to improve the restoration quality of large-factor image super-resolution (SR). Our method, Generative LatEnt bANk (GLEAN), goes beyond existing practices by directly leveraging rich and diverse priors encapsulated in a pre-trained GAN. Images upscaled by GLEAN show clear improvements in terms of fidelity and texture faithfulness in comparison to existing methods.
arXiv Detail & Related papers (2020-12-01T18:56:14Z)
Encoding in Style: a StyleGAN Encoder for Image-to-Image Translation [42.62624182740679]
We present a generic image-to-image translation framework, pixel2style2pixel (pSp). Our pSp framework is based on a novel encoder network that directly generates a series of style vectors which are fed into a pretrained StyleGAN generator.
arXiv Detail & Related papers (2020-08-03T15:30:38Z)
Adversarial Latent Autoencoders [7.928094304325116]
We introduce an autoencoder that tackles these issues jointly, which we call Adversarial Latent Autoencoder (ALAE). ALAE is the first autoencoder able to compare with, and go beyond the capabilities of, a generator-only type of architecture.
arXiv Detail & Related papers (2020-04-09T10:33:44Z)
In-Domain GAN Inversion for Real Image Editing [56.924323432048304]
A common practice for feeding a real image to a trained GAN generator is to invert it back to a latent code. Existing inversion methods typically focus on reconstructing the target image by pixel values yet fail to land the inverted code in the semantic domain of the original latent space. We propose an in-domain GAN inversion approach, which faithfully reconstructs the input image and ensures that the inverted code is semantically meaningful for editing.
arXiv Detail & Related papers (2020-03-31T18:20:18Z)
Reducing the Representation Error of GAN Image Priors Using the Deep Decoder [29.12824512060469]
We show a method for reducing the representation error of GAN priors by modeling images as the linear combination of a GAN prior and a Deep Decoder. For compressive sensing and image superresolution, our hybrid model exhibits consistently higher PSNRs than both the GAN priors and Deep Decoder separately.
arXiv Detail & Related papers (2020-01-23T18:37:24Z)

This list is automatically generated from the titles and abstracts of the papers on this site.