HyperStyle: StyleGAN Inversion with HyperNetworks for Real Image Editing
- URL: http://arxiv.org/abs/2111.15666v1
- Date: Tue, 30 Nov 2021 18:56:30 GMT
- Title: HyperStyle: StyleGAN Inversion with HyperNetworks for Real Image Editing
- Authors: Yuval Alaluf, Omer Tov, Ron Mokady, Rinon Gal, Amit H. Bermano
- Abstract summary: HyperStyle learns to modulate StyleGAN's weights to faithfully express a given image in editable regions of the latent space.
HyperStyle yields reconstructions comparable to those of optimization techniques with the near real-time inference capabilities of encoders.
- Score: 2.362412515574206
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The inversion of real images into StyleGAN's latent space is a well-studied
problem. Nevertheless, applying existing approaches to real-world scenarios
remains an open challenge, due to an inherent trade-off between reconstruction
and editability: latent space regions which can accurately represent real
images typically suffer from degraded semantic control. Recent work proposes to
mitigate this trade-off by fine-tuning the generator to add the target image to
well-behaved, editable regions of the latent space. While promising, this
fine-tuning scheme is impractical for prevalent use as it requires a lengthy
training phase for each new image. In this work, we introduce this approach
into the realm of encoder-based inversion. We propose HyperStyle, a
hypernetwork that learns to modulate StyleGAN's weights to faithfully express a
given image in editable regions of the latent space. A naive modulation
approach would require training a hypernetwork with over three billion
parameters. Through careful network design, we reduce this to be in line with
existing encoders. HyperStyle yields reconstructions comparable to those of
optimization techniques with the near real-time inference capabilities of
encoders. Lastly, we demonstrate HyperStyle's effectiveness on several
applications beyond the inversion task, including the editing of out-of-domain
images which were never seen during training.
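To make the parameter-count point concrete, here is a minimal, hypothetical sketch (not the authors' released code) of the channel-wise modulation idea: the hypernetwork predicts one multiplicative offset per output channel of each generator convolution instead of a separate offset for every kernel weight, which is the kind of design choice that keeps the hypernetwork comparable in size to ordinary encoders. All shapes and module names below are illustrative assumptions.

```python
# Minimal sketch (assumption-laden, not the official HyperStyle code): a hypernetwork
# head predicts one multiplicative offset per output channel of a generator conv, so
# the modulated weight is W * (1 + delta) with delta broadcast over the remaining dims.
import torch
import torch.nn as nn

class ChannelwiseHyperHead(nn.Module):
    """Maps a shared image feature to per-output-channel offsets for one conv layer."""
    def __init__(self, feat_dim: int, out_channels: int):
        super().__init__()
        self.head = nn.Linear(feat_dim, out_channels)

    def forward(self, feat: torch.Tensor) -> torch.Tensor:
        # feat: (B, feat_dim) -> offsets: (B, C_out, 1, 1, 1)
        return self.head(feat).view(feat.size(0), -1, 1, 1, 1)

def modulate_conv_weight(weight: torch.Tensor, delta: torch.Tensor) -> torch.Tensor:
    # weight: (C_out, C_in, k, k); delta: (B, C_out, 1, 1, 1)
    # Returns a per-sample modulated weight of shape (B, C_out, C_in, k, k).
    return weight.unsqueeze(0) * (1.0 + delta)

# Usage: a shared backbone encodes the target image once, and one small head per
# generator layer predicts that layer's offsets. Predicting C_out values per layer
# instead of C_out * C_in * k * k is what keeps the total parameter count manageable.
backbone_feat = torch.randn(1, 512)                       # placeholder image feature
head = ChannelwiseHyperHead(feat_dim=512, out_channels=256)
conv_weight = torch.randn(256, 256, 3, 3)                 # one StyleGAN-style conv
new_weight = modulate_conv_weight(conv_weight, head(backbone_feat))
print(new_weight.shape)  # torch.Size([1, 256, 256, 3, 3])
```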
Related papers
- Hyper-VolTran: Fast and Generalizable One-Shot Image to 3D Object Structure via HyperNetworks [53.67497327319569]
We introduce a novel neural rendering technique to solve image-to-3D from a single view.
Our approach employs the signed distance function as the surface representation and incorporates generalizable priors through geometry-encoding volumes and HyperNetworks.
Our experiments show the advantages of our proposed approach with consistent results and rapid generation.
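As a hedged illustration of combining a hypernetwork with a signed distance function (a sketch under assumed layer sizes and an assumed image-embedding input, not the paper's architecture), a small network can map an image embedding to the weights of a tiny SDF MLP, so the surface representation is conditioned on the input view:

```python
# Sketch only: a hypernetwork that emits the weights of a tiny SDF MLP from an
# image embedding. Layer sizes and the embedding source are illustrative assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SDFHyperNet(nn.Module):
    def __init__(self, embed_dim=256, hidden=64):
        super().__init__()
        self.hidden = hidden
        # Predict weights and biases for a two-layer MLP: R^3 -> R^hidden -> R (signed distance).
        self.w1 = nn.Linear(embed_dim, 3 * hidden)
        self.b1 = nn.Linear(embed_dim, hidden)
        self.w2 = nn.Linear(embed_dim, hidden)
        self.b2 = nn.Linear(embed_dim, 1)

    def forward(self, embed: torch.Tensor, points: torch.Tensor) -> torch.Tensor:
        # embed: (embed_dim,), points: (N, 3) -> signed distances: (N, 1)
        W1 = self.w1(embed).view(self.hidden, 3)
        W2 = self.w2(embed).view(1, self.hidden)
        h = F.relu(points @ W1.t() + self.b1(embed))
        return h @ W2.t() + self.b2(embed)

net = SDFHyperNet()
sdf = net(torch.randn(256), torch.randn(1024, 3))  # query 1024 3D points
print(sdf.shape)  # torch.Size([1024, 1])
```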
arXiv Detail & Related papers (2023-12-24T08:42:37Z) - In-Domain GAN Inversion for Faithful Reconstruction and Editability [132.68255553099834]
We propose in-domain GAN inversion, which consists of a domain-guided encoder and domain-regularized optimization to regularize the inverted code in the native latent space of the pre-trained GAN model.
We make comprehensive analyses on the effects of the encoder structure, the starting inversion point, as well as the inversion parameter space, and observe the trade-off between the reconstruction quality and the editing property.
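The recipe described above (a domain-guided encoder for initialization followed by domain-regularized optimization) can be sketched roughly as follows; the modules and loss weights here are toy placeholders, not the paper's implementation:

```python
# Illustrative sketch only: invert an image by starting from an encoder's latent code
# and then optimizing that code, with a regularizer that keeps it in the region the
# encoder itself maps to (the "in-domain" part).
import torch
import torch.nn.functional as F

def in_domain_invert(x, G, E, steps=200, lr=0.01, lam=2.0):
    z = E(x).detach().clone().requires_grad_(True)   # domain-guided initialization
    opt = torch.optim.Adam([z], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        recon = G(z)
        rec_loss = F.mse_loss(recon, x)              # reconstruct the target pixels
        reg_loss = F.mse_loss(E(recon), z)           # keep the code "in domain"
        (rec_loss + lam * reg_loss).backward()
        opt.step()
    return z.detach()

# Toy stand-ins so the sketch executes end to end; real use would plug in a
# pretrained StyleGAN generator and its companion encoder.
G = torch.nn.Linear(64, 3 * 32 * 32)
E = torch.nn.Linear(3 * 32 * 32, 64)
x = torch.randn(1, 3 * 32 * 32)
z_star = in_domain_invert(x, G, E, steps=10)
```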
arXiv Detail & Related papers (2023-09-25T08:42:06Z) - HyperPose: Camera Pose Localization using Attention Hypernetworks [6.700873164609009]
We propose the use of attention hypernetworks in camera pose localization.
The proposed approach achieves superior results compared to state-of-the-art methods on contemporary datasets.
arXiv Detail & Related papers (2023-03-05T08:45:50Z) - Gradient Adjusting Networks for Domain Inversion [82.72289618025084]
StyleGAN2 was demonstrated to be a powerful image generation engine that supports semantic editing.
We present a per-image optimization method that tunes a StyleGAN2 generator by applying a local edit to the generator's weights.
Our experiments show a sizable gap in performance over the current state of the art in this very active domain.
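A minimal sketch of the per-image tuning pattern the summary refers to, with toy placeholder modules rather than the paper's method: freeze the inverted latent code and briefly optimize the generator's own weights toward the target image.

```python
# Hedged sketch of per-image generator fine-tuning: keep the latent code fixed and
# optimize a copy of the generator so its output snaps onto the target image.
import copy
import torch
import torch.nn.functional as F

def tune_generator(G, z, x, steps=50, lr=1e-4):
    G_tuned = copy.deepcopy(G)                 # keep the original generator intact
    opt = torch.optim.Adam(G_tuned.parameters(), lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        loss = F.mse_loss(G_tuned(z), x)       # pull the generator toward this image
        loss.backward()
        opt.step()
    return G_tuned

G = torch.nn.Linear(64, 3 * 32 * 32)           # stand-in for a StyleGAN2 generator
z = torch.randn(1, 64)                          # code from an earlier inversion step
x = torch.randn(1, 3 * 32 * 32)                 # target image (flattened toy tensor)
G_x = tune_generator(G, z, x, steps=10)
```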
arXiv Detail & Related papers (2023-02-22T14:47:57Z) - Overparameterization Improves StyleGAN Inversion [66.8300251627992]
Existing inversion approaches obtain promising yet imperfect results.
We show that overparameterizing the latent space allows us to obtain near-perfect image reconstruction without the need for encoders.
Our approach also retains editability, which we demonstrate by realistically interpolating between images.
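One hedged way to picture encoder-free inversion with an enlarged latent parameterization (an illustration only; the paper's specific overparameterization may differ) is to optimize an expanded, per-layer code directly against the target and then interpolate between inverted codes:

```python
# Sketch: optimize an expanded latent (one code per generator layer) directly,
# with a toy placeholder generator; the optimization pattern, not the architecture,
# is the point here.
import torch
import torch.nn.functional as F

num_layers, dim = 18, 64
G = torch.nn.Linear(num_layers * dim, 3 * 32 * 32)   # placeholder generator
x = torch.randn(1, 3 * 32 * 32)                       # target image (toy tensor)

w_plus = torch.zeros(1, num_layers * dim, requires_grad=True)  # expanded latent
opt = torch.optim.Adam([w_plus], lr=0.05)
for _ in range(100):
    opt.zero_grad()
    F.mse_loss(G(w_plus), x).backward()
    opt.step()

# Editability check in the spirit of the abstract: interpolate between two codes.
w_a, w_b = w_plus.detach(), torch.randn(1, num_layers * dim)
blend = G(0.5 * w_a + 0.5 * w_b)
```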
arXiv Detail & Related papers (2022-05-12T18:42:43Z) - Designing an Encoder for StyleGAN Image Manipulation [38.909059126878354]
We study the latent space of StyleGAN, the state-of-the-art unconditional generator.
We identify and analyze the existence of a distortion-editability tradeoff and a distortion-perception tradeoff within the StyleGAN latent space.
We present an encoder based on our two principles that is specifically designed for facilitating editing on real images.
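One design pattern associated with encoders built for editability (a toy sketch with assumed shapes, not the released encoder) is to predict a single base code plus small per-layer offsets and penalize the offsets, nudging inversions toward the well-behaved region of the latent space:

```python
# Toy sketch: an encoder that outputs one base latent plus per-layer offsets, with a
# penalty on the offsets so codes stay close to the single-code region of the space.
import torch
import torch.nn as nn

class EditableEncoder(nn.Module):
    def __init__(self, in_dim=3 * 32 * 32, latent_dim=64, num_layers=18):
        super().__init__()
        self.base = nn.Linear(in_dim, latent_dim)
        self.offsets = nn.Linear(in_dim, num_layers * latent_dim)
        self.num_layers, self.latent_dim = num_layers, latent_dim

    def forward(self, x):
        w = self.base(x).unsqueeze(1)                                   # (B, 1, D)
        d = self.offsets(x).view(-1, self.num_layers, self.latent_dim)  # (B, L, D)
        return w + d, d

enc = EditableEncoder()
x = torch.randn(4, 3 * 32 * 32)
codes, deltas = enc(x)
delta_reg = deltas.norm(dim=-1).mean()   # small offsets -> better editability
print(codes.shape, float(delta_reg))
```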
arXiv Detail & Related papers (2021-02-04T17:52:38Z) - In-Domain GAN Inversion for Real Image Editing [56.924323432048304]
A common practice of feeding a real image to a trained GAN generator is to invert it back to a latent code.
Existing inversion methods typically focus on reconstructing the target image by pixel values yet fail to land the inverted code in the semantic domain of the original latent space.
We propose an in-domain GAN inversion approach, which faithfully reconstructs the input image and ensures the inverted code to be semantically meaningful for editing.
arXiv Detail & Related papers (2020-03-31T18:20:18Z) - Image Fine-grained Inpainting [89.17316318927621]
We present a one-stage model that utilizes dense combinations of dilated convolutions to obtain larger and more effective receptive fields.
To better train this efficient generator, in addition to the frequently used VGG feature matching loss, we design a novel self-guided regression loss.
We also employ a discriminator with local and global branches to ensure local-global contents consistency.
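A hedged sketch of the receptive-field idea (parallel dilated convolutions combined and fused; channel counts and dilation rates here are chosen arbitrarily, not taken from the paper):

```python
# Sketch: a block of parallel dilated 3x3 convolutions whose outputs are concatenated
# and fused, widening the receptive field without downsampling.
import torch
import torch.nn as nn

class DilatedComboBlock(nn.Module):
    def __init__(self, channels=64, dilations=(1, 2, 4, 8)):
        super().__init__()
        self.branches = nn.ModuleList(
            nn.Conv2d(channels, channels, kernel_size=3, padding=d, dilation=d)
            for d in dilations
        )
        self.fuse = nn.Conv2d(channels * len(dilations), channels, kernel_size=1)
        self.act = nn.ReLU(inplace=True)

    def forward(self, x):
        feats = [self.act(branch(x)) for branch in self.branches]
        return x + self.fuse(torch.cat(feats, dim=1))   # residual fusion of all rates

block = DilatedComboBlock()
out = block(torch.randn(1, 64, 32, 32))
print(out.shape)  # torch.Size([1, 64, 32, 32])
```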
arXiv Detail & Related papers (2020-02-07T03:45:25Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.