HyperInverter: Improving StyleGAN Inversion via Hypernetwork
- URL: http://arxiv.org/abs/2112.00719v1
- Date: Wed, 1 Dec 2021 18:56:05 GMT
- Title: HyperInverter: Improving StyleGAN Inversion via Hypernetwork
- Authors: Tan M. Dinh, Anh Tuan Tran, Rang Nguyen, Binh-Son Hua
- Abstract summary: Current GAN inversion methods fail to meet at least one of the three requirements listed below: high reconstruction quality, editability, and fast inference.
We present a novel two-phase strategy in this research that fits all requirements at the same time.
Our method is entirely encoder-based, resulting in extremely fast inference.
- Score: 12.173568611144628
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Real-world image manipulation has achieved fantastic progress in recent years
as a result of the exploration and utilization of GAN latent spaces. GAN
inversion is the first step in this pipeline, which aims to map the real image
to the latent code faithfully. Unfortunately, the majority of existing GAN
inversion methods fail to meet at least one of the three requirements listed
below: high reconstruction quality, editability, and fast inference. We present
a novel two-phase strategy in this research that fits all requirements at the
same time. In the first phase, we train an encoder to map the input image to
StyleGAN2 $\mathcal{W}$-space, which was proven to have excellent editability
but lower reconstruction quality. In the second phase, we supplement the
reconstruction ability in the initial phase by leveraging a series of
hypernetworks to recover the missing information during inversion. These two
steps complement each other to yield high reconstruction quality thanks to the
hypernetwork branch and excellent editability due to the inversion done in the
$\mathcal{W}$-space. Our method is entirely encoder-based, resulting in
extremely fast inference. Extensive experiments on two challenging datasets
demonstrate the superiority of our method.
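The two-phase idea in the abstract can be illustrated with a toy sketch. The sketch below is not the authors' implementation: it assumes that the hypernetwork branch works by predicting offsets to the generator weights, which the abstract does not state explicitly. The linear "generator", the linear "encoder", and the closed-form `hypernetwork` function are hypothetical stand-ins for the trained networks, chosen only so that the example runs with the Python standard library.

```python
import random

random.seed(0)

DIM_X, DIM_W = 6, 2  # toy "image" and latent sizes (hypothetical values)


def matvec(m, v):
    """Multiply matrix m (list of rows) by vector v."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def rand_matrix(rows, cols, scale=0.5):
    return [[random.uniform(-scale, scale) for _ in range(cols)]
            for _ in range(rows)]


# Frozen toy "generator" weights: x = theta @ w (stand-in for StyleGAN2).
theta = rand_matrix(DIM_X, DIM_W)
# Toy phase-1 "encoder": w = enc @ x (stand-in for the W-space encoder).
enc = rand_matrix(DIM_W, DIM_X)


def hypernetwork(x, x_init, w):
    """Assumed phase-2 behavior: return a weight offset for the generator.

    A real hypernetwork would be learned. This stand-in uses a closed-form
    rank-one offset built from the reconstruction residual, so that
    (theta + delta) @ w reproduces x.
    """
    residual = [a - b for a, b in zip(x, x_init)]
    norm_sq = sum(c * c for c in w)
    return [[r * c / norm_sq for c in w] for r in residual]


def invert(x):
    # Phase 1: encode to the latent code and reconstruct with frozen weights.
    w = matvec(enc, x)
    x_init = matvec(theta, w)
    # Phase 2: keep w fixed and adjust the generator weights instead.
    delta = hypernetwork(x, x_init, w)
    theta_hat = [[t + d for t, d in zip(t_row, d_row)]
                 for t_row, d_row in zip(theta, delta)]
    x_final = matvec(theta_hat, w)
    return w, x_init, x_final


def l2(a, b):
    return sum((p - q) ** 2 for p, q in zip(a, b)) ** 0.5


x = [random.uniform(-1, 1) for _ in range(DIM_X)]
w, x_init, x_final = invert(x)
print("phase-1 error:", l2(x, x_init))
print("phase-2 error:", l2(x, x_final))
```

In this toy setting the latent code `w` is never modified in the second phase, which mirrors the abstract's claim that editability comes from the $\mathcal{W}$-space inversion while the added branch recovers reconstruction detail.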
Related papers
- In-Domain GAN Inversion for Faithful Reconstruction and Editability [132.68255553099834]
We propose in-domain GAN inversion, which consists of a domain-guided encoder and a domain-regularized optimizer to regularize the inverted code in the native latent space of the pre-trained GAN model.
We make comprehensive analyses on the effects of the encoder structure, the starting inversion point, as well as the inversion parameter space, and observe the trade-off between the reconstruction quality and the editing property.
arXiv Detail & Related papers (2023-09-25T08:42:06Z) - Robust GAN inversion [5.1359892878090845]
We propose an approach that works in the native latent space $W$ and tunes the generator network to restore missing image details.
We demonstrate the effectiveness of our approach on two complex datasets: Flickr-Faces-HQ and LSUN Church.
arXiv Detail & Related papers (2023-08-31T07:47:11Z) - Meta-Auxiliary Network for 3D GAN Inversion [18.777352198191004]
In this work, we present a novel meta-auxiliary framework that leverages the newly developed 3D GANs as the generator.
In the first stage, we invert the input image to an editable latent code using off-the-shelf inversion techniques.
The auxiliary network is proposed to refine the generator parameters with the given image as input; it predicts offsets for both the weights of convolutional layers and the sampling positions of volume rendering.
In the second stage, we perform meta-learning to quickly adapt the auxiliary network to the input image. The final reconstructed image is then synthesized via the meta-learned auxiliary network.
arXiv Detail & Related papers (2023-05-18T11:26:27Z) - CryoFormer: Continuous Heterogeneous Cryo-EM Reconstruction using
Transformer-based Neural Representations [49.49939711956354]
Cryo-electron microscopy (cryo-EM) allows for the high-resolution reconstruction of 3D structures of proteins and other biomolecules.
It is still challenging to reconstruct the continuous motions of 3D structures from noisy and randomly oriented 2D cryo-EM images.
We propose CryoFormer, a new approach for continuous heterogeneous cryo-EM reconstruction.
arXiv Detail & Related papers (2023-03-28T18:59:17Z) - ReGANIE: Rectifying GAN Inversion Errors for Accurate Real Image Editing [20.39792009151017]
StyleGAN allows for flexible and plausible editing of generated images by manipulating the semantic-rich latent style space.
Projecting a real image into its latent space encounters an inherent trade-off between inversion quality and editability.
We propose a novel two-phase framework by designating two separate networks to tackle editing and reconstruction respectively.
arXiv Detail & Related papers (2023-01-31T04:38:42Z) - 3D-Aware Encoding for Style-based Neural Radiance Fields [50.118687869198716]
We learn an inversion function to project an input image to the latent space of a NeRF generator and then synthesize novel views of the original image based on the latent code.
Compared with GAN inversion for 2D generative models, NeRF inversion not only needs to 1) preserve the identity of the input image, but also 2) ensure 3D consistency in generated novel views.
We propose a two-stage encoder for style-based NeRF inversion.
arXiv Detail & Related papers (2022-11-12T06:14:12Z) - LSAP: Rethinking Inversion Fidelity, Perception and Editability in GAN
Latent Space [42.56147568941768]
We introduce Normalized Style Space and $\mathcal{S^N}$ Cosine Distance (SNCD) to measure disalignment of inversion methods.
Our proposed SNCD is differentiable, so it can be optimized in both encoder-based and optimization-based embedding methods, giving a uniform solution.
arXiv Detail & Related papers (2022-09-26T14:55:21Z) - Cycle Encoding of a StyleGAN Encoder for Improved Reconstruction and
Editability [76.6724135757723]
GAN inversion aims to invert an input image into the latent space of a pre-trained GAN.
Despite the recent advances in GAN inversion, there remain challenges to mitigate the tradeoff between distortion and editability.
We propose a two-step approach that first inverts the input image into a latent code, called pivot code, and then alters the generator so that the input image can be accurately mapped into the pivot code.
arXiv Detail & Related papers (2022-07-19T16:10:16Z) - Overparameterization Improves StyleGAN Inversion [66.8300251627992]
Existing inversion approaches obtain promising yet imperfect results.
We show that overparameterization allows us to obtain near-perfect image reconstruction without the need for encoders.
Our approach also retains editability, which we demonstrate by realistically interpolating between images.
arXiv Detail & Related papers (2022-05-12T18:42:43Z) - Over-and-Under Complete Convolutional RNN for MRI Reconstruction [57.95363471940937]
Recent deep learning-based methods for MR image reconstruction usually leverage a generic auto-encoder architecture.
We propose an Over-and-Under Complete Convolutional Recurrent Neural Network (OUCR), which consists of an overcomplete and an undercomplete Convolutional Recurrent Neural Network (CRNN).
The proposed method achieves significant improvements over compressed sensing and popular deep learning-based methods with fewer trainable parameters.
arXiv Detail & Related papers (2021-06-16T15:56:34Z)
This list is automatically generated from the titles and abstracts of the papers on this site.