From Continuity to Editability: Inverting GANs with Consecutive Images
- URL: http://arxiv.org/abs/2107.13812v2
- Date: Fri, 30 Jul 2021 02:58:33 GMT
- Title: From Continuity to Editability: Inverting GANs with Consecutive Images
- Authors: Yangyang Xu, Yong Du, Wenpeng Xiao, Xuemiao Xu and Shengfeng He
- Abstract summary: Existing GAN inversion methods are stuck in a paradox: the inverted codes can either achieve high-fidelity reconstruction or retain the editing capability, but not both.
In this paper, we resolve this paradox by introducing consecutive images into the inversion process.
Our method provides the first support for video-based GAN inversion and an interesting application of unsupervised semantic transfer from consecutive images.
- Score: 37.16137384683823
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Existing GAN inversion methods are stuck in a paradox: the inverted codes
can either achieve high-fidelity reconstruction or retain the editing
capability. Having only one of these properties cannot realize real image editing.
In this paper, we resolve this paradox by introducing consecutive images (e.g.,
video frames or the same person with different poses) into the inversion
process. The rationale behind our solution is that the continuity of
consecutive images leads to inherent editable directions. This inborn property
is used for two unique purposes: 1) regularizing the joint inversion process,
such that each inverted code is semantically accessible from the others and
anchored in an editable domain; 2) enforcing inter-image coherence,
such that the fidelity of each inverted code can be maximized using the
complementary information from the other images. Extensive experiments demonstrate
that our alternative significantly outperforms state-of-the-art methods in terms of
reconstruction fidelity and editability on both real and synthetic image datasets.
Furthermore, our method provides the first support for
video-based GAN inversion, and an interesting application of unsupervised
semantic transfer from consecutive images. Source code can be found at:
https://github.com/cnnlstm/InvertingGANs_with_ConsecutiveImgs.
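The joint use of continuity described above can be viewed as optimizing the latent codes of all frames together under a two-term objective: per-frame reconstruction fidelity plus a cross-frame regularizer that keeps neighboring codes semantically reachable from one another. The sketch below is a minimal, hypothetical illustration of such joint inversion in PyTorch; the `generator` handle, the plain L2 coherence term, and all hyperparameters are assumptions for demonstration and do not reproduce the paper's actual losses.

```python
# Minimal sketch of jointly inverting consecutive frames with a pretrained,
# StyleGAN-like generator. Illustrative only: simple pixel-wise and
# latent-coherence L2 terms stand in for the paper's regularizers.
import torch


def joint_invert(frames, generator, steps=500, lr=0.05, lambda_coherence=0.1):
    """frames: (N, 3, H, W) tensor of consecutive images.
    Returns one latent code per frame, optimized jointly."""
    n = frames.shape[0]
    latent_dim = 512                                  # typical StyleGAN width (assumed)
    codes = torch.randn(n, latent_dim, requires_grad=True)
    optimizer = torch.optim.Adam([codes], lr=lr)

    for _ in range(steps):
        optimizer.zero_grad()
        recon = generator(codes)                      # (N, 3, H, W) reconstructions
        # 1) per-frame reconstruction fidelity
        loss_recon = ((recon - frames) ** 2).mean()
        # 2) inter-image coherence: neighboring codes stay close, a proxy for the
        #    editable directions induced by the continuity of consecutive images
        loss_coherence = ((codes[1:] - codes[:-1]) ** 2).mean()
        loss = loss_recon + lambda_coherence * loss_coherence
        loss.backward()
        optimizer.step()

    return codes.detach()
```

In practice one would likely swap the pixel-wise loss for a perceptual metric; the key point is that all N codes are updated in a single optimization, so each frame's inversion benefits from the others.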
Related papers
- Doubly Abductive Counterfactual Inference for Text-based Image Editing [130.46583155383735]
We study text-based image editing (TBIE) of a single image by counterfactual inference.
We propose a Doubly Abductive Counterfactual inference framework (DAC).
Our DAC achieves a good trade-off between editability and fidelity.
arXiv Detail & Related papers (2024-03-05T13:59:21Z) - In-Domain GAN Inversion for Faithful Reconstruction and Editability [132.68255553099834]
We propose in-domain GAN inversion, which consists of a domain-guided encoder and domain-regularized optimization to regularize the inverted code in the native latent space of the pre-trained GAN model.
We make comprehensive analyses on the effects of the encoder structure, the starting inversion point, as well as the inversion parameter space, and observe the trade-off between the reconstruction quality and the editing property.
arXiv Detail & Related papers (2023-09-25T08:42:06Z) - RIGID: Recurrent GAN Inversion and Editing of Real Face Videos [73.97520691413006]
GAN inversion is indispensable for applying the powerful editability of GAN to real images.
Existing methods invert video frames individually, often leading to undesired inconsistent results over time.
We propose a unified recurrent framework, named Recurrent Video GAN Inversion and Editing (RIGID).
Our framework learns the inherent coherence between input frames in an end-to-end manner.
arXiv Detail & Related papers (2023-08-11T12:17:24Z) - In-N-Out: Faithful 3D GAN Inversion with Volumetric Decomposition for Face Editing [28.790900756506833]
3D-aware GANs offer new capabilities for view synthesis while preserving the editing functionalities of their 2D counterparts.
GAN inversion is a crucial step that seeks the latent code to reconstruct input images or videos, subsequently enabling diverse editing tasks through manipulation of this latent code.
We address this issue by explicitly modeling OOD objects from the input in 3D-aware GANs.
arXiv Detail & Related papers (2023-02-09T18:59:56Z) - Eliminating Contextual Prior Bias for Semantic Image Editing via Dual-Cycle Diffusion [35.95513392917737]
A novel approach called Dual-Cycle Diffusion generates an unbiased mask to guide image editing.
Our experiments demonstrate the effectiveness of the proposed method, as it significantly improves the D-CLIP score from 0.272 to 0.283.
arXiv Detail & Related papers (2023-02-05T14:30:22Z) - StyleRes: Transforming the Residuals for Real Image Editing with StyleGAN [4.7590051176368915]
Inverting real images into StyleGAN's latent space is an extensively studied problem.
The trade-off between image reconstruction fidelity and image editing quality remains an open challenge.
We present a novel image inversion framework and a training pipeline to achieve high-fidelity image inversion with high-quality editing.
arXiv Detail & Related papers (2022-12-29T16:14:09Z) - Overparameterization Improves StyleGAN Inversion [66.8300251627992]
Existing inversion approaches obtain promising yet imperfect results.
We show that overparameterizing the latent space allows us to obtain near-perfect image reconstruction without the need for encoders.
Our approach also retains editability, which we demonstrate by realistically interpolating between images.
arXiv Detail & Related papers (2022-05-12T18:42:43Z) - In-Domain GAN Inversion for Real Image Editing [56.924323432048304]
A common practice for feeding a real image into a trained GAN generator is to invert it back to a latent code.
Existing inversion methods typically focus on reconstructing the target image by pixel values yet fail to land the inverted code in the semantic domain of the original latent space.
We propose an in-domain GAN inversion approach, which faithfully reconstructs the input image and ensures the inverted code to be semantically meaningful for editing.
arXiv Detail & Related papers (2020-03-31T18:20:18Z)