Gradual Residuals Alignment: A Dual-Stream Framework for GAN Inversion and Image Attribute Editing
- URL: http://arxiv.org/abs/2402.14398v1
- Date: Thu, 22 Feb 2024 09:28:47 GMT
- Title: Gradual Residuals Alignment: A Dual-Stream Framework for GAN Inversion and Image Attribute Editing
- Authors: Hao Li, Mengqi Huang, Lei Zhang, Bo Hu, Yi Liu, Zhendong Mao
- Abstract summary: GAN-based image editing first leverages GAN Inversion to project real images into the latent space of a GAN and then manipulates the corresponding latent codes.
Recent inversion methods mainly utilize additional high-bit features to improve image detail preservation.
During editing, however, existing works fail to accurately complement the lost details and suffer from poor editability.
- Score: 36.01737879983636
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: GAN-based image attribute editing first leverages GAN Inversion to project real images into the latent space of a GAN and then manipulates the corresponding latent codes. Recent inversion methods mainly utilize additional high-bit features to improve image detail preservation, as low-bit codes cannot faithfully reconstruct source images, leading to the loss of details. However, during editing, existing works fail to accurately complement the lost details and suffer from poor editability. The main reason is that they inject all the lost details indiscriminately at once, which inherently causes the position and quantity of the details to overfit the source image, resulting in inconsistent content and artifacts in edited images. This work argues that details should be gradually injected into both the reconstruction and the editing process in a multi-stage, coarse-to-fine manner for better detail preservation and high editability. Therefore, a novel dual-stream framework is proposed to accurately complement details at each stage. The Reconstruction Stream is employed to embed coarse-to-fine lost details into residual features and then adaptively add them to the GAN generator. In the Editing Stream, residual features are accurately aligned by our Selective Attention mechanism and then injected into the editing process in a multi-stage manner. Extensive experiments show the superiority of our framework in both reconstruction accuracy and editing quality compared with existing methods.
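To make the multi-stage idea concrete, below is a minimal PyTorch sketch of gradual residual injection with attention-based alignment. The module names, shapes, and the single fixed resolution are illustrative assumptions for this listing, not the authors' implementation.

```python
# Hedged sketch: per-stage residuals are aligned to the edited features by a
# cross-attention block, then added, stage by stage (coarse to fine).
import torch
import torch.nn as nn

class SelectiveAttention(nn.Module):
    """Aligns reconstruction residuals to edited features via cross-attention."""
    def __init__(self, channels):
        super().__init__()
        self.query = nn.Conv2d(channels, channels, 1)  # queries from edited features
        self.key = nn.Conv2d(channels, channels, 1)    # keys from residual details
        self.value = nn.Conv2d(channels, channels, 1)

    def forward(self, edited_feat, residual):
        b, c, h, w = edited_feat.shape
        q = self.query(edited_feat).flatten(2)   # (B, C, HW)
        k = self.key(residual).flatten(2)        # (B, C, HW)
        v = self.value(residual).flatten(2)      # (B, C, HW)
        attn = torch.softmax(q.transpose(1, 2) @ k / c ** 0.5, dim=-1)  # (B, HW, HW)
        return (v @ attn.transpose(1, 2)).view(b, c, h, w)

def edit_with_gradual_residuals(stages, feat, residuals, aligners):
    """Inject coarse-to-fine residuals stage by stage during editing."""
    for stage, res, align in zip(stages, residuals, aligners):
        feat = stage(feat)                # one generator stage of the edit
        feat = feat + align(feat, res)    # add aligned lost details at this stage
    return feat

# Toy usage: 3 stages at a fixed 16x16 resolution for simplicity.
stages = nn.ModuleList(nn.Conv2d(64, 64, 3, padding=1) for _ in range(3))
aligners = nn.ModuleList(SelectiveAttention(64) for _ in range(3))
residuals = [torch.randn(1, 64, 16, 16) for _ in range(3)]
out = edit_with_gradual_residuals(stages, torch.randn(1, 64, 16, 16), residuals, aligners)
print(out.shape)  # torch.Size([1, 64, 16, 16])
```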
Related papers
- The Devil is in the Details: StyleFeatureEditor for Detail-Rich StyleGAN Inversion and High Quality Image Editing [3.58736715327935]
We introduce StyleFeatureEditor, a novel method that enables editing in both w-latents and F-latents.
We also present a new training pipeline specifically designed to train our model to accurately edit F-latents.
Our method is compared with state-of-the-art encoding approaches, demonstrating that our model excels in terms of reconstruction quality.
arXiv Detail & Related papers (2024-06-15T11:28:32Z)
- Warping the Residuals for Image Editing with StyleGAN [5.733811543584874]
StyleGAN models show editing capabilities via their semantically interpretable latent organizations.
Many works have been proposed for inverting images into StyleGAN's latent space.
We present a novel image inversion architecture that extracts high-rate latent features and includes a flow estimation module.
arXiv Detail & Related papers (2023-12-18T18:24:18Z)
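The entry above mentions a flow estimation module for warping residual features. Below is a toy, self-contained sketch of flow-based feature warping with grid_sample; the random flow and the function names are assumptions for illustration, not the paper's code.

```python
# Hedged sketch: warp features (B,C,H,W) by a flow field (B,2,H,W) of pixel
# offsets, using the normalized-coordinate grid that grid_sample expects.
import torch
import torch.nn.functional as F

def warp(features, flow):
    b, _, h, w = features.shape
    ys, xs = torch.meshgrid(torch.arange(h), torch.arange(w), indexing="ij")
    grid = torch.stack((xs, ys), dim=-1).float().unsqueeze(0).to(features)  # (1,H,W,2)
    sample = grid + flow.permute(0, 2, 3, 1)              # add (dx, dy) offsets
    sample[..., 0] = 2 * sample[..., 0] / (w - 1) - 1     # normalize x to [-1, 1]
    sample[..., 1] = 2 * sample[..., 1] / (h - 1) - 1     # normalize y to [-1, 1]
    return F.grid_sample(features, sample, align_corners=True)

feats = torch.randn(2, 64, 16, 16)
flow = torch.randn(2, 2, 16, 16)   # stand-in for a predicted flow field
aligned = warp(feats, flow)
```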
- Spatial-Contextual Discrepancy Information Compensation for GAN Inversion [67.21442893265973]
We introduce a novel spatial-contextual discrepancy information compensation-based GAN-inversion method (SDIC).
SDIC bridges the gap in image details between the original image and the reconstructed/edited image.
Our proposed method achieves an excellent distortion-editability trade-off at a fast inference speed for both image inversion and editing tasks.
arXiv Detail & Related papers (2023-12-12T08:58:56Z)
- In-Domain GAN Inversion for Faithful Reconstruction and Editability [132.68255553099834]
We propose in-domain GAN inversion, which consists of a domain-guided encoder and a domain-regularized optimization to regularize the inverted code in the native latent space of the pre-trained GAN model.
We conduct comprehensive analyses of the effects of the encoder structure, the starting inversion point, and the inversion parameter space, and observe the trade-off between reconstruction quality and editing property.
arXiv Detail & Related papers (2023-09-25T08:42:06Z)
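The in-domain inversion entry above combines a domain-guided encoder with a domain-regularized optimization. The sketch below shows the general encoder-initialized, regularized latent optimization pattern; the tiny encoder/generator and the simple stay-near-the-encoder regularizer are stand-ins, not the paper's actual design.

```python
# Hedged sketch: initialize the latent from an encoder, then refine it by
# gradient descent while penalizing drift away from the encoder's code.
import torch
import torch.nn as nn

G = nn.Sequential(nn.Linear(128, 3 * 32 * 32), nn.Unflatten(1, (3, 32, 32)))
E = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 128))

def invert(image, steps=100, lr=0.01, reg_weight=0.1):
    z0 = E(image).detach()                  # encoder-guided initialization
    z = z0.clone().requires_grad_(True)
    opt = torch.optim.Adam([z], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        loss = ((G(z) - image) ** 2).mean()                 # reconstruction term
        loss = loss + reg_weight * ((z - z0) ** 2).mean()   # simple regularizer
        loss.backward()
        opt.step()
    return z.detach()

img = torch.rand(1, 3, 32, 32)
code = invert(img)
```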
- StyleRes: Transforming the Residuals for Real Image Editing with StyleGAN [4.7590051176368915]
Inverting real images into StyleGAN's latent space is an extensively studied problem.
The trade-off between image reconstruction fidelity and image editing quality remains an open challenge.
We present a novel image inversion framework and a training pipeline to achieve high-fidelity image inversion with high-quality editing.
arXiv Detail & Related papers (2022-12-29T16:14:09Z)
- Editing Out-of-domain GAN Inversion via Differential Activations [56.62964029959131]
We propose a novel GAN prior based editing framework to tackle the out-of-domain inversion problem with a composition-decomposition paradigm.
With the aid of the generated Diff-CAM mask, a coarse reconstruction can be intuitively composited from the paired original and edited images.
In the decomposition phase, we further present a GAN prior based deghosting network for separating the final fine edited image from the coarse reconstruction.
arXiv Detail & Related papers (2022-07-17T10:34:58Z)
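The composition phase described above blends the original and edited images under a mask. A minimal sketch of that blending step, with a random tensor standing in for the predicted Diff-CAM mask:

```python
# Hedged sketch of mask-based compositing; the mask here is random, whereas
# Diff-CAM would predict it from differential activations.
import torch

def composite(original, edited, mask):
    """mask in [0,1], shape (B,1,H,W): 1 keeps the edit, 0 keeps the original."""
    return mask * edited + (1.0 - mask) * original

orig_img = torch.rand(1, 3, 64, 64)
edit_img = torch.rand(1, 3, 64, 64)
mask = torch.rand(1, 1, 64, 64)      # stand-in for a predicted Diff-CAM mask
coarse = composite(orig_img, edit_img, mask)
```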
- High-Fidelity GAN Inversion for Image Attribute Editing [61.966946442222735]
We present a novel high-fidelity generative adversarial network (GAN) inversion framework that enables attribute editing with image-specific details well-preserved.
With a low bit-rate latent code, previous works have difficulties in preserving high-fidelity details in reconstructed and edited images.
We propose a distortion consultation approach that employs a distortion map as a reference for high-fidelity reconstruction.
arXiv Detail & Related papers (2021-09-14T11:23:48Z)
- In-Domain GAN Inversion for Real Image Editing [56.924323432048304]
A common practice of feeding a real image to a trained GAN generator is to invert it back to a latent code.
Existing inversion methods typically focus on reconstructing the target image by pixel values yet fail to land the inverted code in the semantic domain of the original latent space.
We propose an in-domain GAN inversion approach, which faithfully reconstructs the input image and ensures that the inverted code is semantically meaningful for editing.
arXiv Detail & Related papers (2020-03-31T18:20:18Z)
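Across all of the papers above, the common invert-then-edit workflow is: encode the image to a latent code, shift the code along a semantic direction, and decode. A minimal runnable sketch, with toy networks and a hypothetical attribute direction standing in for a pretrained StyleGAN and a learned direction:

```python
# Hedged sketch of the generic GAN inversion + attribute-editing pipeline.
import torch
import torch.nn as nn

encoder = nn.Sequential(nn.Flatten(), nn.Linear(3 * 64 * 64, 512))   # image -> latent
generator = nn.Sequential(nn.Linear(512, 3 * 64 * 64), nn.Unflatten(1, (3, 64, 64)))

def invert_and_edit(image, direction, strength=1.0):
    latent = encoder(image)                  # project the image into latent space
    edited = latent + strength * direction   # move along a semantic direction
    return generator(edited)                 # synthesize the edited image

img = torch.randn(1, 3, 64, 64)
smile_direction = torch.randn(512)           # hypothetical attribute direction
out = invert_and_edit(img, smile_direction, strength=0.8)
print(out.shape)  # torch.Size([1, 3, 64, 64])
```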
This list is automatically generated from the titles and abstracts of the papers on this site.