High-Fidelity GAN Inversion for Image Attribute Editing
- URL: http://arxiv.org/abs/2109.06590v2
- Date: Wed, 15 Sep 2021 12:07:08 GMT
- Title: High-Fidelity GAN Inversion for Image Attribute Editing
- Authors: Tengfei Wang, Yong Zhang, Yanbo Fan, Jue Wang, Qifeng Chen
- Abstract summary: We present a novel high-fidelity generative adversarial network (GAN) inversion framework that enables attribute editing with image-specific details well-preserved.
To achieve high-fidelity editing, we propose an adaptive distortion alignment (ADA) module with a self-supervised training scheme.
- Score: 44.54180180869355
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We present a novel high-fidelity generative adversarial network (GAN)
inversion framework that enables attribute editing with image-specific details
well-preserved (e.g., background, appearance and illumination). We first
formulate GAN inversion as a lossy data compression problem and carefully
discuss the Rate-Distortion-Edit trade-off. Due to this trade-off, previous
works fail to achieve high-fidelity reconstruction while keeping compelling
editing ability with a low bit-rate latent code only. In this work, we propose
a distortion consultation approach that employs the distortion map as a
reference for reconstruction. In the distortion consultation inversion (DCI),
the distortion map is first projected to a high-rate latent map, which then
complements the basic low-rate latent code with (lost) details via consultation
fusion. To achieve high-fidelity editing, we propose an adaptive distortion
alignment (ADA) module with a self-supervised training scheme. Extensive
experiments in the face and car domains show a clear improvement in terms of
both inversion and editing quality.
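The distortion-consultation pipeline described in the abstract can be sketched as a toy linear version; all names and shapes below (`E_low`, `E_high`, `F`, the dimensions) are hypothetical placeholders standing in for the paper's learned deep networks, not the authors' actual implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-ins for the learned components (hypothetical shapes/names;
# the paper's actual encoder, generator, and fusion modules are deep nets).
D_IMG, D_LOW, D_HIGH = 64, 8, 32                      # image / low-rate / high-rate dims
E_low  = rng.standard_normal((D_LOW,  D_IMG)) * 0.1   # basic encoder: image -> low-rate code
G      = rng.standard_normal((D_IMG,  D_LOW)) * 0.1   # generator: low-rate code -> image
E_high = rng.standard_normal((D_HIGH, D_IMG)) * 0.1   # projects distortion map -> high-rate latent map
F      = rng.standard_normal((D_IMG,  D_HIGH)) * 0.1  # consultation-fusion readout

def invert_with_consultation(x):
    """Distortion consultation inversion (DCI), linear toy version.

    1) Encode x to a low-rate latent code and reconstruct coarsely.
    2) Compute the distortion map: the details the low-rate code lost.
    3) Project the distortion map to a high-rate latent map and fuse it
       back into the coarse reconstruction (consultation fusion).
    """
    w = E_low @ x                       # low-rate latent code
    x_coarse = G @ w                    # coarse reconstruction
    distortion = x - x_coarse           # distortion map (lost image-specific details)
    z_high = E_high @ distortion        # high-rate latent map
    x_refined = x_coarse + F @ z_high   # consultation fusion
    return x_coarse, x_refined

x = rng.standard_normal(D_IMG)
coarse, refined = invert_with_consultation(x)
```

The design point the sketch illustrates is the Rate-Distortion-Edit trade-off: the low-rate code `w` stays compact and editable, while the high-rate map `z_high` carries only the residual details, so editing `w` need not destroy them.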
Related papers
- Spatial-Contextual Discrepancy Information Compensation for GAN Inversion [67.21442893265973]
We introduce a novel spatial-contextual discrepancy information compensation-based GAN-inversion method (SDIC).
SDIC bridges the gap in image details between the original image and the reconstructed/edited image.
Our proposed method achieves an excellent distortion-editability trade-off at a fast inference speed for both image inversion and editing tasks.
arXiv Detail & Related papers (2023-12-12T08:58:56Z) - In-Domain GAN Inversion for Faithful Reconstruction and Editability [132.68255553099834]
We propose in-domain GAN inversion, which consists of a domain-guided encoder and domain-regularized optimization to keep the inverted code in the native latent space of the pre-trained GAN model.
We make comprehensive analyses on the effects of the encoder structure, the starting inversion point, as well as the inversion parameter space, and observe the trade-off between the reconstruction quality and the editing property.
arXiv Detail & Related papers (2023-09-25T08:42:06Z) - ReGANIE: Rectifying GAN Inversion Errors for Accurate Real Image Editing [20.39792009151017]
StyleGAN allows for flexible and plausible editing of generated images by manipulating the semantic-rich latent style space.
Projecting a real image into its latent space encounters an inherent trade-off between inversion quality and editability.
We propose a novel two-phase framework by designating two separate networks to tackle editing and reconstruction respectively.
arXiv Detail & Related papers (2023-01-31T04:38:42Z) - StyleRes: Transforming the Residuals for Real Image Editing with StyleGAN [4.7590051176368915]
Inverting real images into StyleGAN's latent space is an extensively studied problem.
Trade-off between the image reconstruction fidelity and image editing quality remains an open challenge.
We present a novel image inversion framework and a training pipeline to achieve high-fidelity image inversion with high-quality editing.
arXiv Detail & Related papers (2022-12-29T16:14:09Z) - Cycle Encoding of a StyleGAN Encoder for Improved Reconstruction and Editability [76.6724135757723]
GAN inversion aims to invert an input image into the latent space of a pre-trained GAN.
Despite the recent advances in GAN inversion, there remain challenges to mitigate the tradeoff between distortion and editability.
We propose a two-step approach that first inverts the input image into a latent code, called pivot code, and then alters the generator so that the input image can be accurately mapped into the pivot code.
arXiv Detail & Related papers (2022-07-19T16:10:16Z) - Editing Out-of-domain GAN Inversion via Differential Activations [56.62964029959131]
We propose a novel GAN prior based editing framework to tackle the out-of-domain inversion problem with a composition-decomposition paradigm.
With the aid of the generated Diff-CAM mask, a coarse reconstruction can intuitively be composited by the paired original and edited images.
In the decomposition phase, we further present a GAN prior based deghosting network for separating the final fine edited image from the coarse reconstruction.
arXiv Detail & Related papers (2022-07-17T10:34:58Z) - Designing an Encoder for StyleGAN Image Manipulation [38.909059126878354]
We study the latent space of StyleGAN, the state-of-the-art unconditional generator.
We identify and analyze the existence of a distortion-editability tradeoff and a distortion-perception tradeoff within the StyleGAN latent space.
We present an encoder based on our two principles that is specifically designed for facilitating editing on real images.
arXiv Detail & Related papers (2021-02-04T17:52:38Z) - In-Domain GAN Inversion for Real Image Editing [56.924323432048304]
A common practice of feeding a real image to a trained GAN generator is to invert it back to a latent code.
Existing inversion methods typically focus on reconstructing the target image by pixel values yet fail to land the inverted code in the semantic domain of the original latent space.
We propose an in-domain GAN inversion approach, which faithfully reconstructs the input image and ensures the inverted code to be semantically meaningful for editing.
arXiv Detail & Related papers (2020-03-31T18:20:18Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.