Warping the Residuals for Image Editing with StyleGAN
- URL: http://arxiv.org/abs/2312.11422v1
- Date: Mon, 18 Dec 2023 18:24:18 GMT
- Title: Warping the Residuals for Image Editing with StyleGAN
- Authors: Ahmet Burak Yildirim, Hamza Pehlivan, Aysegul Dundar
- Abstract summary: StyleGAN models show editing capabilities via their semantically interpretable latent organizations.
Many works have been proposed for inverting images into StyleGAN's latent space.
We present a novel image inversion architecture that extracts high-rate latent features and includes a flow estimation module.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: StyleGAN models show editing capabilities via their semantically
interpretable latent organizations which require successful GAN inversion
methods to edit real images. Many works have been proposed for inverting images
into StyleGAN's latent space. However, their results either suffer from low
fidelity to the input image or poor editing qualities, especially for edits
that require large transformations. That is because low-rate latent spaces lose
many image details due to the information bottleneck, even though they provide an
editable space. On the other hand, higher-rate latent spaces can pass all the
image details to StyleGAN for perfect reconstruction of images but suffer from
low editing qualities. In this work, we present a novel image inversion
architecture that extracts high-rate latent features and includes a flow
estimation module to warp these features to adapt them to edits. The flows are
estimated from StyleGAN features of edited and unedited latent codes. By
estimating the high-rate features and warping them for edits, we achieve both
high fidelity to the input image and high-quality edits. We run extensive
experiments and compare our method with state-of-the-art inversion methods.
Quantitative metrics and visual comparisons show significant improvements.
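The core idea above — warping high-rate features with an estimated flow field so they agree with an edit — can be illustrated with a minimal sketch. In the paper, the flow is predicted by a learned module from StyleGAN features of the edited and unedited latent codes; here the flow is simply taken as given, and the `warp_features` helper, its array layout, and bilinear sampling are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def warp_features(features, flow):
    """Warp a feature map with a dense flow field via bilinear sampling.

    features: (C, H, W) array of high-rate features.
    flow:     (2, H, W) array of per-pixel (dx, dy) sampling offsets.
    Returns the warped (C, H, W) feature map.
    """
    C, H, W = features.shape
    ys, xs = np.meshgrid(np.arange(H), np.arange(W), indexing="ij")
    # Source coordinates each output pixel samples from, clamped to the grid.
    sx = np.clip(xs + flow[0], 0, W - 1)
    sy = np.clip(ys + flow[1], 0, H - 1)
    x0 = np.floor(sx).astype(int)
    y0 = np.floor(sy).astype(int)
    x1 = np.minimum(x0 + 1, W - 1)
    y1 = np.minimum(y0 + 1, H - 1)
    wx = sx - x0  # fractional weights for bilinear interpolation
    wy = sy - y0
    return ((1 - wy) * (1 - wx) * features[:, y0, x0]
            + (1 - wy) * wx * features[:, y0, x1]
            + wy * (1 - wx) * features[:, y1, x0]
            + wy * wx * features[:, y1, x1])
```

A zero flow leaves the features untouched (perfect reconstruction), while a nonzero flow relocates feature content — which is why warped high-rate features can follow large geometric edits instead of fighting them.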
Related papers
- Task-Oriented Diffusion Inversion for High-Fidelity Text-based Editing [60.730661748555214]
We introduce Task-Oriented Diffusion Inversion (TODInv), a novel framework that inverts and edits real images tailored to specific editing tasks.
TODInv seamlessly integrates inversion and editing through reciprocal optimization, ensuring both high fidelity and precise editability.
arXiv Detail & Related papers (2024-08-23T22:16:34Z) - The Devil is in the Details: StyleFeatureEditor for Detail-Rich StyleGAN Inversion and High Quality Image Editing [3.58736715327935]
We introduce StyleFeatureEditor, a novel method that enables editing in both w-latents and F-latents.
We also present a new training pipeline specifically designed to train our model to accurately edit F-latents.
Our method is compared with state-of-the-art encoding approaches, demonstrating that our model excels in terms of reconstruction quality.
arXiv Detail & Related papers (2024-06-15T11:28:32Z) - Optimisation-Based Multi-Modal Semantic Image Editing [58.496064583110694]
We propose an inference-time editing optimisation to accommodate multiple editing instruction types.
By allowing the influence of each loss function to be adjusted, we build a flexible editing solution that can be tuned to user preferences.
We evaluate our method using text, pose and scribble edit conditions, and highlight our ability to achieve complex edits.
arXiv Detail & Related papers (2023-11-28T15:31:11Z) - Diverse Inpainting and Editing with GAN Inversion [4.234367850767171]
Recent inversion methods have shown that real images can be inverted into StyleGAN's latent space.
In this paper, we tackle an even more difficult task: inverting erased images into the GAN's latent space for realistic inpainting and editing.
arXiv Detail & Related papers (2023-07-27T17:41:36Z) - LEDITS: Real Image Editing with DDPM Inversion and Semantic Guidance [0.0]
LEDITS is a lightweight combined approach for real-image editing that couples the Edit Friendly DDPM inversion technique with Semantic Guidance.
It achieves versatile edits, from subtle to extensive, including alterations in composition and style, while requiring neither optimization nor architectural extensions.
arXiv Detail & Related papers (2023-07-02T09:11:09Z) - StyleDiffusion: Prompt-Embedding Inversion for Text-Based Editing [86.92711729969488]
We exploit the strong capabilities of pretrained diffusion models for image editing.
Existing methods either finetune the model or invert the image into the latent space of the pretrained model.
Both suffer from two problems: unsatisfying results in selected regions, and unexpected changes in non-selected regions.
arXiv Detail & Related papers (2023-03-28T00:16:45Z) - StyleRes: Transforming the Residuals for Real Image Editing with StyleGAN [4.7590051176368915]
Inverting real images into StyleGAN's latent space is an extensively studied problem.
Trade-off between the image reconstruction fidelity and image editing quality remains an open challenge.
We present a novel image inversion framework and a training pipeline to achieve high-fidelity image inversion with high-quality editing.
arXiv Detail & Related papers (2022-12-29T16:14:09Z) - Overparameterization Improves StyleGAN Inversion [66.8300251627992]
Existing inversion approaches obtain promising yet imperfect results.
We show that overparameterizing the latent space allows us to obtain near-perfect image reconstruction without the need for encoders.
Our approach also retains editability, which we demonstrate by realistically interpolating between images.
arXiv Detail & Related papers (2022-05-12T18:42:43Z) - EditGAN: High-Precision Semantic Image Editing [120.49401527771067]
EditGAN is a novel method for high quality, high precision semantic image editing.
We show that EditGAN can manipulate images with an unprecedented level of detail and freedom.
We can also easily combine multiple edits and perform plausible edits beyond EditGAN training data.
arXiv Detail & Related papers (2021-11-04T22:36:33Z) - High-Fidelity GAN Inversion for Image Attribute Editing [61.966946442222735]
We present a novel high-fidelity generative adversarial network (GAN) inversion framework that enables attribute editing with image-specific details well-preserved.
With a low bit-rate latent code, previous works have difficulty preserving high-fidelity details in reconstructed and edited images.
We propose a distortion consultation approach that employs a distortion map as a reference for high-fidelity reconstruction.
arXiv Detail & Related papers (2021-09-14T11:23:48Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.