Spatially-Adaptive Multilayer Selection for GAN Inversion and Editing
- URL: http://arxiv.org/abs/2206.08357v1
- Date: Thu, 16 Jun 2022 17:57:49 GMT
- Title: Spatially-Adaptive Multilayer Selection for GAN Inversion and Editing
- Authors: Gaurav Parmar, Yijun Li, Jingwan Lu, Richard Zhang, Jun-Yan Zhu,
Krishna Kumar Singh
- Abstract summary: We propose a new method to invert and edit complex images in the latent space of GANs, such as StyleGAN2.
Our key idea is to explore inversion with a collection of layers, spatially adapting the inversion process to the difficulty of the image.
- Score: 57.46189236379433
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Existing GAN inversion and editing methods work well for aligned objects with
a clean background, such as portraits and animal faces, but often struggle for
more difficult categories with complex scene layouts and object occlusions,
such as cars, animals, and outdoor images. We propose a new method to invert
and edit such complex images in the latent space of GANs, such as StyleGAN2.
Our key idea is to explore inversion with a collection of layers, spatially
adapting the inversion process to the difficulty of the image. We learn to
predict the "invertibility" of different image segments and project each
segment into a latent layer. Easier regions can be inverted into an earlier
layer in the generator's latent space, while more challenging regions can be
inverted into a later feature space. Experiments show that our method obtains
better inversion results compared to recent approaches on complex
categories, while maintaining downstream editability. Please refer to our
project page at https://www.cs.cmu.edu/~SAMInversion.
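Below is a minimal, hypothetical sketch of the core idea described in the abstract: score how invertible each image segment is, then route easy segments to an early latent layer (e.g., W+) and harder segments to later intermediate feature layers. The function names, the variance-based stand-in for the learned invertibility predictor, and the thresholds are illustrative assumptions, not the authors' implementation.

```python
import torch

def predict_invertibility(image: torch.Tensor, masks: torch.Tensor) -> torch.Tensor:
    """Stand-in for a learned per-segment invertibility predictor (hypothetical).

    image: (3, H, W) tensor; masks: (K, H, W) binary segment masks.
    Returns one difficulty score per segment (higher = harder to invert).
    Per-segment pixel variance is used purely as an illustrative proxy here.
    """
    scores = []
    for m in masks:
        pixels = image[:, m.bool()]                      # (3, num_pixels_in_segment)
        score = pixels.var() if pixels.numel() > 1 else torch.tensor(0.0)
        scores.append(score.clamp(0.0, 1.0))
    return torch.stack(scores)


def assign_layers(scores: torch.Tensor, thresholds=(0.2, 0.5, 0.8)) -> torch.Tensor:
    """Map each segment's difficulty score to a generator layer index.

    Index 0 stands for an early latent space (e.g. W+); larger indices stand for
    later intermediate feature spaces of the generator. Thresholds are illustrative.
    """
    return torch.bucketize(scores, torch.tensor(thresholds))


if __name__ == "__main__":
    H = W = 64
    # Toy image: flat left half ("easy" region), noisy right half ("hard" region).
    image = torch.zeros(3, H, W)
    image[:, :, W // 2:] = torch.rand(3, H, W // 2) * 2 - 1

    # Two toy segment masks (in practice these would come from a segmentation model).
    masks = torch.zeros(2, H, W)
    masks[0, :, : W // 2] = 1
    masks[1, :, W // 2:] = 1

    scores = predict_invertibility(image, masks)
    layers = assign_layers(scores)
    for k, (s, l) in enumerate(zip(scores.tolist(), layers.tolist())):
        print(f"segment {k}: difficulty={s:.2f} -> invert into layer {l}")
```

The intuition behind this routing, consistent with the distortion-editability tradeoff noted in the related work below, is that earlier latent layers remain more editable while later feature layers can reconstruct difficult regions more faithfully.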
Related papers
- Move Anything with Layered Scene Diffusion [77.45870343845492]
We propose SceneDiffusion to optimize a layered scene representation during the diffusion sampling process.
Our key insight is that spatial disentanglement can be obtained by jointly denoising scene renderings at different spatial layouts.
Our generated scenes support a wide range of spatial editing operations, including moving, resizing, cloning, and layer-wise appearance editing operations.
arXiv Detail & Related papers (2024-04-10T17:28:16Z)
- MULAN: A Multi Layer Annotated Dataset for Controllable Text-to-Image Generation [54.64194935409982]
We introduce MuLAn: a novel dataset comprising over 44K MUlti-Layer-wise RGBA decompositions.
MuLAn is the first photorealistic resource providing instance decomposition and spatial information for high quality images.
We aim to encourage the development of novel generation and editing technology, in particular layer-wise solutions.
arXiv Detail & Related papers (2024-04-03T14:58:00Z)
- Diverse Inpainting and Editing with GAN Inversion [4.234367850767171]
Recent inversion methods have shown that real images can be inverted into StyleGAN's latent space.
In this paper, we tackle an even more difficult task: inverting erased images into the GAN's latent space for realistic inpainting and editing.
arXiv Detail & Related papers (2023-07-27T17:41:36Z)
- Parallax-Tolerant Unsupervised Deep Image Stitching [57.76737888499145]
We propose UDIS++, a parallax-tolerant unsupervised deep image stitching technique.
First, we propose a robust and flexible warp to model the image registration from global homography to local thin-plate spline motion.
To further eliminate parallax artifacts, we propose to seamlessly composite the stitched image via unsupervised learning of seam-driven composition masks.
arXiv Detail & Related papers (2023-02-16T10:40:55Z)
- Spatial Steerability of GANs via Self-Supervision from Discriminator [123.27117057804732]
We propose a self-supervised approach to improve the spatial steerability of GANs without searching for steerable directions in the latent space.
Specifically, we design randomly sampled Gaussian heatmaps to be encoded into the intermediate layers of generative models as spatial inductive bias.
During inference, users can interact with the spatial heatmaps in an intuitive manner, enabling them to edit the output image by adjusting the scene layout, moving, or removing objects.
arXiv Detail & Related papers (2023-01-20T07:36:29Z)
- Overparameterization Improves StyleGAN Inversion [66.8300251627992]
Existing inversion approaches obtain promising yet imperfect results.
We show that overparameterizing the latent space allows us to obtain near-perfect image reconstruction without the need for encoders.
Our approach also retains editability, which we demonstrate by realistically interpolating between images.
arXiv Detail & Related papers (2022-05-12T18:42:43Z)
- Barbershop: GAN-based Image Compositing using Segmentation Masks [40.85660781133709]
We present a novel solution to image blending, particularly for the problem of hairstyle transfer, based on GAN-inversion.
Our results demonstrate a significant improvement over the current state of the art in a user study, with users preferring our blending solution over 95 percent of the time.
arXiv Detail & Related papers (2021-06-02T23:20:43Z)
- Designing an Encoder for StyleGAN Image Manipulation [38.909059126878354]
We study the latent space of StyleGAN, the state-of-the-art unconditional generator.
We identify and analyze the existence of a distortion-editability tradeoff and a distortion-perception tradeoff within the StyleGAN latent space.
We present an encoder based on our two principles that is specifically designed for facilitating editing on real images.
arXiv Detail & Related papers (2021-02-04T17:52:38Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.