Surrogate Gradient Field for Latent Space Manipulation
- URL: http://arxiv.org/abs/2104.09065v2
- Date: Tue, 20 Apr 2021 15:55:27 GMT
- Title: Surrogate Gradient Field for Latent Space Manipulation
- Authors: Minjun Li, Yanghua Jin, Huachun Zhu
- Abstract summary: Generative adversarial networks (GANs) can generate high-quality images from sampled latent codes.
Recent works attempt to edit an image by manipulating its underlying latent code, but rarely go beyond the basic task of attribute adjustment.
We propose the first method that enables manipulation with multidimensional conditions such as keypoints and captions.
- Score: 4.880243880711163
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Generative adversarial networks (GANs) can generate high-quality images from
sampled latent codes. Recent works attempt to edit an image by manipulating its
underlying latent code, but rarely go beyond the basic task of attribute
adjustment. We propose the first method that enables manipulation with
multidimensional conditions such as keypoints and captions. Specifically, we
design an algorithm that searches for a new latent code that satisfies the
target condition based on the Surrogate Gradient Field (SGF) induced by an
auxiliary mapping network. For quantitative comparison, we propose a metric to
evaluate the disentanglement of manipulation methods. Thorough experimental
analysis on the facial attribute adjustment task shows that our method
outperforms state-of-the-art methods in disentanglement. We further apply our
method to tasks of various condition modalities to demonstrate that our method
can alter complex image properties such as keypoints and captions.
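The abstract describes searching for a new latent code whose predicted condition matches a target, guided by gradients from an auxiliary mapping network. As a rough illustration (not the paper's actual SGF algorithm), the core idea can be sketched as gradient descent on a latent code `z` so that a condition predictor `f(z)` approaches a target condition; here the auxiliary network is stood in for by a fixed linear map `W`, and all names and values are illustrative assumptions.

```python
# Hedged sketch: gradient-based latent-code search toward a target condition.
# The auxiliary mapping network is approximated by a linear map f(z) = W z.

def matvec(W, z):
    """Multiply matrix W (list of rows) by vector z."""
    return [sum(w * x for w, x in zip(row, z)) for row in W]

def search_latent(W, z0, target, step=0.1, iters=200):
    """Gradient descent on ||W z - target||^2 over the latent code z."""
    z = list(z0)
    for _ in range(iters):
        residual = [p - t for p, t in zip(matvec(W, z), target)]
        # gradient of the squared error w.r.t. z is 2 * W^T residual
        grad = [2 * sum(W[i][j] * residual[i] for i in range(len(W)))
                for j in range(len(z))]
        z = [zj - step * g for zj, g in zip(z, grad)]
    return z

W = [[1.0, 0.0], [0.0, 2.0]]        # toy stand-in condition predictor
z = search_latent(W, [0.0, 0.0], target=[1.0, 4.0])
print([round(v, 3) for v in z])     # converges toward [1.0, 2.0]
```

In the actual method, the auxiliary network is learned and nonlinear, and the surrogate gradient field keeps the edited code on the latent manifold; this sketch only shows the condition-matching search loop.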
Related papers
- Diffusion Model-Based Image Editing: A Survey [46.244266782108234]
Denoising diffusion models have emerged as a powerful tool for various image generation and editing tasks.
We provide an exhaustive overview of existing methods using diffusion models for image editing.
To further evaluate the performance of text-guided image editing algorithms, we propose a systematic benchmark, EditEval.
arXiv Detail & Related papers (2024-02-27T14:07:09Z)
- DiffuseGAE: Controllable and High-fidelity Image Manipulation from Disentangled Representation [14.725538019917625]
Diffusion probabilistic models (DPMs) have shown remarkable results on various image synthesis tasks.
DPMs lack a low-dimensional, interpretable, and well-decoupled latent code.
We propose Diff-AE to explore the potential of DPMs for representation learning via autoencoding.
arXiv Detail & Related papers (2023-07-12T04:11:08Z)
- Semantic Image Synthesis via Diffusion Models [159.4285444680301]
Denoising Diffusion Probabilistic Models (DDPMs) have achieved remarkable success in various image generation tasks.
Recent work on semantic image synthesis mainly follows the de facto standard of Generative Adversarial Nets (GANs).
arXiv Detail & Related papers (2022-06-30T18:31:51Z)
- ObjectFormer for Image Manipulation Detection and Localization [118.89882740099137]
We propose ObjectFormer to detect and localize image manipulations.
We extract high-frequency features of the images and combine them with RGB features as multimodal patch embeddings.
We conduct extensive experiments on various datasets and the results verify the effectiveness of the proposed method.
arXiv Detail & Related papers (2022-03-28T12:27:34Z)
- Attribute-specific Control Units in StyleGAN for Fine-grained Image Manipulation [57.99007520795998]
We discover attribute-specific control units, which consist of multiple channels of feature maps and modulation styles.
Specifically, we collaboratively manipulate the modulation style channels and feature maps in control units to obtain the semantic and spatial disentangled controls.
We move the modulation style along a specific sparse direction vector and replace the filter-wise styles used to compute the feature maps to manipulate these control units.
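The manipulation described above (moving a modulation style along a sparse direction vector) is, at its core, a linear edit of a style vector. A minimal sketch of that linear-edit idea, with a hypothetical style vector and hand-picked sparse direction rather than anything learned from a real StyleGAN:

```python
# Hedged sketch: shifting a style vector along a sparse direction.
# The vector, direction, and strength alpha are illustrative assumptions.

def edit_style(w, direction, alpha):
    """Return w moved along `direction`, scaled by edit strength alpha."""
    return [wi + alpha * di for wi, di in zip(w, direction)]

w = [0.5, -0.2, 1.0, 0.0]
sparse_dir = [0.0, 1.0, 0.0, 0.0]   # touches only a single style channel
edited = edit_style(w, sparse_dir, alpha=0.8)
print([round(v, 6) for v in edited])
```

A sparse direction keeps the edit localized to a few channels, which is what gives this family of methods its disentangled, attribute-specific control.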
arXiv Detail & Related papers (2021-11-25T10:42:10Z)
- Delta-GAN-Encoder: Encoding Semantic Changes for Explicit Image Editing, using Few Synthetic Samples [2.348633570886661]
We propose a novel method for learning to control any desired attribute in a pre-trained GAN's latent space.
We perform Sim2Real learning, relying on minimal samples to achieve an unlimited amount of continuous precise edits.
arXiv Detail & Related papers (2021-11-16T12:42:04Z)
- High Resolution Face Editing with Masked GAN Latent Code Optimization [0.0]
Face editing is a popular research topic in the computer vision community.
Recent proposed methods are based on either training a conditional encoder-decoder Generative Adversarial Network (GAN) in an end-to-end fashion or on defining an operation in the latent space of a pre-trained vanilla GAN generator model.
We propose a GAN embedding optimization procedure with spatial and semantic constraints.
arXiv Detail & Related papers (2021-03-20T08:39:41Z)
- Diverse Semantic Image Synthesis via Probability Distribution Modeling [103.88931623488088]
We propose a novel diverse semantic image synthesis framework.
Our method can achieve superior diversity and comparable quality compared to state-of-the-art methods.
arXiv Detail & Related papers (2021-03-11T18:59:25Z)
- Style Intervention: How to Achieve Spatial Disentanglement with Style-based Generators? [100.60938767993088]
We propose a lightweight optimization-based algorithm which could adapt to arbitrary input images and render natural translation effects under flexible objectives.
We verify the performance of the proposed framework in facial attribute editing on high-resolution images, where both photo-realism and consistency are required.
arXiv Detail & Related papers (2020-11-19T07:37:31Z)
- Semantic Photo Manipulation with a Generative Image Prior [86.01714863596347]
GANs are able to synthesize images conditioned on inputs such as user sketch, text, or semantic labels.
It is hard for GANs to precisely reproduce an input image.
In this paper, we address these issues by adapting the image prior learned by GANs to image statistics of an individual image.
Our method can accurately reconstruct the input image and synthesize new content, consistent with the appearance of the input image.
arXiv Detail & Related papers (2020-05-15T18:22:05Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.