Everything is There in Latent Space: Attribute Editing and Attribute
Style Manipulation by StyleGAN Latent Space Exploration
- URL: http://arxiv.org/abs/2207.09855v1
- Date: Wed, 20 Jul 2022 12:40:32 GMT
- Title: Everything is There in Latent Space: Attribute Editing and Attribute
Style Manipulation by StyleGAN Latent Space Exploration
- Authors: Rishubh Parihar, Ankit Dhiman, Tejan Karmali and R. Venkatesh Babu
- Abstract summary: We present Few-shot Latent-based Attribute Manipulation and Editing (FLAME).
FLAME is a framework for performing highly controlled image editing by latent space manipulation.
It generates diverse attribute styles in a disentangled manner.
- Score: 39.18239951479647
- License: http://creativecommons.org/publicdomain/zero/1.0/
- Abstract: Unconstrained image generation with high realism is now possible using recent
Generative Adversarial Networks (GANs). However, it is quite challenging to
generate images with a given set of attributes. Recent methods use style-based
GAN models to perform image editing by leveraging the semantic hierarchy
present in the layers of the generator. We present Few-shot Latent-based
Attribute Manipulation and Editing (FLAME), a simple yet effective framework to
perform highly controlled image editing by latent space manipulation.
Specifically, we estimate linear directions in the latent space (of a
pre-trained StyleGAN) that control semantic attributes in the generated image.
In contrast to previous methods that either rely on large-scale attribute
labeled datasets or attribute classifiers, FLAME uses minimal supervision of a
few curated image pairs to estimate disentangled edit directions. FLAME can
perform both individual and sequential edits with high precision on a diverse
set of images while preserving identity. Further, we propose a novel task of
Attribute Style Manipulation to generate diverse styles for attributes such as
eyeglasses and hair. We first encode a set of synthetic images of the same
identity but having different attribute styles in the latent space to estimate
an attribute style manifold. Sampling a new latent from this manifold will
result in a new attribute style in the generated image. We propose a novel
sampling method to sample latents from the manifold, enabling us to generate a
diverse set of attribute styles beyond the styles present in the training set.
FLAME can generate diverse attribute styles in a disentangled manner. We
illustrate the superior performance of FLAME against previous image editing
methods by extensive qualitative and quantitative comparisons. FLAME also
generalizes well to multiple datasets such as cars and churches.
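As a rough illustration of the few-shot direction estimation described above, the sketch below takes the normalized mean difference between paired latent codes. The (K, 512) W-space shapes, the mean-difference estimator, and the names estimate_edit_direction / apply_edit are illustrative assumptions for this summary, not the paper's released code.

```python
import numpy as np

def estimate_edit_direction(latents_without: np.ndarray,
                            latents_with: np.ndarray) -> np.ndarray:
    """Estimate a linear attribute edit direction in StyleGAN's latent space.

    Minimal sketch of the few-shot idea: given K curated pairs of latent
    codes that differ mainly in one attribute, take the normalized mean
    of the per-pair offsets. Shapes assumed: (K, 512) W latents.
    """
    diffs = latents_with - latents_without        # (K, 512) per-pair offsets
    direction = diffs.mean(axis=0)                # averages out identity-specific variation
    return direction / np.linalg.norm(direction)  # unit-norm edit direction

def apply_edit(w: np.ndarray, direction: np.ndarray,
               strength: float = 2.0) -> np.ndarray:
    """Move a latent along the direction; decoding the result with the
    pre-trained generator should change mainly the target attribute."""
    return w + strength * direction
```

In this reading, sequential edits amount to repeated calls to apply_edit with different estimated directions before decoding the latent.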
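The attribute style manifold can be sketched in the same spirit. The PCA-via-SVD fit below is only a stand-in for the paper's manifold estimation, and the Gaussian coefficient sampling is a simplification of FLAME's proposed sampling method; all names and shapes are hypothetical.

```python
import numpy as np

def fit_style_manifold(style_latents: np.ndarray, n_components: int = 8):
    """Fit a linear approximation of an attribute style manifold.

    style_latents: (N, 512) latent codes of same-identity images whose
    attribute style (e.g. eyeglass type) varies. PCA via SVD is used
    here only as a stand-in for the paper's manifold estimation.
    """
    mean = style_latents.mean(axis=0)
    centered = style_latents - mean
    _, s, vt = np.linalg.svd(centered, full_matrices=False)
    basis = vt[:n_components]                     # principal style directions
    scale = s[:n_components] / np.sqrt(len(style_latents))  # per-direction spread
    return mean, basis, scale

def sample_style(mean, basis, scale, rng=None):
    """Draw a new latent from the fitted manifold; decoding it should
    yield a novel style of the attribute beyond the training styles."""
    rng = rng or np.random.default_rng()
    coeffs = rng.standard_normal(len(scale)) * scale  # Gaussian coefficients
    return mean + coeffs @ basis                      # back to the 512-D latent space
```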
Related papers
- DiffuseGAE: Controllable and High-fidelity Image Manipulation from
Disentangled Representation [14.725538019917625]
Diffusion probabilistic models (DPMs) have shown remarkable results on various image synthesis tasks.
DPMs lack a low-dimensional, interpretable, and well-decoupled latent code.
Diffusion autoencoders (Diff-AE) were proposed to explore the potential of DPMs for representation learning via autoencoding.
arXiv Detail & Related papers (2023-07-12T04:11:08Z)
- Leveraging Off-the-shelf Diffusion Model for Multi-attribute Fashion Image Manipulation [27.587905673112473]
Fashion attribute editing is a task that aims to convert the semantic attributes of a given fashion image while preserving the irrelevant regions.
Previous works typically employ conditional GANs in which the generator explicitly learns the target attributes and directly executes the conversion.
We explore classifier-guided diffusion, leveraging an off-the-shelf diffusion model pretrained on a general visual corpus such as ImageNet.
arXiv Detail & Related papers (2022-10-12T02:21:18Z)
- ManiCLIP: Multi-Attribute Face Manipulation from Text [104.30600573306991]
We present a novel multi-attribute face manipulation method based on textual descriptions.
Our method generates natural manipulated faces with minimal text-irrelevant attribute editing.
arXiv Detail & Related papers (2022-10-02T07:22:55Z)
- Hierarchical Semantic Regularization of Latent Spaces in StyleGANs [53.98170188547775]
We propose a Hierarchical Semantic Regularizer (HSR) which aligns the hierarchical representations learnt by the generator to corresponding powerful features learnt by pretrained networks on large amounts of data.
HSR is shown to not only improve generator representations but also the linearity and smoothness of the latent style spaces, leading to the generation of more natural-looking style-edited images.
arXiv Detail & Related papers (2022-08-07T16:23:33Z)
- Attribute Group Editing for Reliable Few-shot Image Generation [85.52840521454411]
We propose a new editing-based method, i.e., Attribute Group Editing (AGE), for few-shot image generation.
AGE examines the internal representation learned in GANs and identifies semantically meaningful directions.
arXiv Detail & Related papers (2022-03-16T06:54:09Z)
- Attribute-specific Control Units in StyleGAN for Fine-grained Image Manipulation [57.99007520795998]
We discover attribute-specific control units, which consist of multiple channels of feature maps and modulation styles.
Specifically, we collaboratively manipulate the modulation style channels and feature maps in control units to obtain semantically and spatially disentangled control.
To manipulate these control units, we move the modulation style along a specific sparse direction vector and replace the filter-wise styles used to compute the feature maps.
arXiv Detail & Related papers (2021-11-25T10:42:10Z)
- DyStyle: Dynamic Neural Network for Multi-Attribute-Conditioned Style Editing [12.80013698957431]
A Dynamic Style Manipulation Network (DyStyle) is proposed to perform attribute-conditioned style editing.
A novel easy-to-hard training procedure is introduced for efficient and stable training of the DyStyle network.
Our approach demonstrates fine-grained disentangled edits along multiple numeric and binary attributes.
arXiv Detail & Related papers (2021-09-22T13:50:51Z)
- SMILE: Semantically-guided Multi-attribute Image and Layout Editing [154.69452301122175]
Attribute image manipulation has been a very active topic since the introduction of Generative Adversarial Networks (GANs).
We present a multimodal representation that handles all attributes, whether guided by random noise or by exemplar images, while using only the underlying domain information of the target domain.
Our method is capable of adding, removing or changing either fine-grained or coarse attributes by using an image as a reference or by exploring the style distribution space.
arXiv Detail & Related papers (2020-10-05T20:15:21Z)
- StyleFlow: Attribute-conditioned Exploration of StyleGAN-Generated Images using Conditional Continuous Normalizing Flows [40.69516201141587]
StyleFlow is an instance of conditional continuous normalizing flows in the GAN latent space, conditioned on attribute features.
We evaluate our method using the face and the car latent space of StyleGAN, and demonstrate fine-grained disentangled edits along various attributes on both real photographs and StyleGAN generated images.
arXiv Detail & Related papers (2020-08-06T00:10:03Z)