Controlling generative models with continuous factors of variations
- URL: http://arxiv.org/abs/2001.10238v1
- Date: Tue, 28 Jan 2020 10:04:04 GMT
- Title: Controlling generative models with continuous factors of variations
- Authors: Antoine Plumerault, Hervé Le Borgne, Céline Hudelot
- Abstract summary: We introduce a new method to find meaningful directions in the latent space of any generative model.
Our method does not require human annotations and is well suited for the search of directions encoding simple transformations of the generated image.
- Score: 1.7188280334580197
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recent deep generative models are able to provide photo-realistic images as
well as visual or textual content embeddings useful to address various tasks of
computer vision and natural language processing. Their usefulness is
nevertheless often limited by the lack of control over the generative process
or the poor understanding of the learned representation. To overcome these
major issues, very recent work has shown the interest of studying the semantics
of the latent space of generative models. In this paper, we propose to advance
on the interpretability of the latent space of generative models by introducing
a new method to find meaningful directions in the latent space of any
generative model along which we can move to control precisely specific
properties of the generated image like the position or scale of the object in
the image. Our method does not require human annotations and is particularly
well suited for the search of directions encoding simple transformations of the
generated image, such as translation, zoom or color variations. We demonstrate
the effectiveness of our method qualitatively and quantitatively, both for GANs
and variational auto-encoders.
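The core idea of the abstract — moving along a direction in latent space to control a single property of the generated image — can be sketched as a simple traversal. The generator below is a toy linear stand-in (a hypothetical `generate` function), not the models from the paper; the direction `d` would in practice be produced by the paper's search method rather than drawn at random.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in for a pre-trained generator G: z -> image.
# A fixed random linear map onto an 8x8 "image", for illustration only.
W = rng.normal(size=(64, 16))

def generate(z):
    """Map a 16-d latent code to an 8x8 image (toy generator)."""
    return (W @ z).reshape(8, 8)

def traverse(z, direction, alphas):
    """Decode images along a latent direction.

    If the direction encodes a simple transformation (translation,
    zoom, color shift), images along the path differ mainly in that
    property while other content stays fixed.
    """
    d = direction / np.linalg.norm(direction)  # unit-length direction
    return [generate(z + a * d) for a in alphas]

z0 = rng.normal(size=16)   # starting latent code
d = rng.normal(size=16)    # candidate direction (placeholder for a learned one)
frames = traverse(z0, d, alphas=np.linspace(-3.0, 3.0, 7))
print(len(frames), frames[0].shape)  # 7 (8, 8)
```

With a real GAN or VAE decoder in place of `generate`, the same loop produces the kind of controlled image sequences (e.g. an object sliding or scaling) that the paper evaluates.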
Related papers
- Active Generation for Image Classification [50.18107721267218]
We propose to address the efficiency of image generation by focusing on the specific needs and characteristics of the model.
With a central tenet of active learning, our method, named ActGen, takes a training-aware approach to image generation.
arXiv Detail & Related papers (2024-03-11T08:45:31Z)
- Taming Encoder for Zero Fine-tuning Image Customization with Text-to-Image Diffusion Models [55.04969603431266]
This paper proposes a method for generating images of customized objects specified by users.
The method is based on a general framework that bypasses the lengthy optimization required by previous approaches.
We demonstrate through experiments that our proposed method is able to synthesize images with compelling output quality, appearance diversity, and object fidelity.
arXiv Detail & Related papers (2023-04-05T17:59:32Z)
- Plug-and-Play Diffusion Features for Text-Driven Image-to-Image Translation [10.39028769374367]
We present a new framework that takes text-to-image synthesis to the realm of image-to-image translation.
Our method harnesses the power of a pre-trained text-to-image diffusion model to generate a new image that complies with the target text.
arXiv Detail & Related papers (2022-11-22T20:39:18Z)
- Exploring the Effectiveness of Mask-Guided Feature Modulation as a Mechanism for Localized Style Editing of Real Images [33.018300966769516]
We present the SemanticStyle Autoencoder (SSAE), a deep Generative Autoencoder model that leverages semantic mask-guided latent space manipulation.
This work is intended as a guiding primer for future research.
arXiv Detail & Related papers (2022-11-21T07:36:20Z)
- StyleGAN-NADA: CLIP-Guided Domain Adaptation of Image Generators [63.85888518950824]
We present a text-driven method that allows shifting a generative model to new domains.
We show that through natural language prompts and a few minutes of training, our method can adapt a generator across a multitude of domains.
arXiv Detail & Related papers (2021-08-02T14:46:46Z)
- Unsupervised Discovery of Disentangled Manifolds in GANs [74.24771216154105]
An interpretable generation process benefits various image editing applications.
We propose a framework to discover interpretable directions in the latent space given arbitrary pre-trained generative adversarial networks.
arXiv Detail & Related papers (2020-11-24T02:18:08Z)
- Style Intervention: How to Achieve Spatial Disentanglement with Style-based Generators? [100.60938767993088]
We propose a lightweight optimization-based algorithm that can adapt to arbitrary input images and render natural translation effects under flexible objectives.
We verify the performance of the proposed framework in facial attribute editing on high-resolution images, where both photo-realism and consistency are required.
arXiv Detail & Related papers (2020-11-19T07:37:31Z)
- Learning a Deep Reinforcement Learning Policy Over the Latent Space of a Pre-trained GAN for Semantic Age Manipulation [4.306143768014157]
We learn a conditional policy for semantic manipulation along specific attributes under defined identity bounds.
Results show that our learned policy samples high fidelity images with required age alterations.
arXiv Detail & Related papers (2020-11-02T13:15:18Z)
- Generating Annotated High-Fidelity Images Containing Multiple Coherent Objects [10.783993190686132]
We propose a multi-object generation framework that can synthesize images with multiple objects without explicitly requiring contextual information.
We demonstrate how coherency and fidelity are preserved with our method through experiments on the Multi-MNIST and CLEVR datasets.
arXiv Detail & Related papers (2020-06-22T11:33:55Z)
- Fine-grained Image-to-Image Transformation towards Visual Recognition [102.51124181873101]
We aim to transform an image of a fine-grained category to synthesize new images that preserve the identity of the input image.
We adopt a model based on generative adversarial networks to disentangle the identity related and unrelated factors of an image.
Experiments on the CompCars and Multi-PIE datasets demonstrate that our model preserves the identity of the generated images much better than the state-of-the-art image-to-image transformation models.
arXiv Detail & Related papers (2020-01-12T05:26:47Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.