ManiCLIP: Multi-Attribute Face Manipulation from Text
- URL: http://arxiv.org/abs/2210.00445v3
- Date: Sun, 26 Mar 2023 01:52:42 GMT
- Title: ManiCLIP: Multi-Attribute Face Manipulation from Text
- Authors: Hao Wang, Guosheng Lin, Ana García del Molino, Anran Wang, Jiashi Feng, Zhiqi Shen
- Abstract summary: We present a novel multi-attribute face manipulation method based on textual descriptions.
Our method generates natural manipulated faces with minimal text-irrelevant attribute editing.
- Score: 104.30600573306991
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper we present a novel multi-attribute face manipulation method
based on textual descriptions. Previous text-based image editing methods either
require test-time optimization for each individual image or are restricted to
single attribute editing. Extending these methods to multi-attribute face image
editing scenarios will introduce undesired excessive attribute change, e.g.,
text-relevant attributes are overly manipulated and text-irrelevant attributes
are also changed. In order to address these challenges and achieve natural
editing over multiple face attributes, we propose a new decoupling training
scheme where we use group sampling to get text segments from the same attribute
categories, instead of whole complex sentences. Further, to preserve other
existing face attributes, we encourage the model to edit the latent code of
each attribute separately via an entropy constraint. During the inference
phase, our model is able to edit new face images without any test-time
optimization, even from complex textual prompts. We show extensive experiments
and analysis to demonstrate the efficacy of our method, which generates natural
manipulated faces with minimal text-irrelevant attribute editing. Code and
pre-trained model are available at https://github.com/hwang1996/ManiCLIP.
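
The entropy constraint is the most concrete mechanism in the abstract. Below is a minimal PyTorch sketch of one plausible reading of it, not the paper's exact formulation: the text-conditioned mapper's predicted latent offsets are normalized into a distribution over latent layers, and low entropy is rewarded so each text segment concentrates its edit on few entries, leaving text-irrelevant attributes untouched. The function name and tensor shapes are assumptions.

```python
import torch

def entropy_sparsity_loss(delta_w: torch.Tensor, eps: float = 1e-8) -> torch.Tensor:
    """Hypothetical entropy constraint on predicted latent edits.

    delta_w: (batch, n_layers, dim) offsets to a StyleGAN-style latent
    code, as predicted by a text-conditioned mapper network.
    """
    # Edit magnitude per latent layer, normalized into a distribution.
    mag = delta_w.abs().mean(dim=-1)                  # (batch, n_layers)
    p = mag / (mag.sum(dim=-1, keepdim=True) + eps)
    # Returns the mean entropy of that distribution; adding it to the
    # training loss encourages edits concentrated on few layers, which
    # keeps text-irrelevant attributes close to the original.
    return -(p * (p + eps).log()).sum(dim=-1).mean()

# Usage: penalize diffuse edits during training.
delta_w = 0.01 * torch.randn(4, 18, 512)   # dummy mapper output
loss = entropy_sparsity_loss(delta_w)
```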
Related papers
- AdapEdit: Spatio-Temporal Guided Adaptive Editing Algorithm for
Text-Based Continuity-Sensitive Image Editing [24.9487669818162]
We propose a spatio-temporal guided adaptive editing algorithm, AdapEdit, which realizes adaptive image editing.
Our approach has a significant advantage in preserving model priors and requires no model training, fine-tuning, extra data, or optimization.
We present results over a wide variety of raw images and editing instructions, demonstrating competitive performance and showing that AdapEdit significantly outperforms previous approaches.
arXiv Detail & Related papers (2023-12-13T09:45:58Z)
- Localizing and Editing Knowledge in Text-to-Image Generative Models [62.02776252311559]
Knowledge about different attributes is not localized in isolated components, but is instead distributed amongst a set of components in the conditional UNet.
We introduce Diff-QuickFix, a fast, data-free model-editing method that can effectively edit concepts in text-to-image models.
arXiv Detail & Related papers (2023-10-20T17:31:12Z)
- DiffEdit: Diffusion-based semantic image editing with mask guidance [64.555930158319]
DiffEdit is a method to take advantage of text-conditioned diffusion models for the task of semantic image editing.
Our main contribution is the ability to automatically generate a mask highlighting the regions of the input image that need to be edited.
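DiffEdit derives this mask by contrasting the denoiser's noise estimates for the same noisy image under the query caption versus a reference caption. A minimal sketch of that idea follows; the function name and threshold value are illustrative, not taken from the paper.

```python
import torch

def estimate_edit_mask(noise_pred_query: torch.Tensor,
                       noise_pred_ref: torch.Tensor,
                       threshold: float = 0.5) -> torch.Tensor:
    """Binary mask of regions where the two text conditionings disagree.

    Both inputs: (batch, channels, H, W) noise predictions from the same
    noisy image at the same diffusion timestep.
    """
    # Where the query caption changes the denoising direction, the pixel
    # likely belongs to the region that must be edited.
    diff = (noise_pred_query - noise_pred_ref).abs().mean(dim=1)  # (B, H, W)
    diff = diff / diff.amax(dim=(1, 2), keepdim=True).clamp(min=1e-8)
    return (diff > threshold).float()
```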
arXiv Detail & Related papers (2022-10-20T17:16:37Z)
- Leveraging Off-the-shelf Diffusion Model for Multi-attribute Fashion Image Manipulation [27.587905673112473]
Fashion attribute editing is a task that aims to convert the semantic attributes of a given fashion image while preserving the irrelevant regions.
Previous works typically employ conditional GANs, where the generator explicitly learns the target attributes and directly executes the conversion.
We explore classifier-guided diffusion, which leverages an off-the-shelf diffusion model pretrained on general visual semantics such as ImageNet.
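Classifier-guided diffusion itself is a standard technique (Dhariwal & Nichol): the gradient of an attribute classifier's log-probability steers each denoising step. A minimal sketch, with the timestep-dependent scaling folded into `scale` and all names illustrative:

```python
import torch

def guide_noise_prediction(noise_pred, x_t, classifier, target_attr, scale=1.0):
    """Shift the diffusion model's noise prediction toward images the
    attribute classifier assigns to `target_attr`.

    x_t: (batch, C, H, W) noisy images at the current timestep.
    target_attr: (batch,) target class indices.
    """
    x_t = x_t.detach().requires_grad_(True)
    log_probs = torch.log_softmax(classifier(x_t), dim=-1)
    selected = log_probs[torch.arange(len(x_t)), target_attr].sum()
    grad = torch.autograd.grad(selected, x_t)[0]
    # Subtracting the gradient from epsilon nudges the denoised sample
    # uphill on the classifier's log-probability.
    return noise_pred - scale * grad
```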
arXiv Detail & Related papers (2022-10-12T02:21:18Z)
- Everything is There in Latent Space: Attribute Editing and Attribute Style Manipulation by StyleGAN Latent Space Exploration [39.18239951479647]
We present Few-shot Latent-based Attribute Manipulation and Editing (FLAME), a framework that performs highly controlled image editing by latent space manipulation.
We generate diverse attribute styles in a disentangled manner.
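A common form of such latent-space manipulation is moving the code along a learned attribute direction. The generic sketch below shows that idea; the names and the `strength` parameter are illustrative, not FLAME's exact procedure.

```python
import torch

def move_along_attribute(w: torch.Tensor, direction: torch.Tensor,
                         strength: float) -> torch.Tensor:
    """Edit a latent code by stepping along a unit attribute direction.

    w: (batch, dim) latent codes; direction: (dim,) attribute vector.
    Varying `strength` varies the attribute's intensity in the output.
    """
    direction = direction / direction.norm().clamp(min=1e-8)
    return w + strength * direction
```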
arXiv Detail & Related papers (2022-07-20T12:40:32Z)
- Text Revision by On-the-Fly Representation Optimization [76.11035270753757]
Current state-of-the-art methods formulate these tasks as sequence-to-sequence learning problems.
We present an iterative in-place editing approach for text revision, which requires no parallel data.
It achieves competitive, or even better, performance than state-of-the-art supervised methods on text simplification.
arXiv Detail & Related papers (2022-04-15T07:38:08Z)
- Each Attribute Matters: Contrastive Attention for Sentence-based Image Editing [13.321782757637303]
Sentence-based Image Editing (SIE) aims to deploy natural language to edit an image.
Existing methods can hardly produce accurate edits when the query sentence contains multiple editable attributes.
This paper proposes a novel model called Contrastive Attention Generative Adversarial Network (CA-GAN).
arXiv Detail & Related papers (2021-10-21T14:06:20Z)
- FaceController: Controllable Attribute Editing for Face in the Wild [74.56117807309576]
We propose a simple feed-forward network to generate high-fidelity manipulated faces.
By simply employing existing and easily obtainable prior information, our method can control, transfer, and edit diverse attributes of faces in the wild.
In our method, we decouple identity, expression, pose, and illumination using 3D priors, and separate texture and colors using region-wise style codes.
arXiv Detail & Related papers (2021-02-23T02:47:28Z)
- Enjoy Your Editing: Controllable GANs for Image Editing via Latent Space Navigation [136.53288628437355]
Controllable semantic image editing enables a user to change entire image attributes with a few clicks.
Current approaches often suffer from attribute edits that are entangled, global image identity changes, and diminished photo-realism.
We propose quantitative evaluation strategies for measuring controllable editing performance, unlike prior work which primarily focuses on qualitative evaluation.
arXiv Detail & Related papers (2021-02-01T21:38:36Z)
- S2FGAN: Semantically Aware Interactive Sketch-to-Face Translation [11.724779328025589]
This paper proposes a sketch-to-image generation framework called S2FGAN.
We employ two latent spaces to control the face appearance and adjust the desired attributes of the generated face.
Our method outperforms state-of-the-art methods on attribute manipulation by exploiting greater control of attribute intensity.
arXiv Detail & Related papers (2020-11-30T13:42:39Z)