Perceptually Validated Precise Local Editing for Facial Action Units with StyleGAN
- URL: http://arxiv.org/abs/2107.12143v2
- Date: Tue, 27 Jul 2021 09:05:22 GMT
- Title: Perceptually Validated Precise Local Editing for Facial Action Units with StyleGAN
- Authors: Alara Zindancıoğlu and T. Metin Sezgin
- Abstract summary: We build a solution based on StyleGAN, which has been used extensively for semantic manipulation of faces.
We show that a naive strategy to perform editing in the latent space results in undesired coupling between certain action units.
We validate the effectiveness of our local editing method through perception experiments conducted with 23 subjects.
- Score: 3.8149289266694466
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The ability to edit facial expressions has a wide range of applications in
computer graphics. The ideal facial expression editing algorithm needs to
satisfy two important criteria. First, it should allow precise and targeted
editing of individual facial actions. Second, it should generate high fidelity
outputs without artifacts. We build a solution based on StyleGAN, which has
been used extensively for semantic manipulation of faces. As we do so, we add
to our understanding of how various semantic attributes are encoded in
StyleGAN. In particular, we show that a naive strategy to perform editing in
the latent space results in undesired coupling between certain action units,
even if they are conceptually distinct. For example, although brow lowerer and
lip tightener are distinct action units, they appear correlated in the training
data. Hence, StyleGAN has difficulty in disentangling them. We allow
disentangled editing of such action units by computing detached regions of
influence for each action unit, and restrict editing to these regions. We
validate the effectiveness of our local editing method through perception
experiments conducted with 23 subjects. The results show that our method
provides higher control over local editing and produces images with superior
fidelity compared to the state-of-the-art methods.
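The abstract describes the mechanism only at a high level: a latent-direction edit for one action unit, with the change restricted to a precomputed, detached region of influence so that coupled action units stay untouched. The sketch below is a rough illustration of that idea, not the authors' implementation; `generator`, `au_direction`, and `roi_mask` are hypothetical stand-ins for a pretrained StyleGAN generator, a learned per-AU latent direction, and the per-AU region-of-influence mask.

```python
# Rough sketch of region-restricted action-unit editing (hypothetical names,
# not the paper's released code). The edit is applied in latent space, but the
# change is kept only inside the AU's region of influence in image space.
import torch

def edit_action_unit(generator, w_plus, au_direction, roi_mask, strength=1.0):
    """Edit one action unit, preserving pixels outside its region of influence."""
    original = generator(w_plus)                           # (1, 3, H, W)
    edited = generator(w_plus + strength * au_direction)   # (1, 3, H, W)
    mask = roi_mask.view(1, 1, *roi_mask.shape)            # (1, 1, H, W), values in [0, 1]
    # Outside the mask the original pixels survive, suppressing coupled changes
    # (e.g. the lip tightener moving together with the brow lowerer).
    return mask * edited + (1.0 - mask) * original

# Toy usage with a dummy "generator" so the sketch runs end to end.
if __name__ == "__main__":
    H = W = 64
    dummy_generator = lambda w: torch.tanh(w.mean()) * torch.ones(1, 3, H, W)
    w_plus = torch.randn(1, 18, 512)                # W+ code (18 style layers)
    au_direction = 0.05 * torch.randn(1, 18, 512)   # placeholder AU direction
    roi_mask = torch.zeros(H, W)
    roi_mask[10:28, 12:52] = 1.0                    # e.g. a brow region
    out = edit_action_unit(dummy_generator, w_plus, au_direction, roi_mask)
    print(out.shape)  # torch.Size([1, 3, 64, 64])
```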
Related papers
- ZONE: Zero-Shot Instruction-Guided Local Editing [56.56213730578504]
We propose a Zero-shot instructiON-guided local image Editing approach, termed ZONE.
We first convert the editing intent from the user-provided instruction into specific image editing regions through InstructPix2Pix.
We then propose a Region-IoU scheme for precise image layer extraction from an off-the-shelf segment model.
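The Region-IoU scheme is only named in the summary above; as a generic illustration (not ZONE's exact formulation), picking the segment from an off-the-shelf segmentation model that best overlaps an instruction-derived edit region could look like this:

```python
import torch

def region_iou(mask_a: torch.Tensor, mask_b: torch.Tensor) -> float:
    """IoU between two binary HxW masks."""
    a, b = mask_a.bool(), mask_b.bool()
    union = (a | b).sum().item()
    return (a & b).sum().item() / union if union > 0 else 0.0

def pick_best_segment(edit_region: torch.Tensor, segments: list) -> torch.Tensor:
    """Return the candidate segment with the highest IoU against the edit region."""
    return max(segments, key=lambda seg: region_iou(edit_region, seg))
```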
arXiv Detail & Related papers (2023-12-28T02:54:34Z)
- StyleDiffusion: Prompt-Embedding Inversion for Text-Based Editing [86.92711729969488]
We exploit the amazing capacities of pretrained diffusion models for the editing of images.
Existing approaches either finetune the model or invert the image in the latent space of the pretrained model.
They suffer from two problems: unsatisfying results for selected regions, and unexpected changes in non-selected regions.
arXiv Detail & Related papers (2023-03-28T00:16:45Z)
- Face Attribute Editing with Disentangled Latent Vectors [0.0]
We propose an image-to-image translation framework for facial attribute editing.
Inspired by the latent space factorization works of fixed pretrained GANs, we design the attribute editing by latent space factorization.
To project images to semantically organized latent spaces, we set an encoder-decoder architecture with attention-based skip connections.
arXiv Detail & Related papers (2023-01-11T18:32:13Z)
- Towards Counterfactual Image Manipulation via CLIP [106.94502632502194]
Existing methods can achieve realistic editing of different visual attributes such as age and gender of facial images.
We investigate this problem in a text-driven manner with Contrastive Language-Image Pretraining (CLIP).
We design a novel contrastive loss that exploits predefined CLIP-space directions to guide the editing toward desired directions from different perspectives.
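The paper introduces its own contrastive loss; as a simpler, related illustration only, a directional CLIP loss (a common formulation in CLIP-guided editing, not necessarily this paper's exact objective) aligns the CLIP-space change between the source and edited images with a predefined text direction. `clip_image_encoder` stands in for a pretrained CLIP image encoder.

```python
import torch
import torch.nn.functional as F

def directional_clip_loss(clip_image_encoder, text_direction, img_src, img_edit):
    """Encourage the CLIP-space image change (edited - source) to point along a
    predefined text direction, e.g. embed("angry face") - embed("face")."""
    e_src = clip_image_encoder(img_src)
    e_edit = clip_image_encoder(img_edit)
    img_dir = F.normalize(e_edit - e_src, dim=-1)
    txt_dir = F.normalize(text_direction, dim=-1)
    return 1.0 - F.cosine_similarity(img_dir, txt_dir, dim=-1).mean()
```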
arXiv Detail & Related papers (2022-07-06T17:02:25Z)
- Video2StyleGAN: Disentangling Local and Global Variations in a Video [68.70889857355678]
StyleGAN has emerged as a powerful paradigm for facial editing, providing disentangled controls over age, expression, illumination, etc.
We introduce Video2StyleGAN that takes a target image and driving video(s) to reenact the local and global locations and expressions from the driving video in the identity of the target image.
arXiv Detail & Related papers (2022-05-27T14:18:19Z)
- Expanding the Latent Space of StyleGAN for Real Face Editing [4.1715767752637145]
A surge of face editing techniques has been proposed that employ the pretrained StyleGAN for semantic manipulation.
To successfully edit a real image, one must first convert the input image into StyleGAN's latent variables.
We present a method to expand the latent space of StyleGAN with additional content features to break down the trade-off between low-distortion and high-editability.
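The inversion step mentioned above (converting a real image into StyleGAN's latent variables) is commonly done by optimization. The sketch below shows that standard recipe only, not this paper's expanded-latent-space variant; `generator` is again a placeholder for a pretrained StyleGAN.

```python
import torch
import torch.nn.functional as F

def invert_image(generator, target, num_layers=18, latent_dim=512,
                 steps=500, lr=0.05):
    """Optimization-based GAN inversion: fit a W+ code so that the generator
    reproduces `target` (1x3xHxW). Practical systems add perceptual (LPIPS)
    terms and/or a learned encoder for better editability."""
    w_plus = torch.zeros(1, num_layers, latent_dim, requires_grad=True)
    opt = torch.optim.Adam([w_plus], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        loss = F.mse_loss(generator(w_plus), target)
        loss.backward()
        opt.step()
    return w_plus.detach()
```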
arXiv Detail & Related papers (2022-04-26T18:27:53Z)
- FEAT: Face Editing with Attention [70.89233432407305]
We build on the StyleGAN generator and present a method that explicitly encourages face manipulation to focus on the intended regions.
During the generation of the edited image, the attention map serves as a mask that guides a blending between the original features and the modified ones.
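The blending step described above has a simple form; the sketch below is a generic version (where exactly it sits inside the generator is not specified by the summary and is an assumption here).

```python
import torch

def attention_blend(original_feat, edited_feat, attn_map):
    """Keep the edit only where the attention map is high; elsewhere the
    original features pass through unchanged."""
    attn = attn_map.clamp(0.0, 1.0)
    return attn * edited_feat + (1.0 - attn) * original_feat
```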
arXiv Detail & Related papers (2022-02-06T06:07:34Z)
- Talk-to-Edit: Fine-Grained Facial Editing via Dialog [79.8726256912376]
Talk-to-Edit is an interactive facial editing framework that performs fine-grained attribute manipulation through dialog between the user and the system.
Our key insight is to model a continual "semantic field" in the GAN latent space.
Our system generates language feedback by considering both the user request and the current state of the semantic field.
arXiv Detail & Related papers (2021-09-09T17:17:59Z)
- Toward Fine-grained Facial Expression Manipulation [20.226370494178617]
Previous methods edit an input image under the guidance of a discrete emotion label or an absolute condition so that it possesses the desired expression.
We replace the continuous absolute condition with a relative condition, specifically relative action units.
With relative action units, the generator learns to only transform regions of interest which are specified by non-zero-valued relative AUs.
arXiv Detail & Related papers (2020-04-07T05:14:15Z)
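As a final illustration, conditioning on relative rather than absolute action units amounts to feeding the generator the difference between target and source AU intensities. `conditional_generator(image, rel_au)` below is a hypothetical interface, not the paper's released code.

```python
import torch

def relative_au_edit(conditional_generator, image, au_source, au_target):
    """Condition the generator on relative AU intensities (target - source).
    Zero entries request no change, so only regions tied to non-zero relative
    AUs should be transformed."""
    rel_au = au_target - au_source
    return conditional_generator(image, rel_au)
```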
This list is automatically generated from the titles and abstracts of the papers listed on this site.