Neural Face Skinning for Mesh-agnostic Facial Expression Cloning
- URL: http://arxiv.org/abs/2505.22416v1
- Date: Wed, 28 May 2025 14:43:43 GMT
- Title: Neural Face Skinning for Mesh-agnostic Facial Expression Cloning
- Authors: Sihun Cha, Serin Yoon, Kwanggyoon Seo, Junyong Noh
- Abstract summary: We propose a method that combines the strengths of both global and local deformation models. Our approach enables intuitive control and detailed expression cloning across diverse face meshes. We demonstrate improved performance over state-of-the-art methods in terms of expression fidelity, deformation transfer accuracy, and adaptability across diverse mesh structures.
- Score: 5.819784482811377
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Accurately retargeting facial expressions to a face mesh while enabling manipulation is a key challenge in facial animation retargeting. Recent deep-learning methods address this by encoding facial expressions into a global latent code, but they often fail to capture fine-grained details in local regions. While some methods improve local accuracy by transferring deformations locally, this often complicates overall control of the facial expression. To address this, we propose a method that combines the strengths of both global and local deformation models. Our approach enables intuitive control and detailed expression cloning across diverse face meshes, regardless of their underlying structures. The core idea is to localize the influence of the global latent code on the target mesh. Our model learns to predict skinning weights for each vertex of the target face mesh through indirect supervision from predefined segmentation labels. These predicted weights localize the global latent code, enabling precise and region-specific deformations even for meshes with unseen shapes. We supervise the latent code using Facial Action Coding System (FACS)-based blendshapes to ensure interpretability and allow straightforward editing of the generated animation. Through extensive experiments, we demonstrate improved performance over state-of-the-art methods in terms of expression fidelity, deformation transfer accuracy, and adaptability across diverse mesh structures.
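A minimal sketch of the core idea described in the abstract: a small network predicts per-vertex skinning weights, and those weights localize region-specific slices of the global expression code. The region count, code dimension, and layer sizes below are illustrative assumptions, not the authors' architecture.

```python
import torch
import torch.nn as nn

class LocalizedDeformer(nn.Module):
    """Sketch: predicted per-vertex skinning weights gate a global expression code."""
    def __init__(self, n_regions=16, code_dim=32, feat_dim=64):
        super().__init__()
        # Predicts K skinning weights per vertex from its geometry feature.
        self.skin_mlp = nn.Sequential(
            nn.Linear(feat_dim, 64), nn.ReLU(),
            nn.Linear(64, n_regions),
        )
        # Decodes a region-specific code (plus the vertex feature) to a 3D offset.
        self.decoder = nn.Sequential(
            nn.Linear(code_dim + feat_dim, 64), nn.ReLU(),
            nn.Linear(64, 3),
        )
        self.n_regions = n_regions

    def forward(self, vert_feat, expr_code):
        # vert_feat: (V, feat_dim) target-mesh vertex features
        # expr_code: (n_regions, code_dim) global code split into region slices
        w = torch.softmax(self.skin_mlp(vert_feat), dim=-1)    # (V, K)
        offsets = []
        for k in range(self.n_regions):
            code_k = expr_code[k].expand(vert_feat.shape[0], -1)
            offsets.append(self.decoder(torch.cat([vert_feat, code_k], -1)))
        offsets = torch.stack(offsets, dim=1)                  # (V, K, 3)
        return (w.unsqueeze(-1) * offsets).sum(dim=1)          # per-vertex offsets (V, 3)
```

Per the abstract, the skinning weights would be supervised indirectly from predefined segmentation labels, and the expression code from FACS-based blendshape weights.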
Related papers
- MFCLIP: Multi-modal Fine-grained CLIP for Generalizable Diffusion Face Forgery Detection [64.29452783056253]
The rapid development of photo-realistic face generation methods has raised significant concerns in society and academia.
Although existing approaches mainly capture face forgery patterns using the image modality, other modalities like fine-grained noises and texts are not fully explored.
We propose a novel multi-modal fine-grained CLIP (MFCLIP) model, which mines comprehensive and fine-grained forgery traces across image-noise modalities.
arXiv Detail & Related papers (2024-09-15T13:08:59Z)
- HeadEvolver: Text to Head Avatars via Expressive and Attribute-Preserving Mesh Deformation [17.590555698266346]
This paper introduces a novel framework for generating stylized head avatars from text guidance.
Our method represents mesh deformation with per-face Jacobians and adaptively modulates local deformation using a learnable vector field.
Our framework can generate realistic shapes and textures that can be further edited via text, while supporting seamless editing using the preserved attributes from the template mesh.
arXiv Detail & Related papers (2024-03-14T12:15:23Z)
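A rough sketch of deformation driven by per-face Jacobians modulated by a local field. Note the simplification: vertex positions are recovered here with a dense least-squares fit over transformed edges rather than the differentiable Poisson-style solve a full system would use, and the function and all shapes are hypothetical.

```python
import numpy as np

def deform(verts, faces, jacobians, field):
    """verts: (V,3), faces: (F,3) vertex indices, jacobians: (F,3,3), field: (F,) in [0,1]."""
    V = len(verts)
    rows, targets = [], []
    for f, (i, j, k) in enumerate(faces):
        # Modulate each face's Jacobian toward identity by the local field value.
        J = np.eye(3) + field[f] * (jacobians[f] - np.eye(3))
        for a, b in ((i, j), (j, k), (k, i)):
            r = np.zeros(V)
            r[b], r[a] = 1.0, -1.0
            rows.append(r)
            targets.append(J @ (verts[b] - verts[a]))  # target deformed edge
    # Anchor vertex 0 to remove the translational null space.
    r0 = np.zeros(V)
    r0[0] = 1.0
    rows.append(r0)
    targets.append(verts[0])
    A = np.stack(rows)                    # (3F+1, V)
    T = np.stack(targets)                 # (3F+1, 3)
    new_verts, *_ = np.linalg.lstsq(A, T, rcond=None)
    return new_verts                      # (V, 3) deformed positions
```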
- Learning Position-Aware Implicit Neural Network for Real-World Face Inpainting [55.87303287274932]
Face inpainting requires the model to have a precise global understanding of the facial position structure.
In this paper, we propose an Implicit Neural Inpainting Network (IN²) to handle arbitrary-shape face images in real-world scenarios.
arXiv Detail & Related papers (2024-01-19T07:31:44Z)
- BlendFields: Few-Shot Example-Driven Facial Modeling [35.86727715239676]
We introduce a method that bridges the gap by drawing inspiration from traditional computer graphics techniques.
Unseen expressions are modeled by blending appearance from a sparse set of extreme poses.
We show that our method generalizes to unseen expressions, adding fine-grained effects on top of smooth volumetric deformations of a face, and demonstrate how it generalizes beyond faces.
arXiv Detail & Related papers (2023-05-12T14:30:07Z)
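A minimal sketch of the blending step, assuming (hypothetically) that each captured extreme pose stores an appearance correction and that blend weights come from a softmax over descriptor distances:

```python
import numpy as np

def blend_corrections(query_feat, exemplar_feats, exemplar_corrections, tau=0.1):
    """Blend per-exemplar appearance corrections for an unseen expression.

    query_feat:           (D,)   local deformation descriptor of the query
    exemplar_feats:       (K, D) descriptors of the K captured extreme poses
    exemplar_corrections: (K, C) appearance corrections stored per exemplar
    """
    d = np.linalg.norm(exemplar_feats - query_feat, axis=1)  # (K,) distances
    w = np.exp(-d / tau)
    w = w / w.sum()                                          # convex blend weights
    return w @ exemplar_corrections                          # (C,) blended correction
```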
- LC-NeRF: Local Controllable Face Generation in Neural Radiance Field [55.54131820411912]
LC-NeRF is composed of a Local Region Generators Module and a Spatial-Aware Fusion Module.
Our method provides better local editing than state-of-the-art face editing methods.
Our method also performs well in downstream tasks, such as text-driven facial image editing.
arXiv Detail & Related papers (2023-02-19T05:50:08Z)
- Exploiting Shape Cues for Weakly Supervised Semantic Segmentation [15.791415215216029]
Weakly supervised semantic segmentation (WSSS) aims to produce pixel-wise class predictions with only image-level labels for training.
We propose to exploit shape information to supplement the texture-biased property of convolutional neural networks (CNNs).
We further refine the predictions in an online fashion with a novel refinement method that takes into account both the class and the color affinities.
arXiv Detail & Related papers (2022-08-08T17:25:31Z)
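A toy sketch of online refinement that propagates class scores along pixel affinities. The paper combines class and color affinities; only the color term is sketched here, and the kernel width and iteration count are assumptions.

```python
import numpy as np

def refine_cam(cam, image, iters=10, sigma=0.1, radius=1):
    """Random-walk style refinement of class activation maps.

    cam:   (C, H, W) per-class scores; image: (H, W, 3) colors in [0, 1].
    Scores are averaged over neighbors, weighted by color similarity.
    """
    C, H, W = cam.shape
    out = cam.copy()
    offsets = [(dy, dx) for dy in range(-radius, radius + 1)
               for dx in range(-radius, radius + 1)]
    for _ in range(iters):
        acc = np.zeros_like(out)
        norm = np.zeros((H, W))
        for dy, dx in offsets:
            ys = slice(max(0, dy), H + min(0, dy))
            xs = slice(max(0, dx), W + min(0, dx))
            ys2 = slice(max(0, -dy), H + min(0, -dy))
            xs2 = slice(max(0, -dx), W + min(0, -dx))
            diff = image[ys, xs] - image[ys2, xs2]
            aff = np.exp(-(diff ** 2).sum(-1) / (2 * sigma ** 2))  # color affinity
            acc[:, ys, xs] += aff * out[:, ys2, xs2]
            norm[ys, xs] += aff
        out = acc / norm
    return out
```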
- FEAT: Face Editing with Attention [70.89233432407305]
We build on the StyleGAN generator and present a method that explicitly encourages face manipulation to focus on the intended regions.
During the generation of the edited image, the attention map serves as a mask that guides a blending between the original features and the modified ones.
arXiv Detail & Related papers (2022-02-06T06:07:34Z)
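The described blending step is simple to state in code; the tensor names and layout below are assumptions:

```python
import torch

def attention_blend(original_feats, edited_feats, attn_map):
    """Blend edited features into the original ones only where the attention
    map is high, keeping untouched regions intact.

    original_feats, edited_feats: (B, C, H, W); attn_map: (B, 1, H, W) in [0, 1]
    """
    return attn_map * edited_feats + (1.0 - attn_map) * original_feats
```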
- TANet: A New Paradigm for Global Face Super-resolution via Transformer-CNN Aggregation Network [72.41798177302175]
We propose a novel paradigm based on the self-attention mechanism (i.e., the core of the Transformer) to fully explore the representation capacity of the facial structure feature.
Specifically, we design a Transformer-CNN aggregation network (TANet) consisting of two paths, in which one path uses CNNs responsible for restoring fine-grained facial details.
By aggregating the features from the above two paths, the consistency of global facial structure and fidelity of local facial detail restoration are strengthened simultaneously.
arXiv Detail & Related papers (2021-09-16T18:15:07Z)
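A minimal sketch of such a two-path block: a self-attention path models global structure, a convolutional path restores local detail, and the two are fused. Channel counts and fusion-by-concatenation are illustrative assumptions, not TANet's exact design.

```python
import torch
import torch.nn as nn

class TwoPathBlock(nn.Module):
    """Sketch of Transformer-CNN aggregation for face super-resolution features."""
    def __init__(self, ch=64, heads=4):
        super().__init__()
        self.cnn = nn.Sequential(                  # local detail path
            nn.Conv2d(ch, ch, 3, padding=1), nn.ReLU(),
            nn.Conv2d(ch, ch, 3, padding=1),
        )
        self.attn = nn.TransformerEncoderLayer(    # global structure path
            d_model=ch, nhead=heads, dim_feedforward=2 * ch, batch_first=True)
        self.fuse = nn.Conv2d(2 * ch, ch, 1)       # aggregate the two paths

    def forward(self, x):                          # x: (B, ch, H, W)
        b, c, h, w = x.shape
        local = self.cnn(x)
        tokens = x.flatten(2).transpose(1, 2)      # (B, H*W, ch) patch tokens
        glob = self.attn(tokens).transpose(1, 2).reshape(b, c, h, w)
        return self.fuse(torch.cat([local, glob], dim=1))
```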
- Synthesizing Human Faces using Latent Space Factorization and Local Weights (Extended Version) [24.888957468547744]
The proposed model allows partial manipulation of the face while still learning the whole face mesh.
We factorize the latent space of the whole face into subspaces corresponding to different parts of the face.
arXiv Detail & Related papers (2021-07-19T10:17:30Z)
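A minimal sketch of part-wise latent factorization, with hypothetical part names and sub-code sizes: editing one subspace while freezing the others yields partial manipulation of the face.

```python
import numpy as np

# The whole-face latent is split into per-part sub-codes (illustrative layout).
PARTS = {"eyes": slice(0, 16), "nose": slice(16, 32),
         "mouth": slice(32, 48), "rest": slice(48, 64)}

def swap_part(z_src, z_ref, part):
    """Replace one part's sub-code in z_src with the one from z_ref."""
    z_out = z_src.copy()
    z_out[PARTS[part]] = z_ref[PARTS[part]]
    return z_out  # decode z_out with the face decoder to get the edited mesh
```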
- Face Sketch Synthesis via Semantic-Driven Generative Adversarial Network [10.226808267718523]
We propose a novel Semantic-Driven Generative Adversarial Network (SDGAN) which embeds global structure-level style injection and local class-level knowledge re-weighting.
Specifically, we conduct facial saliency detection on the input face photos to provide overall facial texture structure.
In addition, we exploit face parsing layouts as the semantic-level spatial prior to enforce globally structural style injection in the generator of SDGAN.
arXiv Detail & Related papers (2021-06-29T07:03:56Z)
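One common way to realize spatially varying style injection from a parsing layout is SPADE-style modulation; the sketch below follows that general pattern and is not necessarily SDGAN's exact module.

```python
import torch
import torch.nn as nn

class ParsingStyleInjection(nn.Module):
    """Sketch: a face-parsing map predicts per-pixel scale and shift that
    modulate normalized generator features."""
    def __init__(self, feat_ch=64, n_classes=19):
        super().__init__()
        self.shared = nn.Sequential(nn.Conv2d(n_classes, 64, 3, padding=1), nn.ReLU())
        self.gamma = nn.Conv2d(64, feat_ch, 3, padding=1)
        self.beta = nn.Conv2d(64, feat_ch, 3, padding=1)
        self.norm = nn.InstanceNorm2d(feat_ch, affine=False)

    def forward(self, feats, parsing_onehot):
        # feats: (B, feat_ch, H, W); parsing_onehot: (B, n_classes, H, W)
        h = self.shared(parsing_onehot)
        return self.norm(feats) * (1 + self.gamma(h)) + self.beta(h)
```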
- InterFaceGAN: Interpreting the Disentangled Face Representation Learned by GANs [73.27299786083424]
We propose a framework called InterFaceGAN to interpret the disentangled face representation learned by state-of-the-art GAN models.
We first find that GANs learn various semantics in some linear subspaces of the latent space.
We then conduct a detailed study on the correlation between different semantics and manage to better disentangle them via subspace projection.
arXiv Detail & Related papers (2020-05-18T18:01:22Z)
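The subspace projection described above has a compact form: to edit one attribute without disturbing a correlated one, remove from the primary semantic direction its component along the conditioning direction. A minimal sketch:

```python
import numpy as np

def conditional_direction(n_primary, n_condition):
    """Project the primary semantic direction onto the subspace orthogonal
    to a correlated direction, so editing the first attribute leaves the
    second (approximately) unchanged."""
    n1 = n_primary / np.linalg.norm(n_primary)
    n2 = n_condition / np.linalg.norm(n_condition)
    d = n1 - (n1 @ n2) * n2      # remove the component along n2
    return d / np.linalg.norm(d)

# Editing then moves the latent code along the projected direction:
# z_edit = z + alpha * conditional_direction(n_age, n_gender)
```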
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.