Related papers: Identity-Preserving Pose-Guided Character Animation via Facial Landmarks Transformation

Identity-Preserving Pose-Guided Character Animation via Facial Landmarks Transformation

URL: http://arxiv.org/abs/2412.08976v2
Date: Tue, 18 Mar 2025 08:30:23 GMT
Title: Identity-Preserving Pose-Guided Character Animation via Facial Landmarks Transformation
Authors: Lianrui Mu, Xingze Zhou, Wenjie Zheng, Jiangnan Ye, Haoji Hu,
Abstract summary: We introduce the Facial Landmarks Transformation () method, which leverages a 3D Morphable Model to address this limitation.<n> converts 2D landmarks into a 3D face model, adjusts the 3D face model to align with the reference identity, and then transforms them back into 2D landmarks.<n>This approach ensures accurate alignment with reference facial geometry, enhancing the consistency between generated videos and reference images.
Score: 5.591489936998095
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Creating realistic pose-guided image-to-video character animations while preserving facial identity remains challenging, especially in complex and dynamic scenarios such as dancing, where precise identity consistency is crucial. Existing methods frequently encounter difficulties maintaining facial coherence due to misalignments between facial landmarks extracted from driving videos that provide head pose and expression cues and the facial geometry of the reference images. To address this limitation, we introduce the Facial Landmarks Transformation (FLT) method, which leverages a 3D Morphable Model to address this limitation. FLT converts 2D landmarks into a 3D face model, adjusts the 3D face model to align with the reference identity, and then transforms them back into 2D landmarks to guide the image-to-video generation process. This approach ensures accurate alignment with the reference facial geometry, enhancing the consistency between generated videos and reference images. Experimental results demonstrate that FLT effectively preserves facial identity, significantly improving pose-guided character animation models.

Related papers

MagicPortrait: Temporally Consistent Face Reenactment with 3D Geometric Guidance [23.69067438843687]
We propose a method for video face reenactment that integrates a 3D face parametric model into a latent diffusion framework. By utilizing the 3D face parametric model as motion guidance, our method enables parametric alignment of face identity between the reference image and the motion captured from the driving video.
arXiv Detail & Related papers (2025-04-30T10:30:46Z)
FantasyID: Face Knowledge Enhanced ID-Preserving Video Generation [12.894864326299544]
We present a novel tuning-free IPT2V framework by enhancing face knowledge of the pre-trained video model built on diffusion transformers (DiT) In this work, we present a novel tuning-free IPT2V framework by enhancing face knowledge of the pre-trained video model built on diffusion transformers (DiT)
arXiv Detail & Related papers (2025-02-19T06:50:27Z)
G3FA: Geometry-guided GAN for Face Animation [14.488117084637631]
We introduce Geometry-guided GAN for Face Animation (G3FA) to tackle this limitation. Our novel approach empowers the face animation model to incorporate 3D information using only 2D images. In our face reenactment model, we leverage 2D motion warping to capture motion dynamics.
arXiv Detail & Related papers (2024-08-23T13:13:24Z)
ImFace++: A Sophisticated Nonlinear 3D Morphable Face Model with Implicit Neural Representations [25.016000421755162]
This paper presents a novel 3D morphable face model, named ImFace++, to learn a sophisticated and continuous space with implicit neural representations. ImFace++ first constructs two explicitly disentangled deformation fields to model complex shapes associated with identities and expressions. A refinement displacement field within the template space is further incorporated, enabling fine-grained learning of individual-specific facial details.
arXiv Detail & Related papers (2023-12-07T03:53:53Z)
Identity-Preserving Talking Face Generation with Landmark and Appearance Priors [106.79923577700345]
Existing person-generic methods have difficulty in generating realistic and lip-synced videos. We propose a two-stage framework consisting of audio-to-landmark generation and landmark-to-video rendering procedures. Our method can produce more realistic, lip-synced, and identity-preserving videos than existing person-generic talking face generation methods.
arXiv Detail & Related papers (2023-05-15T01:31:32Z)
Face Transformer: Towards High Fidelity and Accurate Face Swapping [54.737909435708936]
Face swapping aims to generate swapped images that fuse the identity of source faces and the attributes of target faces. This paper presents Face Transformer, a novel face swapping network that can accurately preserve source identities and target attributes simultaneously.
arXiv Detail & Related papers (2023-04-05T15:51:44Z)
HiFace: High-Fidelity 3D Face Reconstruction by Learning Static and Dynamic Details [66.74088288846491]
HiFace aims at high-fidelity 3D face reconstruction with dynamic and static details. We exploit several loss functions to jointly learn the coarse shape and fine details with both synthetic and real-world datasets.
arXiv Detail & Related papers (2023-03-20T16:07:02Z)
MorphGANFormer: Transformer-based Face Morphing and De-Morphing [55.211984079735196]
StyleGAN-based approaches to face morphing are among the leading techniques. We propose a transformer-based alternative to face morphing and demonstrate its superiority to StyleGAN-based methods.
arXiv Detail & Related papers (2023-02-18T19:09:11Z)
F3A-GAN: Facial Flow for Face Animation with Generative Adversarial Networks [24.64246570503213]
We propose a novel representation based on a 3D geometric flow, termed facial flow, to represent the natural motion of the human face at any pose. In order to utilize the facial flow for face editing, we build a framework generating continuous images with conditional facial flows.
arXiv Detail & Related papers (2022-05-12T16:40:27Z)
SAFA: Structure Aware Face Animation [9.58882272014749]
We propose a structure aware face animation (SAFA) method which constructs specific geometric structures to model different components of a face image. We use a 3D morphable model (3DMM) to model the face, multiple affine transforms to model the other foreground components like hair and beard, and an identity transform to model the background. The 3DMM geometric embedding not only helps generate realistic structure for the driving scene, but also contributes to better perception of occluded area in the generated image.
arXiv Detail & Related papers (2021-11-09T03:22:38Z)
MOST-GAN: 3D Morphable StyleGAN for Disentangled Face Image Manipulation [69.35523133292389]
We propose a framework that a priori models physical attributes of the face explicitly, thus providing disentanglement by design. Our method, MOST-GAN, integrates the expressive power and photorealism of style-based GANs with the physical disentanglement and flexibility of nonlinear 3D morphable models. It achieves photorealistic manipulation of portrait images with fully disentangled 3D control over their physical attributes, enabling extreme manipulation of lighting, facial expression, and pose variations up to full profile view.
arXiv Detail & Related papers (2021-11-01T15:53:36Z)
UniFaceGAN: A Unified Framework for Temporally Consistent Facial Video Editing [78.26925404508994]
We propose a unified temporally consistent facial video editing framework termed UniFaceGAN. Our framework is designed to handle face swapping and face reenactment simultaneously. Compared with the state-of-the-art facial image editing methods, our framework generates video portraits that are more photo-realistic and temporally smooth.
arXiv Detail & Related papers (2021-08-12T10:35:22Z)
HifiFace: 3D Shape and Semantic Prior Guided High Fidelity Face Swapping [116.1022638063613]
We propose HifiFace, which can preserve the face shape of the source face and generate photo-realistic results. We introduce the Semantic Facial Fusion module to optimize the combination of encoder and decoder features.
arXiv Detail & Related papers (2021-06-18T07:39:09Z)
Learning to Aggregate and Personalize 3D Face from In-the-Wild Photo Collection [65.92058628082322]
Non-parametric face modeling aims to reconstruct 3D face only from images without shape assumptions. This paper presents a novel Learning to Aggregate and Personalize framework for unsupervised robust 3D face modeling.
arXiv Detail & Related papers (2021-06-15T03:10:17Z)
Image-to-Video Generation via 3D Facial Dynamics [78.01476554323179]
We present a versatile model, FaceAnime, for various video generation tasks from still images. Our model is versatile for various AR/VR and entertainment applications, such as face video and face video prediction.
arXiv Detail & Related papers (2021-05-31T02:30:11Z)
Learning an Animatable Detailed 3D Face Model from In-The-Wild Images [50.09971525995828]
We present the first approach to jointly learn a model with animatable detail and a detailed 3D face regressor from in-the-wild images. Our DECA model is trained to robustly produce a UV displacement map from a low-dimensional latent representation. We introduce a novel detail-consistency loss to disentangle person-specific details and expression-dependent wrinkles.
arXiv Detail & Related papers (2020-12-07T19:30:45Z)
Personalized Face Modeling for Improved Face Reconstruction and Motion Retargeting [22.24046752858929]
We propose an end-to-end framework that jointly learns a personalized face model per user and per-frame facial motion parameters. Specifically, we learn user-specific expression blendshapes and dynamic (expression-specific) albedo maps by predicting personalized corrections. Experimental results show that our personalization accurately captures fine-grained facial dynamics in a wide range of conditions.
arXiv Detail & Related papers (2020-07-14T01:30:14Z)

This list is automatically generated from the titles and abstracts of the papers in this site.