Geometry Driven Progressive Warping for One-Shot Face Animation
- URL: http://arxiv.org/abs/2210.02391v1
- Date: Wed, 5 Oct 2022 17:07:06 GMT
- Title: Geometry Driven Progressive Warping for One-Shot Face Animation
- Authors: Yatao Zhong, Faezeh Amjadi, Ilya Zharkov
- Abstract summary: Face animation aims at creating photo-realistic portrait videos with animated poses and expressions.
We present a geometry-driven model and propose two geometric patterns as guidance: 3D face rendered displacement maps and posed neural codes.
We show that the proposed model can synthesize portrait videos with high fidelity and achieve new state-of-the-art results on the VoxCeleb1 and VoxCeleb2 datasets.
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Face animation aims at creating photo-realistic portrait videos with animated
poses and expressions. A common practice is to generate displacement fields
that are used to warp pixels and features from source to target. However, prior
attempts often produce sub-optimal displacements. In this work, we present a
geometry-driven model and propose two geometric patterns as guidance: 3D face
rendered displacement maps and posed neural codes. The model can optionally use
one of the patterns as guidance for displacement estimation. To model
displacements at locations not covered by the face model (e.g., hair), we
resort to source image features for contextual information and propose a
progressive warping module that alternates between feature warping and
displacement estimation at increasing resolutions. We show that the proposed
model can synthesize portrait videos with high fidelity and achieve new
state-of-the-art results on the VoxCeleb1 and VoxCeleb2 datasets for both cross
identity and same identity reconstruction.
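The progressive warping module described in the abstract alternates between warping source features and refining the displacement field as resolution increases. Below is a minimal NumPy sketch of that coarse-to-fine loop, under stated assumptions: `estimate_disp` stands in for the paper's learned displacement-estimation network, `feat_pyramid` for a coarse-to-fine pyramid of source-image features, and nearest-neighbor warping replaces learned resampling; all names are illustrative, not the authors' implementation.

```python
import numpy as np

def warp(feat, disp):
    """Backward-warp a feature map (H, W, C) by a displacement field (H, W, 2).

    Nearest-neighbor sampling keeps the sketch simple; a real model would use
    differentiable bilinear sampling.
    """
    H, W, _ = feat.shape
    ys, xs = np.meshgrid(np.arange(H), np.arange(W), indexing="ij")
    src_y = np.clip(np.round(ys + disp[..., 0]).astype(int), 0, H - 1)
    src_x = np.clip(np.round(xs + disp[..., 1]).astype(int), 0, W - 1)
    return feat[src_y, src_x]

def upsample2x(x):
    """Nearest-neighbor 2x spatial upsampling."""
    return np.repeat(np.repeat(x, 2, axis=0), 2, axis=1)

def progressive_warp(feat_pyramid, estimate_disp):
    """Alternate feature warping and displacement estimation, coarse to fine.

    feat_pyramid: list of (H, W, C) source feature maps, coarsest first.
    estimate_disp: callable mapping a warped feature map to a residual
    displacement field (H, W, 2) -- a stand-in for the learned estimator.
    """
    disp = np.zeros((*feat_pyramid[0].shape[:2], 2))
    for feat in feat_pyramid:
        warped = warp(feat, disp)                 # warp with current estimate
        disp = disp + estimate_disp(warped)       # refine at this resolution
        if feat is not feat_pyramid[-1]:
            disp = upsample2x(disp) * 2.0         # rescale for the next level
    return disp
```

The key design point the abstract highlights survives even in this toy form: because each refinement step sees warped source features, the estimator has contextual information at locations the 3D face model does not cover (e.g., hair).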
Related papers
- Animal Avatars: Reconstructing Animatable 3D Animals from Casual Videos [26.65191922949358]
We present a method to build animatable dog avatars from monocular videos.
This is challenging as animals display a range of (unpredictable) non-rigid movements and have a variety of appearance details.
We develop an approach that links the video frames via a 4D solution that jointly solves for the animal's pose variation and its appearance.
arXiv Detail & Related papers (2024-03-25T18:41:43Z)
- Multiple View Geometry Transformers for 3D Human Pose Estimation [35.26756920323391]
We aim to improve the 3D reasoning ability of Transformers in multi-view 3D human pose estimation.
We propose a novel hybrid model, MVGFormer, which has a series of geometric and appearance modules organized in an iterative manner.
arXiv Detail & Related papers (2023-11-18T06:32:40Z)
- Single-Shot Implicit Morphable Faces with Consistent Texture Parameterization [91.52882218901627]
We propose a novel method for constructing implicit 3D morphable face models that are both generalizable and intuitive for editing.
Our method improves upon photo-realism, geometry, and expression accuracy compared to state-of-the-art methods.
arXiv Detail & Related papers (2023-05-04T17:58:40Z)
- MoDA: Modeling Deformable 3D Objects from Casual Videos [84.29654142118018]
We propose neural dual quaternion blend skinning (NeuDBS) to achieve 3D point deformation without skin-collapsing artifacts.
In the endeavor to register 2D pixels across different frames, we establish a correspondence between canonical feature embeddings that encodes 3D points within the canonical space.
Our approach can reconstruct 3D models for humans and animals with better qualitative and quantitative performance than state-of-the-art methods.
arXiv Detail & Related papers (2023-04-17T13:49:04Z)
- One-Shot High-Fidelity Talking-Head Synthesis with Deformable Neural Radiance Field [81.07651217942679]
Talking head generation aims to generate faces that maintain the identity information of the source image and imitate the motion of the driving image.
We propose HiDe-NeRF, which achieves high-fidelity and free-view talking-head synthesis.
arXiv Detail & Related papers (2023-04-11T09:47:35Z)
- Neural Capture of Animatable 3D Human from Monocular Video [38.974181971541846]
We present a novel paradigm of building an animatable 3D human representation from a monocular video input, such that it can be rendered in any unseen poses and views.
Our method is based on a dynamic Neural Radiance Field (NeRF) rigged by a mesh-based parametric 3D human model serving as a geometry proxy.
arXiv Detail & Related papers (2022-08-18T09:20:48Z)
- Pixel2Mesh++: 3D Mesh Generation and Refinement from Multi-View Images [82.32776379815712]
We study the problem of shape generation in 3D mesh representation from a small number of color images with or without camera poses.
We further improve shape quality by leveraging cross-view information with a graph convolutional network.
Our model is robust to the quality of the initial mesh and the error of camera pose, and can be combined with a differentiable function for test-time optimization.
arXiv Detail & Related papers (2022-04-21T03:42:31Z)
- LiP-Flow: Learning Inference-time Priors for Codec Avatars via Normalizing Flows in Latent Space [90.74976459491303]
We introduce a prior model that is conditioned on the runtime inputs and tie this prior space to the 3D face model via a normalizing flow in the latent space.
A normalizing flow bridges the two representation spaces and transforms latent samples from one domain to another, allowing us to define a latent likelihood objective.
We show that our approach leads to an expressive and effective prior, capturing facial dynamics and subtle expressions better.
arXiv Detail & Related papers (2022-03-15T13:22:57Z)
- Learning an Animatable Detailed 3D Face Model from In-The-Wild Images [50.09971525995828]
We present the first approach to jointly learn a model with animatable detail and a detailed 3D face regressor from in-the-wild images.
Our DECA model is trained to robustly produce a UV displacement map from a low-dimensional latent representation.
We introduce a novel detail-consistency loss to disentangle person-specific details and expression-dependent wrinkles.
arXiv Detail & Related papers (2020-12-07T19:30:45Z)
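The MoDA entry above builds its deformation model on neural dual quaternion blend skinning, which extends classical dual quaternion blending (DQB). As background, here is a minimal NumPy sketch of plain DQB, not the paper's learned variant; the representation (8-vector: real quaternion then dual part) and all function names are illustrative assumptions.

```python
import numpy as np

def qmul(a, b):
    """Hamilton product of two quaternions in (w, x, y, z) order."""
    w1, x1, y1, z1 = a
    w2, x2, y2, z2 = b
    return np.array([
        w1*w2 - x1*x2 - y1*y2 - z1*z2,
        w1*x2 + x1*w2 + y1*z2 - z1*y2,
        w1*y2 - x1*z2 + y1*w2 + z1*x2,
        w1*z2 + x1*y2 - y1*x2 + z1*w2,
    ])

def dq_from_rt(q, t):
    """Build a unit dual quaternion (8-vector) from rotation quat q and translation t."""
    dual = 0.5 * qmul(np.array([0.0, *t]), q)
    return np.concatenate([q, dual])

def dqb(weights, dqs):
    """Dual quaternion blending: weighted sum of bone transforms,
    renormalized by the real part so the result is again a rigid motion."""
    blended = np.sum(weights[:, None] * dqs, axis=0)
    return blended / np.linalg.norm(blended[:4])

def dq_apply(dq, p):
    """Apply a unit dual quaternion to a 3D point: rotate, then translate."""
    qr, qd = dq[:4], dq[4:]
    conj = qr * np.array([1.0, -1.0, -1.0, -1.0])
    rotated = qmul(qmul(qr, np.array([0.0, *p])), conj)[1:]
    translation = 2.0 * qmul(qd, conj)[1:]
    return rotated + translation
```

Unlike linear blend skinning, the renormalization in `dqb` keeps the blended transform rigid, which is why DQB-style blending avoids the skin-collapsing artifacts the MoDA summary mentions.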
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this information and is not responsible for any consequences of its use.