GPAvatar: Generalizable and Precise Head Avatar from Image(s)
- URL: http://arxiv.org/abs/2401.10215v1
- Date: Thu, 18 Jan 2024 18:56:34 GMT
- Title: GPAvatar: Generalizable and Precise Head Avatar from Image(s)
- Authors: Xuangeng Chu, Yu Li, Ailing Zeng, Tianyu Yang, Lijian Lin, Yunfei Liu,
Tatsuya Harada
- Abstract summary: GPAvatar is a framework that reconstructs 3D head avatars from one or several images in a single forward pass.
The proposed method achieves faithful identity reconstruction, precise expression control, and multi-view consistency.
- Score: 71.555405205039
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Head avatar reconstruction, crucial for applications in virtual reality,
online meetings, gaming, and film industries, has garnered substantial
attention within the computer vision community. The fundamental objective of
this field is to faithfully recreate the head avatar and precisely control
expressions and postures. Existing methods, categorized into 2D-based warping,
mesh-based, and neural rendering approaches, present challenges in maintaining
multi-view consistency, incorporating non-facial information, and generalizing
to new identities. In this paper, we propose a framework named GPAvatar that
reconstructs 3D head avatars from one or several images in a single forward
pass. The key idea of this work is to introduce a dynamic point-based
expression field driven by a point cloud to precisely and effectively capture
expressions. Furthermore, we use a Multi Tri-planes Attention (MTA) fusion
module in the tri-planes canonical field to leverage information from multiple
input images. The proposed method achieves faithful identity reconstruction,
precise expression control, and multi-view consistency, demonstrating promising
results for free-viewpoint rendering and novel view synthesis.
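
The abstract names two components: a point-cloud-driven dynamic expression field and a Multi Tri-planes Attention (MTA) fusion module over per-image tri-planes. The sketch below is a minimal, illustrative PyTorch reading of both ideas, not the authors' code: the helper names (`sample_triplane`, `mta_fuse`, `expression_features`), the nearest-neighbor lookup into the driving point cloud, and all feature sizes are assumptions.

```python
# Illustrative sketch only: names, shapes, and the nearest-neighbor lookup
# are assumptions, not GPAvatar's actual implementation.
import torch
import torch.nn.functional as F


def sample_triplane(planes: torch.Tensor, xyz: torch.Tensor) -> torch.Tensor:
    """Bilinearly sample one tri-plane (3, C, H, W) at points xyz (N, 3) in [-1, 1].

    Returns per-point features (N, 3*C) by concatenating the XY, XZ, YZ planes.
    """
    coords = torch.stack(
        [xyz[:, [0, 1]], xyz[:, [0, 2]], xyz[:, [1, 2]]], dim=0
    )                                                    # (3, N, 2)
    grid = coords.unsqueeze(1)                           # (3, 1, N, 2) for grid_sample
    feats = F.grid_sample(planes, grid, align_corners=True)  # (3, C, 1, N)
    return feats.squeeze(2).permute(2, 0, 1).reshape(xyz.shape[0], -1)


def expression_features(points: torch.Tensor, point_feats: torch.Tensor,
                        xyz: torch.Tensor) -> torch.Tensor:
    """Query a point-based expression field driven by a point cloud.

    points: (P, 3) positions of the expression-driven point cloud;
    point_feats: (P, C) learned per-point features; xyz: (N, 3) query points.
    Nearest-neighbor lookup stands in for whatever interpolation the paper uses.
    """
    idx = torch.cdist(xyz, points).argmin(dim=1)         # nearest driving point
    return point_feats[idx]                              # (N, C)


def mta_fuse(per_image_feats: torch.Tensor, query: torch.Tensor) -> torch.Tensor:
    """Attention-style fusion over K input images' canonical tri-plane features.

    per_image_feats: (K, N, C); query: (N, C). Each query point softmax-weights
    the K candidate features, so every 3D location picks its best input views.
    """
    scores = torch.einsum("knc,nc->kn", per_image_feats, query)
    weights = torch.softmax(scores / per_image_feats.shape[-1] ** 0.5, dim=0)
    return torch.einsum("kn,knc->nc", weights, per_image_feats)


if __name__ == "__main__":
    K, C, N, P = 2, 32, 1024, 5023                       # P as in a FLAME mesh (illustrative)
    planes = torch.randn(K, 3, C, 64, 64)                # one tri-plane per input image
    xyz = torch.rand(N, 3) * 2 - 1                       # query points in [-1, 1]^3
    pts = torch.rand(P, 3) * 2 - 1                       # expression-driven point cloud
    pt_feats = torch.randn(P, 3 * C)
    query = expression_features(pts, pt_feats, xyz)      # (N, 3C)
    per_img = torch.stack([sample_triplane(p, xyz) for p in planes])  # (K, N, 3C)
    fused = mta_fuse(per_img, query)                     # (N, 3C)
```

One plausible reading of why attention fusion helps: each 3D query point can weight the input images differently, so a view that actually observes that region dominates the fused canonical feature.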
Related papers
- HR Human: Modeling Human Avatars with Triangular Mesh and High-Resolution Textures from Videos [52.23323966700072]
We present a framework for acquiring human avatars equipped with high-resolution, physically based material textures and a triangular mesh from monocular video.
Our method introduces a novel information fusion strategy that combines information from the monocular video with synthesized virtual multi-view images.
Experiments show that our approach outperforms previous representations in terms of fidelity, and the explicit mesh-and-texture result supports deployment on common triangular-mesh renderers.
arXiv Detail & Related papers (2024-05-18T11:49:09Z)
- GeneAvatar: Generic Expression-Aware Volumetric Head Avatar Editing from a Single Image [89.70322127648349]
We propose a generic avatar editing approach that can be universally applied to various 3DMM driving volumetric head avatars.
To achieve this goal, we design a novel expression-aware modification generative model, which lifts 2D editing from a single image to a consistent 3D modification field.
arXiv Detail & Related papers (2024-04-02T17:58:35Z)
- InvertAvatar: Incremental GAN Inversion for Generalized Head Avatars [40.10906393484584]
We propose a novel framework that enhances avatar reconstruction by aggregating information from multiple frames to increase fidelity.
Our architecture emphasizes pixel-aligned image-to-image translation, mitigating the need to learn correspondences between observation and canonical spaces.
The proposed paradigm demonstrates state-of-the-art performance on one-shot and few-shot avatar animation tasks.
arXiv Detail & Related papers (2023-12-03T18:59:15Z)
- NOFA: NeRF-based One-shot Facial Avatar Reconstruction [45.11455702291703]
3D facial avatar reconstruction has been a significant research topic in computer graphics and computer vision.
We propose a one-shot 3D facial avatar reconstruction framework that only requires a single source image to reconstruct a high-fidelity 3D facial avatar.
arXiv Detail & Related papers (2023-07-07T07:58:18Z)
- Generalizable One-shot Neural Head Avatar [90.50492165284724]
We present a method that reconstructs and animates a 3D head avatar from a single-view portrait image.
We propose a framework that not only generalizes to unseen identities based on a single-view image, but also captures characteristic details within and beyond the face area.
arXiv Detail & Related papers (2023-06-14T22:33:09Z)
- OTAvatar: One-shot Talking Face Avatar with Controllable Tri-plane Rendering [81.55960827071661]
Controllability, generalizability, and efficiency are the major objectives in constructing face avatars represented by neural implicit fields.
We propose One-shot Talking face Avatar (OTAvatar), which constructs face avatars via a generalized, controllable tri-plane rendering solution.
arXiv Detail & Related papers (2023-03-26T09:12:03Z)
- Vision Transformer for NeRF-Based View Synthesis from a Single Input Image [49.956005709863355]
We propose to leverage both the global and local features to form an expressive 3D representation.
To synthesize a novel view, we train a multilayer perceptron (MLP) network conditioned on the learned 3D representation to perform volume rendering (a minimal sketch of this step appears after this list).
Our method can render novel views from only a single input image and generalize across multiple object categories using a single model.
arXiv Detail & Related papers (2022-07-12T17:52:04Z)
- PVA: Pixel-aligned Volumetric Avatars [34.929560973779466]
We devise a novel approach for predicting volumetric avatars of the human head given just a small number of inputs.
Our approach is trained in an end-to-end manner solely based on a photometric re-rendering loss without requiring explicit 3D supervision.
arXiv Detail & Related papers (2021-01-07T18:58:46Z)
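
Several entries above rely on the same core machinery: an MLP conditioned on a learned 3D representation produces density and color, which are alpha-composited along camera rays, and training is supervised by a photometric re-rendering loss (as in the ViT-based view-synthesis and PVA entries). Below is a minimal sketch of that pipeline under stated assumptions: the `mlp` interface, the uniform sampling scheme, and the plain L2 loss are illustrative simplifications, not any one paper's method.

```python
# Minimal volume-rendering sketch; the mlp interface and uniform sampling
# are assumptions, not a specific paper's implementation.
import torch


def render_rays(mlp, rays_o, rays_d, cond, n_samples=64, near=0.1, far=2.0):
    """Alpha-composite colors along rays.

    mlp(x, cond) -> (sigma, rgb): density (N, S, 1) and color (N, S, 3) at
    sample points x (N, S, 3), conditioned on a learned 3D representation.
    """
    t = torch.linspace(near, far, n_samples, device=rays_o.device)   # (S,)
    x = rays_o[:, None, :] + t[None, :, None] * rays_d[:, None, :]   # (N, S, 3)
    sigma, rgb = mlp(x, cond)
    delta = (far - near) / n_samples                                 # uniform step
    alpha = 1.0 - torch.exp(-sigma.squeeze(-1) * delta)              # (N, S)
    # Transmittance: probability the ray reaches each sample unoccluded.
    trans = torch.cumprod(
        torch.cat([torch.ones_like(alpha[:, :1]), 1.0 - alpha], dim=1), dim=1
    )[:, :-1]
    weights = alpha * trans                                          # (N, S)
    return (weights[..., None] * rgb).sum(dim=1)                     # (N, 3)


def photometric_loss(pred_rgb, target_rgb):
    """L2 re-rendering loss against ground-truth pixel colors; this is the
    kind of supervision PVA describes in place of explicit 3D supervision."""
    return ((pred_rgb - target_rgb) ** 2).mean()
```

The constant step size and single uniform pass are simplifications; practical systems typically use stratified and hierarchical sampling, but the compositing weights above are the standard quadrature these methods share.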