High-Quality Mesh Blendshape Generation from Face Videos via Neural
Inverse Rendering
- URL: http://arxiv.org/abs/2401.08398v1
- Date: Tue, 16 Jan 2024 14:41:31 GMT
- Title: High-Quality Mesh Blendshape Generation from Face Videos via Neural
Inverse Rendering
- Authors: Xin Ming, Jiawei Li, Jingwang Ling, Libo Zhang and Feng Xu
- Abstract summary: We introduce a novel technique that reconstructs mesh-based blendshape rigs from single or sparse multi-view videos.
Experiments demonstrate that, with the flexible input of single or sparse multi-view videos, we reconstruct personalized high-fidelity blendshapes.
- Score: 16.10286515544742
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Readily editable mesh blendshapes have been widely used in animation
pipelines, while recent advancements in neural geometry and appearance
representations have enabled high-quality inverse rendering. Building upon
these observations, we introduce a novel technique that reconstructs mesh-based
blendshape rigs from single or sparse multi-view videos, leveraging
state-of-the-art neural inverse rendering. We begin by constructing a
deformation representation that parameterizes vertex displacements into
differential coordinates with tetrahedral connections, allowing for
high-quality vertex deformation on high-resolution meshes. By constructing a
set of semantic regulations in this representation, we achieve joint
optimization of blendshapes and expression coefficients. Furthermore, to enable
a user-friendly multi-view setup with unsynchronized cameras, we propose a
neural regressor to model time-varying motion parameters. This approach
implicitly considers the time difference across multiple cameras, enhancing the
accuracy of motion modeling. Experiments demonstrate that, with the flexible
input of single or sparse multi-view videos, we reconstruct personalized
high-fidelity blendshapes. These blendshapes are both geometrically and
semantically accurate, and they are compatible with industrial animation
pipelines. Code and data will be released.
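For readers unfamiliar with the output format, here is a minimal sketch of how a mesh blendshape rig is commonly evaluated with the standard linear (delta) model: the posed mesh is the neutral mesh plus a coefficient-weighted sum of per-vertex offsets. This is not the paper's reconstruction method, which optimizes the blendshapes through differential coordinates and neural inverse rendering. The function name `evaluate_blendshapes` and the toy targets `jaw_open` and `smile` are hypothetical, chosen only for illustration.

```python
from typing import List, Sequence, Tuple

Vertex = Tuple[float, float, float]


def evaluate_blendshapes(
    neutral: Sequence[Vertex],
    targets: Sequence[Sequence[Vertex]],
    coefficients: Sequence[float],
) -> List[Vertex]:
    """Return neutral + sum_i c_i * (target_i - neutral), per vertex."""
    if len(targets) != len(coefficients):
        raise ValueError("need exactly one coefficient per blendshape target")
    for target in targets:
        if len(target) != len(neutral):
            raise ValueError("every target must share the neutral mesh topology")

    result = []
    for v_idx, base in enumerate(neutral):
        # Accumulate the weighted per-vertex offsets from the neutral pose.
        offset = [0.0, 0.0, 0.0]
        for target, c in zip(targets, coefficients):
            for axis in range(3):
                offset[axis] += c * (target[v_idx][axis] - base[axis])
        result.append(tuple(base[axis] + offset[axis] for axis in range(3)))
    return result


# Toy two-vertex "mesh" with two hypothetical expression targets.
neutral_mesh = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
jaw_open = [(0.0, -1.0, 0.0), (1.0, -1.0, 0.0)]
smile = [(0.0, 0.0, 0.0), (1.5, 0.5, 0.0)]

posed = evaluate_blendshapes(neutral_mesh, [jaw_open, smile], [0.5, 1.0])
print(posed)
```

Because every target shares the neutral mesh's topology, an animator can edit a single target or keyframe a single coefficient without touching the rest of the rig, which is what makes this representation readily editable in animation pipelines.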
Related papers
- D-NPC: Dynamic Neural Point Clouds for Non-Rigid View Synthesis from Monocular Video [53.83936023443193]
This paper introduces a new method for dynamic novel view synthesis from monocular video, such as smartphone captures.
Our approach represents the scene as a dynamic neural point cloud, an implicit time-conditioned point cloud that encodes local geometry and appearance in separate hash-encoded neural feature grids.
arXiv Detail & Related papers (2024-06-14T14:35:44Z)
- HR Human: Modeling Human Avatars with Triangular Mesh and High-Resolution Textures from Videos [52.23323966700072]
We present a framework for acquiring human avatars with high-resolution physically-based material textures and a triangular mesh from monocular video.
Our method introduces a novel information fusion strategy to combine the information from the monocular video and synthesize virtual multi-view images.
Experiments show that our approach outperforms previous representations in terms of high fidelity, and this explicit result supports deployment on common triangulars.
arXiv Detail & Related papers (2024-05-18T11:49:09Z)
- Learning Dynamic Tetrahedra for High-Quality Talking Head Synthesis [31.90503003079933]
We introduce Dynamic Tetrahedra (DynTet), a novel hybrid representation that encodes explicit dynamic meshes by neural networks.
Compared with prior works, DynTet demonstrates significant improvements in fidelity, lip synchronization, and real-time performance according to various metrics.
arXiv Detail & Related papers (2024-02-27T09:56:15Z)
- Hyper-VolTran: Fast and Generalizable One-Shot Image to 3D Object Structure via HyperNetworks [53.67497327319569]
We introduce a novel neural rendering technique to solve image-to-3D from a single view.
Our approach employs the signed distance function as the surface representation and incorporates generalizable priors through geometry-encoding volumes and HyperNetworks.
Our experiments show the advantages of our proposed approach with consistent results and rapid generation.
arXiv Detail & Related papers (2023-12-24T08:42:37Z)
- NeuManifold: Neural Watertight Manifold Reconstruction with Efficient and High-Quality Rendering Support [45.68296352822415]
We present a method for generating high-quality watertight manifold meshes from multi-view input images.
Our method combines the benefits of neural fields and mesh-based optimization: it takes the geometry obtained from neural fields, then further optimizes that geometry along with a compact neural texture representation.
arXiv Detail & Related papers (2023-05-26T17:59:21Z)
- BakedSDF: Meshing Neural SDFs for Real-Time View Synthesis [42.93055827628597]
We present a method for reconstructing high-quality meshes of large real-world scenes suitable for photorealistic novel view synthesis.
We first optimize a hybrid neural volume-surface scene representation designed to have well-behaved level sets that correspond to surfaces in the scene.
We then bake this representation into a high-quality triangle mesh, which we equip with a simple and fast view-dependent appearance model based on spherical Gaussians.
arXiv Detail & Related papers (2023-02-28T18:58:03Z)
- NeuPhysics: Editable Neural Geometry and Physics from Monocular Videos [82.74918564737591]
We present a method for learning 3D geometry and physics parameters of a dynamic scene from only a monocular RGB video input.
Experiments show that our method achieves superior mesh and video reconstruction of dynamic scenes compared to competing Neural Field approaches.
arXiv Detail & Related papers (2022-10-22T04:57:55Z)
- Human Performance Modeling and Rendering via Neural Animated Mesh [40.25449482006199]
We bridge the traditional mesh with a new class of neural rendering.
In this paper, we present a novel approach for rendering human views from video.
We demonstrate our approach on various platforms, inserting virtual human performances into AR headsets.
arXiv Detail & Related papers (2022-09-18T03:58:00Z)
- Pixel2Mesh++: 3D Mesh Generation and Refinement from Multi-View Images [82.32776379815712]
We study the problem of shape generation in 3D mesh representation from a small number of color images with or without camera poses.
We further improve the shape quality by leveraging cross-view information with a graph convolution network.
Our model is robust to the quality of the initial mesh and the error of camera pose, and can be combined with a differentiable function for test-time optimization.
arXiv Detail & Related papers (2022-04-21T03:42:31Z)
- Extracting Triangular 3D Models, Materials, and Lighting From Images [59.33666140713829]
We present an efficient method for joint optimization of materials and lighting from multi-view image observations.
We leverage meshes with spatially-varying materials and environment lighting that can be deployed in any traditional graphics engine.
arXiv Detail & Related papers (2021-11-24T13:58:20Z)
This list is automatically generated from the titles and abstracts of the papers on this site.