High-Quality Mesh Blendshape Generation from Face Videos via Neural Inverse Rendering
- URL: http://arxiv.org/abs/2401.08398v2
- Date: Tue, 20 Aug 2024 03:22:34 GMT
- Title: High-Quality Mesh Blendshape Generation from Face Videos via Neural Inverse Rendering
- Authors: Xin Ming, Jiawei Li, Jingwang Ling, Libo Zhang, Feng Xu,
- Abstract summary: We introduce a novel technique that reconstructs mesh-based blendshape rigs from single or sparse multi-view videos.
Experiments demonstrate that, with the flexible input of single or sparse multi-view videos, we reconstruct personalized high-fidelity blendshapes.
- Score: 15.009484906668737
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Readily editable mesh blendshapes have been widely used in animation pipelines, while recent advancements in neural geometry and appearance representations have enabled high-quality inverse rendering. Building upon these observations, we introduce a novel technique that reconstructs mesh-based blendshape rigs from single or sparse multi-view videos, leveraging state-of-the-art neural inverse rendering. We begin by constructing a deformation representation that parameterizes vertex displacements into differential coordinates with tetrahedral connections, allowing for high-quality vertex deformation on high-resolution meshes. By constructing a set of semantic regulations in this representation, we achieve joint optimization of blendshapes and expression coefficients. Furthermore, to enable a user-friendly multi-view setup with unsynchronized cameras, we propose a neural regressor to model time-varying motion parameters. This approach implicitly considers the time difference across multiple cameras, enhancing the accuracy of motion modeling. Experiments demonstrate that, with the flexible input of single or sparse multi-view videos, we reconstruct personalized high-fidelity blendshapes. These blendshapes are both geometrically and semantically accurate, and they are compatible with industrial animation pipelines. Code and data are available at https://github.com/grignarder/high-quality-blendshape-generation.
Related papers
- DreamMesh4D: Video-to-4D Generation with Sparse-Controlled Gaussian-Mesh Hybrid Representation [10.250715657201363]
We introduce DreamMesh4D, a novel framework combining mesh representation with geometric skinning technique to generate high-quality 4D object from a monocular video.
Our method is compatible with modern graphic pipelines, showcasing its potential in the 3D gaming and film industry.
arXiv Detail & Related papers (2024-10-09T10:41:08Z) - HR Human: Modeling Human Avatars with Triangular Mesh and High-Resolution Textures from Videos [52.23323966700072]
We present a framework for acquiring human avatars that are attached with high-resolution physically-based material textures and mesh from monocular video.
Our method introduces a novel information fusion strategy to combine the information from the monocular video and synthesize virtual multi-view images.
Experiments show that our approach outperforms previous representations in terms of high fidelity, and this explicit result supports deployment on common triangulars.
arXiv Detail & Related papers (2024-05-18T11:49:09Z) - Hyper-VolTran: Fast and Generalizable One-Shot Image to 3D Object
Structure via HyperNetworks [53.67497327319569]
We introduce a novel neural rendering technique to solve image-to-3D from a single view.
Our approach employs the signed distance function as the surface representation and incorporates generalizable priors through geometry-encoding volumes and HyperNetworks.
Our experiments show the advantages of our proposed approach with consistent results and rapid generation.
arXiv Detail & Related papers (2023-12-24T08:42:37Z) - FLARE: Fast Learning of Animatable and Relightable Mesh Avatars [64.48254296523977]
Our goal is to efficiently learn personalized animatable 3D head avatars from videos that are geometrically accurate, realistic, relightable, and compatible with current rendering systems.
We introduce FLARE, a technique that enables the creation of animatable and relightable avatars from a single monocular video.
arXiv Detail & Related papers (2023-10-26T16:13:00Z) - NeuManifold: Neural Watertight Manifold Reconstruction with Efficient
and High-Quality Rendering Support [45.68296352822415]
We present a method for generating high-quality watertight manifold meshes from multi-view input images.
Our method combines the benefits of both worlds; we take the geometry obtained from neural fields, and further optimize the geometry as well as a compact neural texture representation.
arXiv Detail & Related papers (2023-05-26T17:59:21Z) - BakedSDF: Meshing Neural SDFs for Real-Time View Synthesis [42.93055827628597]
We present a method for reconstructing high-quality meshes of large real-world scenes suitable for photorealistic novel view synthesis.
We first optimize a hybrid neural volume-surface scene representation designed to have well-behaved level sets that correspond to surfaces in the scene.
We then bake this representation into a high-quality triangle mesh, which we equip with a simple and fast view-dependent appearance model based on spherical Gaussians.
arXiv Detail & Related papers (2023-02-28T18:58:03Z) - NeuPhysics: Editable Neural Geometry and Physics from Monocular Videos [82.74918564737591]
We present a method for learning 3D geometry and physics parameters of a dynamic scene from only a monocular RGB video input.
Experiments show that our method achieves superior mesh and video reconstruction of dynamic scenes compared to competing Neural Field approaches.
arXiv Detail & Related papers (2022-10-22T04:57:55Z) - Human Performance Modeling and Rendering via Neural Animated Mesh [40.25449482006199]
We bridge the traditional mesh with a new class of neural rendering.
In this paper, we present a novel approach for rendering human views from video.
We demonstrate our approach on various platforms, inserting virtual human performances into AR headsets.
arXiv Detail & Related papers (2022-09-18T03:58:00Z) - Pixel2Mesh++: 3D Mesh Generation and Refinement from Multi-View Images [82.32776379815712]
We study the problem of shape generation in 3D mesh representation from a small number of color images with or without camera poses.
We adopt to further improve the shape quality by leveraging cross-view information with a graph convolution network.
Our model is robust to the quality of the initial mesh and the error of camera pose, and can be combined with a differentiable function for test-time optimization.
arXiv Detail & Related papers (2022-04-21T03:42:31Z) - Extracting Triangular 3D Models, Materials, and Lighting From Images [59.33666140713829]
We present an efficient method for joint optimization of materials and lighting from multi-view image observations.
We leverage meshes with spatially-varying materials and environment that can be deployed in any traditional graphics engine.
arXiv Detail & Related papers (2021-11-24T13:58:20Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.