High-Quality 3D Face Reconstruction with Affine Convolutional Networks
- URL: http://arxiv.org/abs/2310.14237v1
- Date: Sun, 22 Oct 2023 09:04:43 GMT
- Title: High-Quality 3D Face Reconstruction with Affine Convolutional Networks
- Authors: Zhiqian Lin, Jiangke Lin, Lincheng Li, Yi Yuan, Zhengxia Zou
- Abstract summary: In 3D face reconstruction, the spatial misalignment between the input image (e.g., a face) and the canonical/UV output makes the feature encoding-decoding process quite challenging.
We propose a new network architecture, Affine Convolution Networks, which enables CNN-based approaches to handle spatially non-corresponding input and output images.
Our method is parametric-free and can generate high-quality UV maps at a resolution of 512 × 512 pixels, while previous approaches normally generate maps of 256 × 256 pixels or smaller.
- Score: 21.761247036523606
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Recent works based on convolutional encoder-decoder architectures and 3DMM
parameterization have shown great potential for canonical view reconstruction
from a single input image. Conventional CNN architectures benefit from
exploiting the spatial correspondence between the input and output pixels.
However, in 3D face reconstruction, the spatial misalignment between the input
image (e.g., a face) and the canonical/UV output makes the feature
encoding-decoding process quite challenging. In this paper, to tackle this
problem, we propose a new network architecture, Affine Convolution Networks,
which enables CNN-based approaches to handle spatially non-corresponding input
and output images while maintaining high-fidelity output. In our method, an
affine transformation matrix is learned by the affine convolution layer for
each spatial location of the feature maps. In addition, we represent 3D human
heads in UV space with multiple components, including diffuse maps for texture
representation, position maps for geometry representation, and light maps for
recovering more complex real-world lighting conditions. All the components can
be trained without any manual annotations. Our method is parametric-free and
can generate high-quality UV maps at a resolution of 512 × 512 pixels, while
previous approaches normally generate maps of 256 × 256 pixels or smaller. Our
code will be released once the paper is accepted.
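To make the affine convolution idea concrete, here is a minimal sketch of how a layer that learns a per-location affine transform might look. It is an illustration drawn only from the abstract, not the authors' unreleased code; the class name `AffineConv2d`, the 1×1 prediction head, and the use of `grid_sample` for the warp are all assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class AffineConv2d(nn.Module):
    """Hypothetical sketch of an affine convolution layer: a 1x1 head
    predicts a 2x3 affine matrix for every spatial location, the input
    features are resampled through the per-location transforms, and a
    standard convolution runs on the warped features."""

    def __init__(self, in_ch: int, out_ch: int, kernel_size: int = 3):
        super().__init__()
        # 6 numbers per location = one 2x3 affine matrix per output pixel.
        self.affine_head = nn.Conv2d(in_ch, 6, kernel_size=1)
        self.conv = nn.Conv2d(in_ch, out_ch, kernel_size,
                              padding=kernel_size // 2)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, _, h, w = x.shape
        # (B, 6, H, W) -> (B, H, W, 2, 3): one affine matrix per location.
        theta = self.affine_head(x).permute(0, 2, 3, 1).reshape(b, h, w, 2, 3)
        # Base grid of normalized coordinates in [-1, 1], homogeneous form.
        ys, xs = torch.meshgrid(
            torch.linspace(-1, 1, h, device=x.device),
            torch.linspace(-1, 1, w, device=x.device),
            indexing="ij",
        )
        base = torch.stack([xs, ys, torch.ones_like(xs)], dim=-1)  # (H, W, 3)
        # Each output location samples the input at its own transformed
        # coordinate, so input and output need not be spatially aligned.
        grid = torch.einsum("bhwij,hwj->bhwi", theta, base)  # (B, H, W, 2)
        warped = F.grid_sample(x, grid, align_corners=True)
        return self.conv(warped)
```

If `affine_head` were initialized so that every location predicts the identity transform (zero weights, bias [1, 0, 0, 0, 1, 0]), the layer would start out as an ordinary convolution and learn the spatial re-mappings from data.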
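In the same spirit, the multi-component UV representation can be pictured as parallel decoder heads over shared UV-space features. Everything below (head names, channel counts, the simple multiplicative shading) is a hypothetical reading of the abstract, which states only that diffuse, position, and light maps are predicted in UV space.

```python
import torch
import torch.nn as nn

class UVComponentHeads(nn.Module):
    """Illustrative only: three heads emit the UV-space components named
    in the abstract. The Lambertian-style composite (diffuse * light) is
    an assumption, not the paper's stated design."""

    def __init__(self, feat_ch: int = 64):
        super().__init__()
        self.diffuse_head = nn.Conv2d(feat_ch, 3, 1)   # RGB albedo texture
        self.position_head = nn.Conv2d(feat_ch, 3, 1)  # per-texel XYZ geometry
        self.light_head = nn.Conv2d(feat_ch, 3, 1)     # per-texel RGB lighting

    def forward(self, uv_feats: torch.Tensor) -> dict:
        diffuse = torch.sigmoid(self.diffuse_head(uv_feats))
        position = self.position_head(uv_feats)          # unbounded coordinates
        light = torch.sigmoid(self.light_head(uv_feats))
        # A shaded texture for self-supervised photometric losses could be
        # approximated as albedo modulated by per-texel lighting.
        return {"diffuse": diffuse, "position": position,
                "light": light, "shaded": diffuse * light}
```

For the 512 × 512 outputs described in the abstract, `uv_feats` would be a (B, 64, 512, 512) tensor.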
Related papers
- SeqTex: Generate Mesh Textures in Video Sequence [62.766839821764144]
We introduce SeqTex, a novel end-to-end framework for training 3D texture generative models.
We show that SeqTex achieves state-of-the-art performance on both image-conditioned and text-conditioned 3D texture generation tasks.
arXiv Detail & Related papers (2025-07-06T07:58:36Z)
- Quark: Real-time, High-resolution, and General Neural View Synthesis [14.614589047064191]
We present a novel neural algorithm for performing high-quality, high-resolution, real-time novel view synthesis.
From a sparse set of input RGB images or video streams, our network both reconstructs the 3D scene and renders novel views at 1080p resolution at 30 fps on an NVIDIA A100.
arXiv Detail & Related papers (2024-11-25T18:59:50Z)
- HQ3DAvatar: High Quality Controllable 3D Head Avatar [65.70885416855782]
This paper presents a novel approach to building highly photorealistic digital head avatars.
Our method learns a canonical space via an implicit function parameterized by a neural network.
At test time, our method is driven by a monocular RGB video.
arXiv Detail & Related papers (2023-03-25T13:56:33Z)
- TEGLO: High Fidelity Canonical Texture Mapping from Single-View Images [1.4502611532302039]
We propose TEGLO (Textured EG3D-GLO) for learning 3D representations from single-view in-the-wild image collections.
We accomplish this by training a conditional Neural Radiance Field (NeRF) without any explicit 3D supervision.
We demonstrate that such mapping enables texture transfer and texture editing without requiring meshes with shared topology.
arXiv Detail & Related papers (2023-03-24T01:52:03Z)
- High-fidelity 3D GAN Inversion by Pseudo-multi-view Optimization [51.878078860524795]
We present a high-fidelity 3D generative adversarial network (GAN) inversion framework that can synthesize photo-realistic novel views.
Our approach enables high-fidelity 3D rendering from a single image, which is promising for various applications of AI-generated 3D content.
arXiv Detail & Related papers (2022-11-28T18:59:52Z)
- FFHQ-UV: Normalized Facial UV-Texture Dataset for 3D Face Reconstruction [46.3392612457273]
This dataset contains over 50,000 high-quality texture UV maps with even illumination, neutral expressions, and cleaned facial regions.
Our pipeline utilizes the recent advances in StyleGAN-based facial image editing approaches.
Experiments show that our method improves the reconstruction accuracy over state-of-the-art approaches.
arXiv Detail & Related papers (2022-11-25T03:21:05Z)
- Deep Marching Tetrahedra: a Hybrid Representation for High-Resolution 3D Shape Synthesis [90.26556260531707]
DMTet is a conditional generative model that can synthesize high-resolution 3D shapes using simple user guides such as coarse voxels.
Unlike deep 3D generative models that directly generate explicit representations such as meshes, our model can synthesize shapes with arbitrary topology.
arXiv Detail & Related papers (2021-11-08T05:29:35Z)
- Neural Geometric Level of Detail: Real-time Rendering with Implicit 3D Shapes [77.6741486264257]
We introduce an efficient neural representation that, for the first time, enables real-time rendering of high-fidelity neural SDFs.
We show that our representation is 2-3 orders of magnitude more efficient in terms of rendering speed compared to previous works.
arXiv Detail & Related papers (2021-01-26T18:50:22Z)
- Learning Deformable Tetrahedral Meshes for 3D Reconstruction [78.0514377738632]
3D shape representations that accommodate learning-based 3D reconstruction are an open problem in machine learning and computer graphics.
Previous work on neural 3D reconstruction demonstrated benefits, but also limitations, of point cloud, voxel, surface mesh, and implicit function representations.
We introduce Deformable Tetrahedral Meshes (DefTet) as a particular parameterization that utilizes volumetric tetrahedral meshes for the reconstruction problem.
arXiv Detail & Related papers (2020-11-03T02:57:01Z)
- 3D Human Mesh Regression with Dense Correspondence [95.92326689172877]
Estimating the 3D mesh of the human body from a single 2D image is an important task with many applications such as augmented reality and human-robot interaction.
Prior works reconstructed the 3D mesh from global image features extracted by a convolutional neural network (CNN), where the dense correspondences between the mesh surface and the image pixels are missing.
This paper proposes a model-free 3D human mesh estimation framework, named DecoMR, which explicitly establishes the dense correspondence between the mesh and the local image features in the UV space.
arXiv Detail & Related papers (2020-06-10T08:50:53Z)
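The DecoMR entry above is the closest in spirit to the main paper, as both move reconstruction into UV space. As a rough illustration of dense image-to-UV feature transfer (hypothetical, not DecoMR's released code; the correspondence map `uv_to_xy` is an assumed input):

```python
import torch
import torch.nn.functional as F

def image_features_to_uv(feats: torch.Tensor,
                         uv_to_xy: torch.Tensor) -> torch.Tensor:
    """Hypothetical sketch of dense image-to-UV transfer: `feats` is
    (B, C, H, W) local image features, and `uv_to_xy` is (B, Hu, Wu, 2),
    giving for each UV texel the matching image coordinate in normalized
    [-1, 1] space. Returns a (B, C, Hu, Wu) feature map laid out in UV
    space, so later layers can predict geometry and texture per texel."""
    return F.grid_sample(feats, uv_to_xy, align_corners=True)
```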
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences of its use.