OFER: Occluded Face Expression Reconstruction
- URL: http://arxiv.org/abs/2410.21629v1
- Date: Tue, 29 Oct 2024 00:21:26 GMT
- Title: OFER: Occluded Face Expression Reconstruction
- Authors: Pratheba Selvaraju, Victoria Fernandez Abrevaya, Timo Bolkart, Rick Akkerman, Tianyu Ding, Faezeh Amjadi, Ilya Zharkov
- Abstract summary: We introduce OFER, a novel approach for single image 3D face reconstruction that can generate plausible, diverse, and expressive 3D faces.
We propose a novel ranking mechanism that sorts the outputs of the shape diffusion network based on the predicted shape accuracy scores to select the best match.
- Score: 16.06622406877353
- License:
- Abstract: Reconstructing 3D face models from a single image is an inherently ill-posed problem, which becomes even more challenging in the presence of occlusions. In addition to fewer available observations, occlusions introduce an extra source of ambiguity, where multiple reconstructions can be equally valid. Despite the ubiquity of the problem, very few methods address its multi-hypothesis nature. In this paper we introduce OFER, a novel approach for single image 3D face reconstruction that can generate plausible, diverse, and expressive 3D faces, even under strong occlusions. Specifically, we train two diffusion models to generate the shape and expression coefficients of a face parametric model, conditioned on the input image. This approach captures the multi-modal nature of the problem, generating a distribution of solutions as output. Although this addresses the ambiguity problem, the challenge remains to pick the best matching shape to ensure consistency across diverse expressions. To achieve this, we propose a novel ranking mechanism that sorts the outputs of the shape diffusion network based on the predicted shape accuracy scores to select the best match. We evaluate our method using standard benchmarks and introduce CO-545, a new protocol and dataset designed to assess the accuracy of expressive faces under occlusion. Our results show improved performance over occlusion-based methods, with added ability to generate multiple expressions for a given image.
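The pipeline the abstract describes — sample many shape hypotheses from a diffusion model, rank them by predicted accuracy, then pair the winning shape with several sampled expressions — can be summarized in a short control-flow sketch. Everything below is a stand-in: the stub samplers and scorer only mimic the interfaces of the paper's image-conditioned diffusion networks and ranking model, and the coefficient dimensions are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)

# --- Stand-in stubs; the real networks are image-conditioned diffusion models ---

def sample_shape_coeffs(image_feat, n_samples, dim=100):
    """Stub for the shape diffusion network: one draw per candidate."""
    return rng.normal(size=(n_samples, dim)) * 0.1

def sample_expression_coeffs(image_feat, n_samples, dim=50):
    """Stub for the expression diffusion network."""
    return rng.normal(size=(n_samples, dim)) * 0.1

def predict_accuracy_score(image_feat, shape_coeffs):
    """Stub for the ranking network: higher score = better predicted fit."""
    return -np.linalg.norm(shape_coeffs, axis=-1)  # toy proxy

def reconstruct(image_feat, n_shape=16, n_expr=5):
    """OFER-style control flow: rank shape samples, then pair the best
    shape with several expression samples for diverse reconstructions."""
    shapes = sample_shape_coeffs(image_feat, n_shape)
    scores = predict_accuracy_score(image_feat, shapes)
    best_shape = shapes[np.argmax(scores)]            # one consistent identity
    expressions = sample_expression_coeffs(image_feat, n_expr)
    return [(best_shape, e) for e in expressions]     # diverse expressions

if __name__ == "__main__":
    image_feat = rng.normal(size=256)                 # placeholder image embedding
    hypotheses = reconstruct(image_feat)
    print(f"{len(hypotheses)} reconstructions sharing one ranked shape")
```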
Related papers
- Multistable Shape from Shading Emerges from Patch Diffusion [17.090405682103167]
We introduce a model that reconstructs a multimodal distribution of shapes from a single shading image.
We show that multistable shape explanations emerge from this model for ambiguous test images that humans experience as being multistable.
This may inspire new architectures for 3D shape perception that are more efficient and better aligned with human experience.
arXiv Detail & Related papers (2024-05-23T13:15:24Z)
- $PC^2$: Projection-Conditioned Point Cloud Diffusion for Single-Image 3D Reconstruction [97.06927852165464]
Reconstructing the 3D shape of an object from a single RGB image is a long-standing and highly challenging problem in computer vision.
We propose a novel method for single-image 3D reconstruction which generates a sparse point cloud via a conditional denoising diffusion process.
arXiv Detail & Related papers (2023-02-21T13:37:07Z)
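For intuition, here is a minimal sketch of projection-conditioned sampling as the $PC^2$ summary describes it: image features are re-projected onto the partially denoised point cloud at every reverse step. The projection, denoiser, and update rule are toy stand-ins, not the paper's actual networks or noise schedule.

```python
import numpy as np

rng = np.random.default_rng(0)

def project_image_features(points, image_feat):
    """Hypothetical stand-in: look up an image feature for each 3D point
    by projecting it into the image plane (here: a fixed toy feature)."""
    return np.tile(image_feat[:4], (points.shape[0], 1))

def denoise_step(points, point_feats, t):
    """Stub for the learned denoiser: predicts noise from points + features."""
    return 0.1 * points  # toy: nudge points toward the origin

def sample_point_cloud(image_feat, n_points=1024, n_steps=50):
    """Reverse diffusion over point coordinates, re-projecting image
    features onto the current (partially denoised) cloud at every step."""
    points = rng.normal(size=(n_points, 3))           # start from pure noise
    for t in reversed(range(n_steps)):
        feats = project_image_features(points, image_feat)
        eps = denoise_step(points, feats, t)
        points = points - eps                         # toy update rule
    return points

cloud = sample_point_cloud(rng.normal(size=64))
print(cloud.shape)  # (1024, 3)
```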
- HandFlow: Quantifying View-Dependent 3D Ambiguity in Two-Hand Reconstruction with Normalizing Flow [73.7895717883622]
We explicitly model the distribution of plausible reconstructions in a conditional normalizing flow framework.
We show that explicit ambiguity modeling is better-suited for this challenging problem.
arXiv Detail & Related papers (2022-10-04T15:42:22Z)
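The HandFlow entry above hinges on modeling a conditional distribution of plausible reconstructions with a normalizing flow. The sketch below shows the general mechanism with a single affine coupling layer, using random weights in place of a trained network; it is a generic illustration of conditional flows, not HandFlow's architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

class ConditionalAffineCoupling:
    """One affine coupling layer of a conditional normalizing flow.
    Half the variables pass through unchanged; the other half get a
    scale/shift predicted from the first half and the condition vector.
    Weights are random stand-ins for a trained network."""

    def __init__(self, dim, cond_dim):
        self.half = dim // 2
        self.W = rng.normal(size=(self.half + cond_dim, dim)) * 0.1

    def forward(self, z, cond):
        z1, z2 = z[: self.half], z[self.half :]
        h = np.concatenate([z1, cond]) @ self.W
        log_s, t = h[: self.half], h[self.half :]
        x2 = z2 * np.exp(log_s) + t
        return np.concatenate([z1, x2]), log_s.sum()  # output, log|det J|

def sample_pose(cond, dim=4):
    """Draw one plausible reconstruction and its exact log-density."""
    layer = ConditionalAffineCoupling(dim, cond.size)
    z = rng.normal(size=dim)
    x, log_det = layer.forward(z, cond)
    log_prob = -0.5 * (z @ z + dim * np.log(2 * np.pi)) - log_det
    return x, log_prob

x, lp = sample_pose(rng.normal(size=8))
print(x.shape, round(lp, 3))
```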
- Image Generation with Multimodal Priors using Denoising Diffusion Probabilistic Models [54.1843419649895]
A major challenge in using generative models to accomplish this task is the lack of paired data containing all modalities and corresponding outputs.
We propose a solution based on denoising diffusion probabilistic models to generate images under multimodal priors.
arXiv Detail & Related papers (2022-06-10T12:23:05Z)
- Deblurring via Stochastic Refinement [85.42730934561101]
We present an alternative framework for blind deblurring based on conditional diffusion models.
Our method is competitive in terms of distortion metrics such as PSNR.
arXiv Detail & Related papers (2021-12-05T04:36:09Z)
- Generating Diverse 3D Reconstructions from a Single Occluded Face Image [18.073864874996534]
We present Diverse3DFace, which is designed to simultaneously generate a diverse and realistic set of 3D reconstructions from a single occluded face image.
On face images occluded by masks, glasses, and other random objects, Diverse3DFace generates a distribution of 3D shapes with 50% higher diversity in the occluded regions than the baselines.
arXiv Detail & Related papers (2021-12-01T23:13:49Z)
- Probabilistic Modeling for Human Mesh Recovery [73.11532990173441]
This paper focuses on the problem of 3D human reconstruction from 2D evidence.
We recast the problem as learning a mapping from the input to a distribution of plausible 3D poses.
arXiv Detail & Related papers (2021-08-26T17:55:11Z)
- Masked Linear Regression for Learning Local Receptive Fields for Facial Expression Synthesis [10.28711904929932]
We propose a constrained version of ridge regression that exploits the local and sparse structure of facial expressions.
In contrast to the existing approaches, our proposed model can be efficiently trained on larger image sizes.
The proposed algorithm is also compared with state-of-the-art GANs including Pix2Pix, CycleGAN, StarGAN and GANimation.
arXiv Detail & Related papers (2020-11-18T06:04:24Z)
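The masked regression idea above — each output dimension is regressed only from the inputs its mask allows, giving a local receptive field — has a direct closed-form sketch. The windowed masks and toy data below are illustrative assumptions, not the paper's setup.

```python
import numpy as np

rng = np.random.default_rng(0)

def masked_ridge(X, Y, masks, lam=1.0):
    """Fit one ridge regressor per output dimension, restricted to the
    input features allowed by that output's mask (a local receptive field).
    X: (n, d_in), Y: (n, d_out), masks: (d_out, d_in) boolean."""
    d_out = Y.shape[1]
    W = np.zeros((masks.shape[1], d_out))
    for j in range(d_out):
        idx = np.flatnonzero(masks[j])                # local support only
        Xj = X[:, idx]
        A = Xj.T @ Xj + lam * np.eye(idx.size)
        W[idx, j] = np.linalg.solve(A, Xj.T @ Y[:, j])  # closed-form ridge
    return W

# Toy data: each output pixel depends on a small window of inputs.
n, d = 200, 16
X = rng.normal(size=(n, d))
masks = np.zeros((d, d), dtype=bool)
for j in range(d):                                    # 3-wide local windows
    masks[j, max(0, j - 1) : min(d, j + 2)] = True
Y = X @ (masks.T * 0.5) + 0.01 * rng.normal(size=(n, d))
W = masked_ridge(X, Y, masks)
print(np.abs(W[~masks.T]).max())  # weights outside the mask stay zero
```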
- PaMIR: Parametric Model-Conditioned Implicit Representation for Image-based Human Reconstruction [67.08350202974434]
We propose Parametric Model-Conditioned Implicit Representation (PaMIR), which combines the parametric body model with the free-form deep implicit function.
We show that our method achieves state-of-the-art performance for image-based 3D human reconstruction in the cases of challenging poses and clothing types.
arXiv Detail & Related papers (2020-07-08T02:26:19Z)
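PaMIR's central idea — conditioning an implicit occupancy function on a feature derived from a fitted parametric body model — can be sketched as follows. The nearest-vertex distance used as the body-model feature and the tiny random-weight MLP are simplifications for illustration; the paper uses a learned volumetric feature and trained networks.

```python
import numpy as np

rng = np.random.default_rng(0)

def body_model_feature(p, body_verts):
    """Hypothetical stand-in for PaMIR's volumetric body-model feature:
    distance from the query point to the nearest parametric-model vertex."""
    return np.min(np.linalg.norm(body_verts - p, axis=1))

def occupancy(p, image_feat, body_verts, W1, W2):
    """Tiny implicit function: MLP over [point, image feature, body feature]."""
    x = np.concatenate([p, image_feat, [body_model_feature(p, body_verts)]])
    h = np.tanh(x @ W1)
    return 1.0 / (1.0 + np.exp(-(h @ W2)))            # occupancy in (0, 1)

# Random stand-in weights and a toy "fitted" body model.
body_verts = rng.normal(size=(100, 3))
image_feat = rng.normal(size=8)
W1 = rng.normal(size=(3 + 8 + 1, 16)) * 0.5
W2 = rng.normal(size=16) * 0.5
print(occupancy(np.zeros(3), image_feat, body_verts, W1, W2))
```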