Orientation-conditioned Facial Texture Mapping for Video-based Facial Remote Photoplethysmography Estimation
- URL: http://arxiv.org/abs/2404.09378v3
- Date: Wed, 1 May 2024 05:18:22 GMT
- Title: Orientation-conditioned Facial Texture Mapping for Video-based Facial Remote Photoplethysmography Estimation
- Authors: Sam Cantrill, David Ahmedt-Aristizabal, Lars Petersson, Hanna Suominen, Mohammad Ali Armin,
- Abstract summary: We leverage the 3D facial surface to construct a novel orientation-conditioned video representation.
Our proposed method achieves a significant 18.2% performance improvement in cross-dataset testing on MMPD.
We demonstrate significant performance improvements of up to 29.6% in all tested motion scenarios.
- Score: 23.199005573530194
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Camera-based remote photoplethysmography (rPPG) enables contactless measurement of important physiological signals such as pulse rate (PR). However, dynamic and unconstrained subject motion introduces significant variability into the facial appearance in video, confounding the ability of video-based methods to accurately extract the rPPG signal. In this study, we leverage the 3D facial surface to construct a novel orientation-conditioned facial texture video representation which improves the motion robustness of existing video-based facial rPPG estimation methods. Our proposed method achieves a significant 18.2% performance improvement in cross-dataset testing on MMPD over our baseline using the PhysNet model trained on PURE, highlighting the efficacy and generalization benefits of our designed video representation. We demonstrate significant performance improvements of up to 29.6% in all tested motion scenarios in cross-dataset testing on MMPD, even in the presence of dynamic and unconstrained subject motion, emphasizing the benefits of disentangling motion through modeling the 3D facial surface for motion robust facial rPPG estimation. We validate the efficacy of our design decisions and the impact of different video processing steps through an ablation study. Our findings illustrate the potential strengths of exploiting the 3D facial surface as a general strategy for addressing dynamic and unconstrained subject motion in videos. The code is available at https://samcantrill.github.io/orientation-uv-rppg/.
Related papers
- G3FA: Geometry-guided GAN for Face Animation [14.488117084637631]
We introduce Geometry-guided GAN for Face Animation (G3FA) to tackle this limitation.
Our novel approach empowers the face animation model to incorporate 3D information using only 2D images.
In our face reenactment model, we leverage 2D motion warping to capture motion dynamics.
arXiv Detail & Related papers (2024-08-23T13:13:24Z) - Decaf: Monocular Deformation Capture for Face and Hand Interactions [77.75726740605748]
This paper introduces the first method that allows tracking human hands interacting with human faces in 3D from single monocular RGB videos.
We model hands as articulated objects inducing non-rigid face deformations during an active interaction.
Our method relies on a new hand-face motion and interaction capture dataset with realistic face deformations acquired with a markerless multi-view camera system.
arXiv Detail & Related papers (2023-09-28T17:59:51Z) - Neural Point-based Volumetric Avatar: Surface-guided Neural Points for
Efficient and Photorealistic Volumetric Head Avatar [62.87222308616711]
We propose fullname (name), a method that adopts the neural point representation and the neural volume rendering process.
Specifically, the neural points are strategically constrained around the surface of the target expression via a high-resolution UV displacement map.
By design, our name is better equipped to handle topologically changing regions and thin structures while also ensuring accurate expression control when animating avatars.
arXiv Detail & Related papers (2023-07-11T03:40:10Z) - Motion Matters: Neural Motion Transfer for Better Camera Physiological
Measurement [25.27559386977351]
Body motion is one of the most significant sources of noise when attempting to recover the subtle cardiac pulse from a video.
We adapt a neural video synthesis approach to augment videos for the task of remote photoplethys.
We demonstrate a 47% improvement over existing inter-dataset results using various state-of-the-art methods.
arXiv Detail & Related papers (2023-03-21T17:51:23Z) - Pose-Controllable 3D Facial Animation Synthesis using Hierarchical
Audio-Vertex Attention [52.63080543011595]
A novel pose-controllable 3D facial animation synthesis method is proposed by utilizing hierarchical audio-vertex attention.
The proposed method can produce more realistic facial expressions and head posture movements.
arXiv Detail & Related papers (2023-02-24T09:36:31Z) - Video-driven Neural Physically-based Facial Asset for Production [33.24654834163312]
We present a new learning-based, video-driven approach for generating dynamic facial geometries with high-quality physically-based assets.
Our technique provides higher accuracy and visual fidelity than previous video-driven facial reconstruction and animation methods.
arXiv Detail & Related papers (2022-02-11T13:22:48Z) - Image-to-Video Generation via 3D Facial Dynamics [78.01476554323179]
We present a versatile model, FaceAnime, for various video generation tasks from still images.
Our model is versatile for various AR/VR and entertainment applications, such as face video and face video prediction.
arXiv Detail & Related papers (2021-05-31T02:30:11Z) - DeepFaceFlow: In-the-wild Dense 3D Facial Motion Estimation [56.56575063461169]
DeepFaceFlow is a robust, fast, and highly-accurate framework for the estimation of 3D non-rigid facial flow.
Our framework was trained and tested on two very large-scale facial video datasets.
Given registered pairs of images, our framework generates 3D flow maps at 60 fps.
arXiv Detail & Related papers (2020-05-14T23:56:48Z) - Real-time Facial Expression Recognition "In The Wild'' by Disentangling
3D Expression from Identity [6.974241731162878]
This paper proposes a novel method for human emotion recognition from a single RGB image.
We construct a large-scale dataset of facial videos, rich in facial dynamics, identities, expressions, appearance and 3D pose variations.
Our proposed framework runs at 50 frames per second and is capable of robustly estimating parameters of 3D expression variation.
arXiv Detail & Related papers (2020-05-12T01:32:55Z) - It's Written All Over Your Face: Full-Face Appearance-Based Gaze
Estimation [82.16380486281108]
We propose an appearance-based method that only takes the full face image as input.
Our method encodes the face image using a convolutional neural network with spatial weights applied on the feature maps.
We show that our full-face method significantly outperforms the state of the art for both 2D and 3D gaze estimation.
arXiv Detail & Related papers (2016-11-27T15:00:10Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.