Related papers: Facial Spatiotemporal Graphs: Leveraging the 3D Facial Surface for Remote Physiological Measurement

URL: http://arxiv.org/abs/2601.13724v1
Date: Tue, 20 Jan 2026 08:29:44 GMT
Title: Facial Spatiotemporal Graphs: Leveraging the 3D Facial Surface for Remote Physiological Measurement
Authors: Sam Cantrill, David Ahmedt-Aristizabal, Lars Petersson, Hanna Suominen, Mohammad Ali Armin
Abstract summary: Facial remote photoplethysmography (rPPG) methods estimate physiological signals by modeling subtle color changes on the 3D facial surface over time. Existing methods fail to explicitly align their receptive fields with the 3D facial surface, the spatial support of the rPPG signal. We propose the Facial Spatiotemporal Graph (STGraph), a representation that encodes facial color and structure using 3D facial mesh sequences. We introduce MeshPhys, a lightweight spatiotemporal graph convolutional network that operates on the STGraph to estimate physiological signals.
Score: 20.67961570985004
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Facial remote photoplethysmography (rPPG) methods estimate physiological signals by modeling subtle color changes on the 3D facial surface over time. However, existing methods fail to explicitly align their receptive fields with the 3D facial surface, the spatial support of the rPPG signal. To address this, we propose the Facial Spatiotemporal Graph (STGraph), a novel representation that encodes facial color and structure using 3D facial mesh sequences, enabling surface-aligned spatiotemporal processing. We introduce MeshPhys, a lightweight spatiotemporal graph convolutional network that operates on the STGraph to estimate physiological signals. Across four benchmark datasets, MeshPhys achieves state-of-the-art or competitive performance in both intra- and cross-dataset settings. Ablation studies show that constraining the model's receptive field to the facial surface acts as a strong structural prior, and that surface-aligned, 3D-aware node features are critical for robustly encoding facial surface color. Together, the STGraph and MeshPhys constitute a novel, principled modeling paradigm for facial rPPG, enabling robust, interpretable, and generalizable estimation. Code is available at https://samcantrill.github.io/facial-stgraph-rppg/.
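Illustrative sketch (not the authors' code): the abstract describes a graph built from 3D facial mesh sequences. One generic way to form such a spatiotemporal graph, assuming a mesh with fixed topology across frames, is to reuse the triangle connectivity within each frame and link each vertex to itself in the next frame. The function name `build_st_edges`, the flat node indexing, and the choice of temporal links are all assumptions made for illustration; the paper's actual STGraph construction may differ.

```python
from itertools import combinations


def build_st_edges(faces, num_vertices, num_frames):
    """Return a sorted list of undirected edges over (frame, vertex) nodes.

    Hypothetical helper: node (t, v) gets the flat index t * num_vertices + v.
    """
    # Spatial edges: every pair of vertices that share a triangle.
    spatial = set()
    for tri in faces:
        for a, b in combinations(sorted(tri), 2):
            spatial.add((a, b))

    edges = set()
    for t in range(num_frames):
        offset = t * num_vertices
        # Copy the mesh connectivity into frame t.
        for a, b in spatial:
            edges.add((offset + a, offset + b))
        # Temporal edges (assumed): link each vertex to itself in the next frame.
        if t + 1 < num_frames:
            for v in range(num_vertices):
                edges.add((offset + v, offset + num_vertices + v))
    return sorted(edges)


# Toy example: two triangles sharing an edge (4 vertices), tracked over 3 frames.
faces = [(0, 1, 2), (1, 2, 3)]
edges = build_st_edges(faces, num_vertices=4, num_frames=3)
print(len(edges))  # 5 spatial edges x 3 frames + 4 temporal edges x 2 = 23
```

In a sketch like this, per-node features (for example, sampled skin color and 3D position) would be stored in a separate array indexed by the same flat node index.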

Related papers

Thin-Shell-SfT: Fine-Grained Monocular Non-rigid 3D Surface Tracking with Neural Deformation Fields [66.1612475655465]
3D reconstruction of deformable surfaces from RGB videos is a challenging problem. Existing methods use deformation models with statistical, neural, or physical priors. We propose Thin-Shell-SfT, a new method for non-rigid 3D tracking of meshes.
arXiv Detail & Related papers (2025-03-25T18:00:46Z)
Flatten Anything: Unsupervised Neural Surface Parameterization [76.4422287292541]
We introduce the Flatten Anything Model (FAM), an unsupervised neural architecture to achieve global free-boundary surface parameterization. Compared with previous methods, our FAM directly operates on discrete surface points without utilizing connectivity information. Our FAM is fully-automated without the need for pre-cutting and can deal with highly-complex topologies.
arXiv Detail & Related papers (2024-05-23T14:39:52Z)
Orientation-conditioned Facial Texture Mapping for Video-based Facial Remote Photoplethysmography Estimation [23.199005573530194]
We leverage the 3D facial surface to construct a novel orientation-conditioned video representation. Our proposed method achieves a significant 18.2% performance improvement in cross-dataset testing on MMPD. We demonstrate significant performance improvements of up to 29.6% in all tested motion scenarios.
arXiv Detail & Related papers (2024-04-14T23:30:35Z)
Self-supervised Pre-training with Masked Shape Prediction for 3D Scene Understanding [106.0876425365599]
Masked Shape Prediction (MSP) is a new framework to conduct masked signal modeling in 3D scenes. MSP uses the essential 3D semantic cue, i.e., geometric shape, as the prediction target for masked points.
arXiv Detail & Related papers (2023-05-08T20:09:19Z)
Learning Neural Implicit Representations with Surface Signal Parameterizations [14.835882967340968]
We present a neural network architecture that implicitly encodes the underlying surface parameterization suitable for appearance data. Our model remains compatible with existing mesh-based digital content with appearance data.
arXiv Detail & Related papers (2022-11-01T15:10:58Z)
MOST-GAN: 3D Morphable StyleGAN for Disentangled Face Image Manipulation [69.35523133292389]
We propose a framework that a priori models physical attributes of the face explicitly, thus providing disentanglement by design. Our method, MOST-GAN, integrates the expressive power and photorealism of style-based GANs with the physical disentanglement and flexibility of nonlinear 3D morphable models. It achieves photorealistic manipulation of portrait images with fully disentangled 3D control over their physical attributes, enabling extreme manipulation of lighting, facial expression, and pose variations up to full profile view.
arXiv Detail & Related papers (2021-11-01T15:53:36Z)
Topologically Consistent Multi-View Face Inference Using Volumetric Sampling [25.001398662643986]
ToFu is a geometry inference framework that can produce topologically consistent meshes across identities and expressions. A novel progressive mesh generation network embeds the topological structure of the face in a feature volume. These high-quality assets are readily usable by production studios for avatar creation, animation and physically-based skin rendering.
arXiv Detail & Related papers (2021-10-06T17:55:08Z)
Face-GCN: A Graph Convolutional Network for 3D Dynamic Face Identification/Recognition [21.116748155592752]
We propose a novel framework for dynamic 3D face identification/recognition based on facial keypoints. Each dynamic sequence of facial expressions is represented as a spatio-temporal graph, which is constructed using 3D facial landmarks. We evaluate our approach on a challenging dynamic 3D facial expression dataset.
arXiv Detail & Related papers (2021-04-19T09:05:39Z)
Pix2Surf: Learning Parametric 3D Surface Models of Objects from Images [64.53227129573293]
We investigate the problem of learning to generate 3D parametric surface representations for novel object instances, as seen from one or more views. We design neural networks capable of generating high-quality parametric 3D surfaces which are consistent between views. Our method is supervised and trained on a public dataset of shapes from common object categories.
arXiv Detail & Related papers (2020-08-18T06:33:40Z)
DeepFaceFlow: In-the-wild Dense 3D Facial Motion Estimation [56.56575063461169]
DeepFaceFlow is a robust, fast, and highly-accurate framework for the estimation of 3D non-rigid facial flow. Our framework was trained and tested on two very large-scale facial video datasets. Given registered pairs of images, our framework generates 3D flow maps at 60 fps.
arXiv Detail & Related papers (2020-05-14T23:56:48Z)

This list is automatically generated from the titles and abstracts of the papers on this site.