Is It Really You? Exploring Biometric Verification Scenarios in Photorealistic Talking-Head Avatar Videos
- URL: http://arxiv.org/abs/2508.00748v2
- Date: Mon, 04 Aug 2025 12:27:33 GMT
- Title: Is It Really You? Exploring Biometric Verification Scenarios in Photorealistic Talking-Head Avatar Videos
- Authors: Laura Pedrouzo-Rodriguez, Pedro Delgado-DeRobles, Luis F. Gomez, Ruben Tolosana, Ruben Vera-Rodriguez, Aythami Morales, Julian Fierrez
- Abstract summary: An attacker can steal a user's avatar, preserving their appearance and voice, making it nearly impossible to detect its usage by sight or sound alone. Our main question is whether an individual's facial motion patterns can serve as reliable behavioral biometrics to verify their identity when the avatar's visual appearance is a facsimile of its owner. Experimental results demonstrate that facial motion landmarks enable meaningful identity verification, with AUC values approaching 80%.
- Score: 12.12643642515884
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Photorealistic talking-head avatars are becoming increasingly common in virtual meetings, gaming, and social platforms. These avatars allow for more immersive communication, but they also introduce serious security risks. One emerging threat is impersonation: an attacker can steal a user's avatar, preserving their appearance and voice, making it nearly impossible to detect its fraudulent usage by sight or sound alone. In this paper, we explore the challenge of biometric verification in such avatar-mediated scenarios. Our main question is whether an individual's facial motion patterns can serve as reliable behavioral biometrics to verify their identity when the avatar's visual appearance is a facsimile of its owner. To answer this question, we introduce a new dataset of realistic avatar videos, containing genuine and impostor avatar videos, created using a state-of-the-art one-shot avatar generation model, GAGAvatar. We also propose a lightweight, explainable spatio-temporal Graph Convolutional Network architecture with temporal attention pooling that uses only facial landmarks to model dynamic facial gestures. Experimental results demonstrate that facial motion cues enable meaningful identity verification, with AUC values approaching 80%. The proposed benchmark and biometric system are made available to the research community to draw attention to the urgent need for more advanced behavioral biometric defenses in avatar-based communication systems.
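The abstract describes the verification model only at a high level, so below is a minimal sketch of what a landmark-based spatio-temporal Graph Convolutional Network with temporal attention pooling might look like. The layer sizes, the learnable landmark adjacency, the 68-landmark input, and the cosine-similarity scoring are illustrative assumptions, not the authors' released implementation.

```python
# Minimal sketch (not the paper's code): a spatio-temporal GCN over facial
# landmark trajectories with temporal attention pooling for verification.
import torch
import torch.nn as nn
import torch.nn.functional as F

class LandmarkGCNVerifier(nn.Module):
    def __init__(self, num_landmarks=68, in_dim=2, hidden=64, embed=128):
        super().__init__()
        # Learnable landmark adjacency (an assumption; a fixed facial-mesh
        # adjacency would also be plausible), normalized on use.
        self.adj = nn.Parameter(torch.eye(num_landmarks))
        self.gc1 = nn.Linear(in_dim, hidden)   # graph conv: A @ X @ W
        self.gc2 = nn.Linear(hidden, hidden)
        self.attn = nn.Linear(hidden, 1)       # per-frame attention scores
        self.head = nn.Linear(hidden, embed)

    def forward(self, x):
        # x: (batch, frames, landmarks, in_dim) landmark trajectories
        a = F.softmax(self.adj, dim=-1)                # normalized adjacency
        h = F.relu(a @ self.gc1(x))                    # spatial message passing
        h = F.relu(a @ self.gc2(h))
        h = h.mean(dim=2)                              # pool landmarks -> (B, T, H)
        w = F.softmax(self.attn(h), dim=1)             # attention over frames
        pooled = (w * h).sum(dim=1)                    # temporal attention pooling
        return F.normalize(self.head(pooled), dim=-1)  # unit-norm embedding

# Verification: cosine similarity between enrollment and probe clips;
# the reported AUC would be computed over such genuine/impostor scores.
model = LandmarkGCNVerifier()
enroll = model(torch.randn(1, 50, 68, 2))  # 50-frame enrollment clip
probe = model(torch.randn(1, 50, 68, 2))   # 50-frame probe clip
score = (enroll * probe).sum(-1)           # higher score -> same identity
```

Training such a model would typically use a contrastive or classification objective over genuine and impostor clips; none of these choices are confirmed by the abstract.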
Related papers
- Towards Privacy-preserving Photorealistic Self-avatars in Mixed Reality [8.591721920594441]
Photorealistic 3D avatar generation has rapidly improved in recent years, and realistic avatars that match a user's true appearance are more feasible in Mixed Reality (MR) than ever before.
Yet, there are known risks to sharing one's likeness online, and photorealistic MR avatars could exacerbate these risks.
We propose an alternate avatar rendering scheme for broader social MR -- synthesizing realistic avatars that preserve a user's demographic identity while being distinct enough from the individual user to protect facial biometric information.
arXiv Detail & Related papers (2025-07-29T18:37:24Z)
- FacialMotionID: Identifying Users of Mixed Reality Headsets using Abstract Facial Motion Representations [2.9136421025415213]
Facial motion capture in mixed reality headsets enables real-time avatar animation, allowing users to convey non-verbal cues during virtual interactions.
As facial motion data constitutes a behavioral biometric, its use raises novel privacy concerns.
We conducted a study with 116 participants using three types of headsets across three sessions, collecting facial, eye, and head motion data during verbal and non-verbal tasks.
Our analysis shows that individuals can be re-identified from this data with up to 98% balanced accuracy, are even identifiable across device types, and that emotional states can be inferred with up to 86% accuracy.
arXiv Detail & Related papers (2025-07-15T09:40:49Z)
- SmartAvatar: Text- and Image-Guided Human Avatar Generation with VLM AI Agents [91.26239311240873]
SmartAvatar is a vision-language-agent-driven framework for generating fully rigged, animation-ready 3D human avatars.
A key innovation is an autonomous verification loop, where the agent renders draft avatars.
The generated avatars are fully rigged and support pose manipulation with consistent identity and appearance.
arXiv Detail & Related papers (2025-06-05T03:49:01Z)
- EVA: Expressive Virtual Avatars from Multi-view Videos [51.33851869426057]
We introduce Expressive Virtual Avatars (EVA), an actor-specific, fully controllable, and expressive human avatar framework.
EVA achieves high-fidelity, lifelike renderings in real time while enabling independent control of facial expressions, body movements, and hand gestures.
This work represents a significant advancement towards fully drivable digital human models.
arXiv Detail & Related papers (2025-05-21T11:22:52Z) - A multidimensional measurement of photorealistic avatar quality of experience [14.94879852506943]
Photo avatars are human avatars that look, move, and talk like real people.
We provide an open source test framework to subjectively measure avatar performance in ten dimensions.
We find that for avatars above a certain level of realism, eight of these measured dimensions are strongly correlated.
arXiv Detail & Related papers (2024-11-13T22:47:24Z) - Traceable AI-driven Avatars Using Multi-factors of Physical World and Metaverse [7.436039179584676]
The metaverse allows users to delegate their AI models to an AI engine, which builds corresponding AI-driven avatars.
In this paper, we propose a multi-factor authentication method to guarantee the traceability of AI-driven avatars.
arXiv Detail & Related papers (2024-08-30T09:04:11Z) - Avatar Fingerprinting for Authorized Use of Synthetic Talking-Head Videos [16.717930760451996]
We term this task avatar fingerprinting.
We first introduce a large-scale dataset of real and synthetic videos of people interacting on a video call.
We verify the identity driving the expressions in a synthetic video by learning motion signatures that are independent of the facial appearance shown.
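The summary does not spell out how appearance independence is achieved; one plausible formulation is a triplet-style objective over motion embeddings, as in the hypothetical sketch below (the function, margin, and cosine distance are assumptions, not the paper's actual loss).

```python
# Hypothetical sketch: pull together motion embeddings of clips driven by the
# same person (even when rendered with different faces), push apart clips
# driven by different people. Not the paper's actual objective.
import torch.nn.functional as F

def motion_signature_loss(anchor, positive, negative, margin=0.2):
    # anchor, positive: embeddings of clips driven by the SAME identity,
    # possibly rendered with different facial appearances
    # negative: embedding of a clip driven by a DIFFERENT identity
    d_pos = 1 - F.cosine_similarity(anchor, positive)  # small if same driver
    d_neg = 1 - F.cosine_similarity(anchor, negative)  # large if different
    return F.relu(d_pos - d_neg + margin).mean()       # triplet-style hinge
```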
arXiv Detail & Related papers (2023-05-05T17:54:34Z)
- OTAvatar: One-shot Talking Face Avatar with Controllable Tri-plane Rendering [81.55960827071661]
Controllability, generalizability, and efficiency are the major objectives in constructing face avatars represented by neural implicit fields.
We propose One-shot Talking face Avatar (OTAvatar), which constructs face avatars via a generalized controllable tri-plane rendering solution.
arXiv Detail & Related papers (2023-03-26T09:12:03Z)
- StylePeople: A Generative Model of Fullbody Human Avatars [59.42166744151461]
We propose a new type of full-body human avatar that combines a parametric mesh-based body model with a neural texture.
We show that such avatars can successfully model clothing and hair, which usually poses a problem for mesh-based approaches.
We then propose a generative model for such avatars that can be trained from datasets of images and videos of people.
arXiv Detail & Related papers (2021-04-16T20:43:11Z)
- High-fidelity Face Tracking for AR/VR via Deep Lighting Adaptation [117.32310997522394]
3D video avatars can empower virtual communications by providing compression, privacy, entertainment, and a sense of presence in AR/VR.
Existing person-specific 3D models are not robust to lighting; as a result, they typically miss subtle facial behaviors and produce artifacts in the avatar.
This paper addresses these limitations by learning a deep learning lighting model that, in combination with a high-quality 3D face tracking algorithm, enables subtle and robust facial motion transfer from a regular video to a 3D photo-realistic avatar.
arXiv Detail & Related papers (2021-03-29T18:33:49Z)
- Towards Face Encryption by Generating Adversarial Identity Masks [53.82211571716117]
We propose a targeted identity-protection iterative method (TIP-IM) to generate adversarial identity masks.
TIP-IM achieves a protection success rate of over 95% against various state-of-the-art face recognition models; a generic sketch of this style of iterative attack follows this list.
arXiv Detail & Related papers (2020-03-15T12:45:10Z)
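TIP-IM's exact formulation is not reproduced in this summary, so the following is a hedged, generic sketch of the iterative (PGD-style) recipe that adversarial identity-mask methods build on; the function name, step size, perturbation budget, and cosine objective are illustrative assumptions, not the authors' algorithm.

```python
# Generic iterative identity-mask sketch (illustrative only; TIP-IM itself
# adds target selection and naturalness constraints not shown here).
import torch
import torch.nn.functional as F

def identity_mask(image, embed_fn, target_emb, steps=10, eps=8/255, alpha=2/255):
    # image: (1, 3, H, W) pixels in [0, 1]; embed_fn: any face-embedding model;
    # target_emb: embedding of a (consented) target identity to hide behind
    delta = torch.zeros_like(image, requires_grad=True)
    for _ in range(steps):
        emb = embed_fn(image + delta)
        # Pull the protected face toward the target identity's embedding
        loss = (1 - F.cosine_similarity(emb, target_emb)).mean()
        loss.backward()
        with torch.no_grad():
            delta -= alpha * delta.grad.sign()                # descent step
            delta.clamp_(-eps, eps)                           # L_inf budget
            delta.copy_((image + delta).clamp(0, 1) - image)  # valid pixels
        delta.grad.zero_()
    return (image + delta).detach()
```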