Related papers: A multidimensional measurement of photorealistic avatar quality of experience

A multidimensional measurement of photorealistic avatar quality of experience

URL: http://arxiv.org/abs/2411.09066v1
Date: Wed, 13 Nov 2024 22:47:24 GMT
Title: A multidimensional measurement of photorealistic avatar quality of experience
Authors: Ross Cutler, Babak Naderi, Vishak Gopal, Dharmendar Palle,
Abstract summary: Photorealistic avatars are human avatars that look, move, and talk like real people. We provide an open source test framework to subjectively measure photorealistic avatar performance in ten dimensions. We show that the correlation of nine of these subjective metrics with PSNR, SSIM, LPIPS, FID, and FVD is weak, and moderate for emotion accuracy.
Score: 14.94879852506943
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Photorealistic avatars are human avatars that look, move, and talk like real people. The performance of photorealistic avatars has significantly improved recently based on objective metrics such as PSNR, SSIM, LPIPS, FID, and FVD. However, recent photorealistic avatar publications do not provide subjective tests of the avatars to measure human usability factors. We provide an open source test framework to subjectively measure photorealistic avatar performance in ten dimensions: realism, trust, comfortableness using, comfortableness interacting with, appropriateness for work, creepiness, formality, affinity, resemblance to the person, and emotion accuracy. We show that the correlation of nine of these subjective metrics with PSNR, SSIM, LPIPS, FID, and FVD is weak, and moderate for emotion accuracy. The crowdsourced subjective test framework is highly reproducible and accurate when compared to a panel of experts. We analyze a wide range of avatars from photorealistic to cartoon-like and show that some photorealistic avatars are approaching real video performance based on these dimensions. We also find that for avatars above a certain level of realism, eight of these measured dimensions are strongly correlated. In particular, for photorealistic avatars there is a linear relationship between avatar affinity and realism; in other words, there is no uncanny valley effect for photorealistic avatars in the telecommunication scenario. We provide several extensions of this test framework for future work and discuss design implications for telecommunication systems. The test framework is available at https://github.com/microsoft/P.910.

Related papers

Is It Really You? Exploring Biometric Verification Scenarios in Photorealistic Talking-Head Avatar Videos [12.12643642515884]
An attacker can steal a user's avatar, preserving his appearance and voice, making it nearly impossible to detect its usage by sight or sound alone.<n>Our main question is whether an individual's facial motion patterns can serve as reliable behavioral biometrics to verify their identity when the avatar's visual appearance is a facsimile of its owner.<n> Experimental results demonstrate that facial motion landmarks enable meaningful identity verification with AUC values approaching 80%.
arXiv Detail & Related papers (2025-08-01T16:23:27Z)
Towards Privacy-preserving Photorealistic Self-avatars in Mixed Reality [8.591721920594441]
Photorealistic 3D avatar generation has rapidly improved in recent years, and realistic avatars that match a user's true appearance are more feasible in Mixed Reality (MR) than ever before.<n>Yet, there are known risks to sharing one's likeness online, and photorealistic MR avatars could exacerbate these risks.<n>We propose an alternate avatar rendering scheme for broader social MR -- synthesizing realistic avatars that preserve a user's demographic identity while being distinct enough from the individual user to protect facial biometric information.
arXiv Detail & Related papers (2025-07-29T18:37:24Z)
SmartAvatar: Text- and Image-Guided Human Avatar Generation with VLM AI Agents [91.26239311240873]
SmartAvatar is a vision-language-agent-driven framework for generating fully rigged, animation-ready 3D human avatars.<n>A key innovation is an autonomous verification loop, where the agent renders draft avatars.<n>The generated avatars are fully rigged and support pose manipulation with consistent identity and appearance.
arXiv Detail & Related papers (2025-06-05T03:49:01Z)
EVA: Expressive Virtual Avatars from Multi-view Videos [51.33851869426057]
We introduce Expressive Virtual Avatars (EVA), an actor-specific, fully controllable, and expressive human avatar framework.<n>EVA achieves high-fidelity, lifelike renderings in real time while enabling independent control of facial expressions, body movements, and hand gestures.<n>This work represents a significant advancement towards fully drivable digital human models.
arXiv Detail & Related papers (2025-05-21T11:22:52Z)
EgoAvatar: Egocentric View-Driven and Photorealistic Full-body Avatars [56.56236652774294]
We propose a person-specific egocentric telepresence approach, which jointly models the photoreal digital avatar while also driving it from a single egocentric video. Our experiments demonstrate a clear step towards egocentric and photoreal telepresence as our method outperforms baselines as well as competing methods.
arXiv Detail & Related papers (2024-09-22T22:50:27Z)
AvatarStudio: High-fidelity and Animatable 3D Avatar Creation from Text [71.09533176800707]
AvatarStudio is a coarse-to-fine generative model that generates explicit textured 3D meshes for animatable human avatars. By effectively leveraging the synergy between the articulated mesh representation and the DensePose-conditional diffusion model, AvatarStudio can create high-quality avatars.
arXiv Detail & Related papers (2023-11-29T18:59:32Z)
Physics-based Motion Retargeting from Sparse Inputs [73.94570049637717]
Commercial AR/VR products consist only of a headset and controllers, providing very limited sensor data of the user's pose. We introduce a method to retarget motions in real-time from sparse human sensor data to characters of various morphologies. We show that the avatar poses often match the user surprisingly well, despite having no sensor information of the lower body available.
arXiv Detail & Related papers (2023-07-04T21:57:05Z)
DreamWaltz: Make a Scene with Complex 3D Animatable Avatars [68.49935994384047]
We present DreamWaltz, a novel framework for generating and animating complex 3D avatars given text guidance and parametric human body prior. For animation, our method learns an animatable 3D avatar representation from abundant image priors of diffusion model conditioned on various poses.
arXiv Detail & Related papers (2023-05-21T17:59:39Z)
The Work Avatar Face-Off: Knowledge Worker Preferences for Realism in Meetings [0.0]
Our survey of 2509 knowledge workers from multiple countries rated five avatar styles for use by managers, known colleagues and unknown colleagues. In all scenarios, participants favored higher realism, but fully realistic avatars were sometimes perceived as uncanny. Less realistic avatars were rated worse when interacting with an unknown colleague or manager, as compared to a known colleague.
arXiv Detail & Related papers (2023-04-03T22:43:20Z)
SwiftAvatar: Efficient Auto-Creation of Parameterized Stylized Character on Arbitrary Avatar Engines [34.645129752596915]
We propose SwiftAvatar, a novel avatar auto-creation framework. We synthesize data in high-quality as many as possible, consisting of avatar vectors and their corresponding realistic faces. Our experiments demonstrate the effectiveness and efficiency of SwiftAvatar on two different avatar engines.
arXiv Detail & Related papers (2023-01-19T16:14:28Z)
AgileAvatar: Stylized 3D Avatar Creation via Cascaded Domain Bridging [12.535634029277212]
We propose a novel self-supervised learning framework to create high-quality stylized 3D avatars. Our results achieve much higher preference scores than previous work and close to those of manual creation.
arXiv Detail & Related papers (2022-11-15T00:43:45Z)
Dressing Avatars: Deep Photorealistic Appearance for Physically Simulated Clothing [49.96406805006839]
We introduce pose-driven avatars with explicit modeling of clothing that exhibit both realistic clothing dynamics and photorealistic appearance learned from real-world data. Our key contribution is a physically-inspired appearance network, capable of generating photorealistic appearance with view-dependent and dynamic shadowing effects even for unseen body-clothing configurations.
arXiv Detail & Related papers (2022-06-30T17:58:20Z)
Robust Egocentric Photo-realistic Facial Expression Transfer for Virtual Reality [68.18446501943585]
Social presence will fuel the next generation of communication systems driven by digital humans in virtual reality (VR) The best 3D video-realistic VR avatars that minimize the uncanny effect rely on person-specific (PS) models. This paper makes progress in overcoming these limitations by proposing an end-to-end multi-identity architecture.
arXiv Detail & Related papers (2021-04-10T15:48:53Z)
High-fidelity Face Tracking for AR/VR via Deep Lighting Adaptation [117.32310997522394]
3D video avatars can empower virtual communications by providing compression, privacy, entertainment, and a sense of presence in AR/VR. Existing person-specific 3D models are not robust to lighting, hence their results typically miss subtle facial behaviors and cause artifacts in the avatar. This paper addresses previous limitations by learning a deep learning lighting model, that in combination with a high-quality 3D face tracking algorithm, provides a method for subtle and robust facial motion transfer from a regular video to a 3D photo-realistic avatar.
arXiv Detail & Related papers (2021-03-29T18:33:49Z)

This list is automatically generated from the titles and abstracts of the papers in this site.