xTrace: A Facial Expressive Behaviour Analysis Tool for Continuous Affect Recognition
- URL: http://arxiv.org/abs/2505.05043v1
- Date: Thu, 08 May 2025 08:27:37 GMT
- Title: xTrace: A Facial Expressive Behaviour Analysis Tool for Continuous Affect Recognition
- Authors: Mani Kumar Tellamekala, Shashank Jaiswal, Thomas Smith, Timur Alamev, Gary McKeown, Anthony Brown, Michel Valstar
- Abstract summary: Recognising expressive behaviours in face videos is a long-standing challenge in Affective Computing. This paper addresses two key challenges in building a system for naturalistic and in-the-wild facial expressive behaviour analysis. We introduce xTrace, a robust tool for facial expressive behaviour analysis.
- Score: 1.8804634388685453
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Recognising expressive behaviours in face videos is a long-standing challenge in Affective Computing. Despite significant advancements in recent years, it remains a challenge to build a robust and reliable system for naturalistic and in-the-wild facial expressive behaviour analysis in real time. This paper addresses two key challenges in building such a system: (1) the paucity of large-scale labelled facial affect video datasets with extensive coverage of the 2D emotion space, and (2) the difficulty of extracting facial video features that are discriminative, interpretable, robust, and computationally efficient. Toward addressing these challenges, we introduce xTrace, a robust tool for facial expressive behaviour analysis that predicts continuous values of dimensional emotions, namely valence and arousal, from in-the-wild face videos. To address challenge (1), our affect recognition model is trained on the largest facial affect video dataset, containing ~450k videos that cover most emotion zones in the dimensional emotion space, making xTrace highly versatile in analysing a wide spectrum of naturalistic expressive behaviours. To address challenge (2), xTrace uses facial affect descriptors that are not only explainable, but can also achieve a high degree of accuracy and robustness with low computational complexity. The key components of xTrace are benchmarked against three existing tools: MediaPipe, OpenFace, and Augsburg Affect Toolbox. On an in-the-wild validation set composed of 50k videos, xTrace achieves a mean CCC of 0.86 and a mean absolute error of 0.13. We present a detailed error analysis of affect predictions from xTrace, illustrating (a) its ability to recognise emotions with high accuracy across most bins in the 2D emotion space, (b) its robustness to non-frontal head pose angles, and (c) a strong correlation between its uncertainty estimates and its accuracy.
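The headline numbers above are a mean Concordance Correlation Coefficient (CCC) of 0.86 and a mean absolute error of 0.13 over the valence and arousal dimensions. As a reference for how such scores are typically computed, here is a minimal NumPy sketch; the function and variable names are illustrative and not taken from the xTrace codebase.

```python
import numpy as np

def ccc(pred: np.ndarray, target: np.ndarray) -> float:
    """Concordance Correlation Coefficient between two 1-D series."""
    mu_p, mu_t = pred.mean(), target.mean()
    var_p, var_t = pred.var(), target.var()
    cov = np.mean((pred - mu_p) * (target - mu_t))
    return float(2.0 * cov / (var_p + var_t + (mu_p - mu_t) ** 2))

def evaluate_va(pred_va: np.ndarray, target_va: np.ndarray) -> dict:
    """Mean CCC and MAE over valence/arousal arrays of shape (n_samples, 2)."""
    mean_ccc = np.mean([ccc(pred_va[:, d], target_va[:, d]) for d in range(2)])
    mae = np.abs(pred_va - target_va).mean()
    return {"mean_ccc": float(mean_ccc), "mae": float(mae)}
```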
Related papers
- Facial Landmark Visualization and Emotion Recognition Through Neural Networks [0.0]
Emotion recognition from facial images is a crucial task in human-computer interaction. Previous studies have shown that facial images can be used to train deep learning models. We propose facial landmark box plots, a visualization technique designed to identify outliers in facial datasets.
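The summary only names "facial landmark box plots" as the visualization; the sketch below shows one plausible reading, the standard interquartile-range (box-plot whisker) rule applied to landmark coordinates to flag outlier samples. It is an assumption for illustration, not the paper's implementation.

```python
import numpy as np

def boxplot_outliers(landmarks: np.ndarray, k: float = 1.5) -> np.ndarray:
    """Flag samples whose landmark coordinates fall outside the box-plot whiskers.

    landmarks: array of shape (n_samples, n_landmarks, 2) holding (x, y) positions.
    Returns a boolean mask of shape (n_samples,); True marks a potential outlier.
    """
    flat = landmarks.reshape(len(landmarks), -1)        # (n_samples, n_landmarks * 2)
    q1, q3 = np.percentile(flat, [25, 75], axis=0)      # per-coordinate quartiles
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return ((flat < low) | (flat > high)).any(axis=1)   # outlier if any coordinate is extreme
```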
arXiv Detail & Related papers (2025-06-20T17:45:34Z) - V-NAW: Video-based Noise-aware Adaptive Weighting for Facial Expression Recognition [9.57248169951292]
The 8th Affective Behavior Analysis in-the-Wild (ABAW) Challenge aims to assess human emotions using the video-based Aff-Wild2 dataset. The challenge comprises several tasks, including the video-based EXPR recognition track, which is our primary focus. We propose Video-based Noise-aware Adaptive Weighting (V-NAW), which adaptively assigns importance to each frame in a clip to address label ambiguity and effectively capture temporal variations in facial expressions.
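The abstract describes V-NAW only at a high level: each frame in a clip receives an adaptive weight so that noisy or ambiguous frames contribute less. The snippet below is a generic illustration of that idea (softmax weighting of per-frame losses), not V-NAW's actual formulation, which the summary does not specify.

```python
import numpy as np

def adaptive_frame_weights(frame_losses: np.ndarray, temperature: float = 1.0) -> np.ndarray:
    """Softmax weighting that down-weights frames with large (likely noisy) losses.

    frame_losses: per-frame loss values for one clip, shape (n_frames,).
    Returns weights summing to 1; lower-loss frames receive higher weight.
    """
    scaled = -frame_losses / temperature
    scaled -= scaled.max()                 # numerical stability
    weights = np.exp(scaled)
    return weights / weights.sum()

def weighted_clip_loss(frame_losses: np.ndarray) -> float:
    """Aggregate per-frame losses into a clip-level loss using the adaptive weights."""
    w = adaptive_frame_weights(frame_losses)
    return float((w * frame_losses).sum())
```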
arXiv Detail & Related papers (2025-03-20T09:13:34Z) - Understanding Long Videos via LLM-Powered Entity Relation Graphs [51.13422967711056]
GraphVideoAgent is a framework that maps and monitors the evolving relationships between visual entities throughout the video sequence. Our approach demonstrates remarkable effectiveness when tested against industry benchmarks.
arXiv Detail & Related papers (2025-01-27T10:57:24Z) - FaceXFormer: A Unified Transformer for Facial Analysis [59.94066615853198]
FaceXFormer is an end-to-end unified transformer model capable of performing ten facial analysis tasks. Tasks include face parsing, landmark detection, head pose estimation, attribute prediction, and age, gender, and race estimation. We train FaceXFormer on ten diverse face perception datasets and evaluate it against both specialized and multi-task models.
arXiv Detail & Related papers (2024-03-19T17:58:04Z) - Affective Behaviour Analysis via Integrating Multi-Modal Knowledge [24.74463315135503]
The 6th competition on Affective Behavior Analysis in-the-wild (ABAW) utilizes the Aff-Wild2, Hume-Vidmimic2, and C-EXPR-DB datasets.
We present our method designs for the five competitive tracks, i.e., Valence-Arousal (VA) Estimation, Expression (EXPR) Recognition, Action Unit (AU) Detection, Compound Expression (CE) Recognition, and Emotional Mimicry Intensity (EMI) Estimation.
arXiv Detail & Related papers (2024-03-16T06:26:43Z) - Leveraging Real Talking Faces via Self-Supervision for Robust Forgery Detection [112.96004727646115]
We develop a method to detect face-manipulated videos using real talking faces.
We show that our method achieves state-of-the-art performance on cross-manipulation generalisation and robustness experiments.
Our results suggest that leveraging natural and unlabelled videos is a promising direction for the development of more robust face forgery detectors.
arXiv Detail & Related papers (2022-01-18T17:14:54Z) - Pre-training strategies and datasets for facial representation learning [58.8289362536262]
We show how to find a universal face representation that can be adapted to several facial analysis tasks and datasets.
We systematically investigate two ways of large-scale representation learning applied to faces: supervised and unsupervised pre-training.
Our two main findings are: Unsupervised pre-training on completely in-the-wild, uncurated data provides consistent and, in some cases, significant accuracy improvements.
arXiv Detail & Related papers (2021-03-30T17:57:25Z) - Affect2MM: Affective Analysis of Multimedia Content Using Emotion Causality [84.69595956853908]
We present Affect2MM, a learning method for time-series emotion prediction for multimedia content.
Our goal is to automatically capture the varying emotions depicted by characters in real-life human-centric situations and behaviors.
arXiv Detail & Related papers (2021-03-11T09:07:25Z) - Continuous Emotion Recognition with Spatiotemporal Convolutional Neural Networks [82.54695985117783]
We investigate the suitability of state-of-the-art deep learning architectures for continuous emotion recognition using long video sequences captured in-the-wild.
We have developed and evaluated convolutional recurrent neural networks combining 2D-CNNs and long short-term memory units, as well as inflated 3D-CNN models, which are built by inflating the weights of a pre-trained 2D-CNN model during fine-tuning.
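The "inflating" step mentioned here is the standard I3D-style initialisation: each pre-trained 2D kernel is repeated along a new temporal axis and rescaled so the 3D filter initially reproduces the 2D response on a static clip. Below is a minimal NumPy sketch of that weight transformation, shown for illustration only and not taken from the paper's code.

```python
import numpy as np

def inflate_conv_weight(w2d: np.ndarray, t: int) -> np.ndarray:
    """Inflate a pre-trained 2D conv kernel into a 3D kernel for temporal modelling.

    w2d: 2D weights of shape (out_ch, in_ch, kH, kW).
    t:   temporal kernel size of the target 3D convolution.
    """
    w3d = np.repeat(w2d[:, :, np.newaxis, :, :], t, axis=2)  # (out_ch, in_ch, t, kH, kW)
    return w3d / float(t)                                     # rescale so responses match on static input
```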
arXiv Detail & Related papers (2020-11-18T13:42:05Z) - Real-time Facial Expression Recognition "In The Wild" by Disentangling 3D Expression from Identity [6.974241731162878]
This paper proposes a novel method for human emotion recognition from a single RGB image.
We construct a large-scale dataset of facial videos, rich in facial dynamics, identities, expressions, appearance and 3D pose variations.
Our proposed framework runs at 50 frames per second and is capable of robustly estimating parameters of 3D expression variation.
arXiv Detail & Related papers (2020-05-12T01:32:55Z)