An Enhanced Adversarial Network with Combined Latent Features for
Spatio-Temporal Facial Affect Estimation in the Wild
- URL: http://arxiv.org/abs/2102.09150v1
- Date: Thu, 18 Feb 2021 04:10:12 GMT
- Title: An Enhanced Adversarial Network with Combined Latent Features for
Spatio-Temporal Facial Affect Estimation in the Wild
- Authors: Decky Aspandi, Federico Sukno, Björn Schuller and Xavier Binefa
- Abstract summary: This paper proposes a novel model that efficiently extracts both spatial and temporal features of the data by means of its enhanced temporal modelling based on latent features.
Our proposed model consists of three major networks, coined Generator, Discriminator, and Combiner, which are trained in an adversarial setting combined with curriculum learning to enable our adaptive attention modules.
- Score: 1.3007851628964147
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Affective Computing has recently attracted the attention of the research
community, due to its numerous applications in diverse areas. In this context,
the emergence of video-based data makes it possible to enrich the widely used spatial
features with the inclusion of temporal information. However, such
spatio-temporal modelling often results in very high-dimensional feature spaces
and large volumes of data, making training difficult and time consuming. This
paper addresses these shortcomings by proposing a novel model that efficiently
extracts both spatial and temporal features of the data by means of its
enhanced temporal modelling based on latent features. Our proposed model
consists of three major networks, coined Generator, Discriminator, and
Combiner, which are trained in an adversarial setting combined with curriculum
learning to enable our adaptive attention modules. In our experiments, we show
the effectiveness of our approach by reporting our competitive results on both
the AFEW-VA and SEWA datasets, suggesting that temporal modelling improves the
affect estimates both in qualitative and quantitative terms. Furthermore, we
find that the inclusion of attention mechanisms leads to the highest accuracy
improvements, as its weights seem to correlate well with the appearance of
facial movements, both in terms of temporal localisation and intensity.
Finally, we observe the sequence length of around 160 ms to be the optimum one
for temporal modelling, which is consistent with other relevant findings
utilising similar lengths.
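To make the idea of attention over latent features concrete, here is a minimal sketch, not the authors' implementation. It assumes that each frame in a short window has already been encoded into a latent vector, that a single learned vector scores each frame, and that the attention-weighted sum is mapped to a valence/arousal pair. The function names, the weight shapes, and the 25 fps frame rate used to turn 160 ms into a frame count are all illustrative assumptions that do not come from the abstract.

```python
import numpy as np


def frames_for_window(window_ms, fps):
    """Number of frames covering a window of window_ms at fps (assumed frame rate)."""
    return max(1, round(window_ms * fps / 1000))


def softmax(x):
    """Numerically stable softmax over a 1-D array."""
    e = np.exp(x - np.max(x))
    return e / e.sum()


def combine_latents(latents, w_att, w_out):
    """Attention-weighted pooling of per-frame latent features.

    latents: (T, D) array, one hypothetical latent vector per frame.
    w_att:   (D,) scoring vector (stand-in for a learned attention module).
    w_out:   (D, 2) projection to (valence, arousal).
    Returns the estimate in [-1, 1] and the per-frame attention weights.
    """
    scores = latents @ w_att        # one scalar score per frame, shape (T,)
    weights = softmax(scores)       # attention weights, sum to 1
    pooled = weights @ latents      # weighted sum of latents, shape (D,)
    estimate = np.tanh(pooled @ w_out)
    return estimate, weights


# Illustrative usage with random stand-in data.
rng = np.random.default_rng(0)
T = frames_for_window(160, 25)      # 160 ms at an assumed 25 fps
latents = rng.normal(size=(T, 8))
w_att = rng.normal(size=8)
w_out = rng.normal(size=(8, 2))
estimate, weights = combine_latents(latents, w_att, w_out)
```

In this sketch the attention weights indicate how much each frame contributes to the estimate, which is the quantity the abstract reports as correlating with facial movements.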
Related papers
- Cross Space and Time: A Spatio-Temporal Unitized Model for Traffic Flow Forecasting [16.782154479264126]
Predicting spatio-temporal traffic flow presents challenges due to complex interactions between spatial and temporal factors.
Existing approaches address these dimensions in isolation, neglecting their critical interdependencies.
In this paper, we introduce the Adaptive Spatio-Temporal Unitized Cell (ASTUC), a unified framework designed to capture both spatial and temporal dependencies.
arXiv Detail & Related papers (2024-11-14T07:34:31Z) - Graph Masked Autoencoder for Spatio-Temporal Graph Learning [38.085962443141206]
In urban sensing applications, effective spatio-temporal prediction frameworks play a crucial role in traffic analysis, human mobility evaluations and crime prediction.
The presence of data noise and sparsity in spatial and temporal data presents significant challenges for existing neural network models in learning robust representations.
We propose a novel self-supervised learning paradigm for effective spatio-temporal data augmentation.
arXiv Detail & Related papers (2024-10-14T07:33:33Z) - A Survey on Diffusion Models for Time Series and Spatio-Temporal Data [92.1255811066468]
We review the use of diffusion models in time series and spatio-temporal data, categorizing them by model, task type, data modality, and practical application domain.
We categorize diffusion models into unconditioned and conditioned types and discuss time series and spatio-temporal data separately.
Our survey covers their application extensively in various fields including healthcare, recommendation, climate, energy, audio, and transportation.
arXiv Detail & Related papers (2024-04-29T17:19:40Z) - Spatio-Temporal Attention Graph Neural Network for Remaining Useful Life
Prediction [1.831835396047386]
This study presents the Spatio-Temporal Attention Graph Neural Network.
Our model combines graph neural networks and temporal convolutional neural networks for spatial and temporal feature extraction.
Comprehensive experiments were conducted on the C-MAPSS dataset to evaluate the impact of unified versus clustering normalization.
arXiv Detail & Related papers (2024-01-29T08:49:53Z) - GATGPT: A Pre-trained Large Language Model with Graph Attention Network
for Spatiotemporal Imputation [19.371155159744934]
In real-world settings, such data often contain missing elements due to issues like sensor malfunctions and data transmission errors.
The objective of spatiotemporal imputation is to estimate these missing values by understanding the inherent spatial and temporal relationships in the observed time series.
Traditionally, spatiotemporal imputation has relied on specific, intricate architectures, which suffer from limited applicability and high computational complexity.
In contrast, our approach integrates pre-trained large language models (LLMs) into spatiotemporal imputation, introducing a groundbreaking framework, GATGPT.
arXiv Detail & Related papers (2023-11-24T08:15:11Z) - Spatio-Temporal Branching for Motion Prediction using Motion Increments [55.68088298632865]
Human motion prediction (HMP) has emerged as a popular research topic due to its diverse applications.
Traditional methods rely on hand-crafted features and machine learning techniques.
We propose a novel spatio-temporal branching network using incremental information for HMP.
arXiv Detail & Related papers (2023-08-02T12:04:28Z) - Gait Recognition in the Wild with Multi-hop Temporal Switch [81.35245014397759]
Gait recognition in the wild is a more practical problem that has attracted the attention of the multimedia and computer vision communities.
This paper presents a novel multi-hop temporal switch method to achieve effective temporal modeling of gait patterns in real-world scenes.
arXiv Detail & Related papers (2022-09-01T10:46:09Z) - An Adaptive Federated Relevance Framework for Spatial Temporal Graph
Learning [14.353798949041698]
We propose an adaptive federated relevance framework, namely FedRel, for spatial-temporal graph learning.
The core Dynamic Inter-Intra Graph (DIIG) module in the framework is able to use these features to generate the spatial-temporal graphs.
To improve the model generalization ability and performance while preserving the local data privacy, we also design a relevance-driven federated learning module.
arXiv Detail & Related papers (2022-06-07T16:12:17Z) - Temporal Relevance Analysis for Video Action Models [70.39411261685963]
We first propose a new approach to quantify the temporal relationships between frames captured by CNN-based action models.
We then conduct comprehensive experiments and in-depth analysis to provide a better understanding of how temporal modeling is affected.
arXiv Detail & Related papers (2022-04-25T19:06:48Z) - Self-Attention Neural Bag-of-Features [103.70855797025689]
We build on the recently introduced 2D-Attention and reformulate the attention learning methodology.
We propose a joint feature-temporal attention mechanism that learns a joint 2D attention mask highlighting relevant information.
arXiv Detail & Related papers (2022-01-26T17:54:14Z) - Temporal Memory Relation Network for Workflow Recognition from Surgical
Video [53.20825496640025]
We propose a novel end-to-end temporal memory relation network (TMNet) for relating long-range and multi-scale temporal patterns.
We have extensively validated our approach on two benchmark surgical video datasets.
arXiv Detail & Related papers (2021-03-30T13:20:26Z)
This list is automatically generated from the titles and abstracts of the papers in this site.