Spatio-Temporal Analysis of Facial Actions using Lifecycle-Aware Capsule
Networks
- URL: http://arxiv.org/abs/2011.08819v2
- Date: Thu, 4 Mar 2021 02:41:43 GMT
- Title: Spatio-Temporal Analysis of Facial Actions using Lifecycle-Aware Capsule
Networks
- Authors: Nikhil Churamani, Sinan Kalkan and Hatice Gunes
- Abstract summary: AULA-Caps learns temporal dependencies between contiguous frames by focusing on relevant spatio-temporal segments in the sequence.
The learnt feature capsules are routed together such that the model learns to selectively focus on spatial or spatio-temporal information depending upon the AU lifecycle.
The proposed model is evaluated on the commonly used BP4D and GFT benchmark datasets.
- Score: 12.552355581481994
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Most state-of-the-art approaches for Facial Action Unit (AU) detection rely
upon evaluating facial expressions from static frames, encoding a snapshot of
heightened facial activity. In real-world interactions, however, facial
expressions are usually more subtle and evolve in a temporal manner requiring
AU detection models to learn spatial as well as temporal information. In this
paper, we focus on both spatial and spatio-temporal features encoding the
temporal evolution of facial AU activation. For this purpose, we propose the
Action Unit Lifecycle-Aware Capsule Network (AULA-Caps) that performs AU
detection using both frame and sequence-level features. While at the
frame-level the capsule layers of AULA-Caps learn spatial feature primitives to
determine AU activations, at the sequence-level, it learns temporal
dependencies between contiguous frames by focusing on relevant spatio-temporal
segments in the sequence. The learnt feature capsules are routed together such
that the model learns to selectively focus more on spatial or spatio-temporal
information depending upon the AU lifecycle. The proposed model is evaluated on
the commonly used BP4D and GFT benchmark datasets, obtaining state-of-the-art
results on both.
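To make the described two-stream design concrete, below is a minimal, illustrative PyTorch sketch: frame-level capsules encode spatial primitives, a sequence-level stream aggregates spatio-temporal context over the clip, and a learned gate decides, per AU, how much weight each stream receives. All class names (FrameCapsules, AULACapsSketch), layer sizes, and the gating scheme are assumptions for illustration only; they do not reproduce the authors' AULA-Caps implementation, which relies on capsule routing rather than the simple gate shown here.

```python
# Hedged sketch of a frame-level + sequence-level capsule model for AU detection.
# Everything below (module names, sizes, the gate) is an illustrative assumption,
# not the authors' released code.
import torch
import torch.nn as nn


def squash(x, dim=-1, eps=1e-8):
    """Standard capsule squashing non-linearity."""
    norm_sq = (x ** 2).sum(dim=dim, keepdim=True)
    return (norm_sq / (1.0 + norm_sq)) * x / torch.sqrt(norm_sq + eps)


class FrameCapsules(nn.Module):
    """Spatial feature capsules computed independently for every frame."""
    def __init__(self, in_ch=3, caps_dim=16, n_caps=32):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(in_ch, 64, 5, stride=2, padding=2), nn.ReLU(),
            nn.Conv2d(64, n_caps * caps_dim, 5, stride=2, padding=2),
        )
        self.n_caps, self.caps_dim = n_caps, caps_dim

    def forward(self, frames):                        # (B, T, C, H, W)
        b, t, c, h, w = frames.shape
        feat = self.conv(frames.reshape(b * t, c, h, w))
        feat = feat.mean(dim=(-2, -1))                 # global average pool
        caps = feat.reshape(b, t, self.n_caps, self.caps_dim)
        return squash(caps)                            # (B, T, N, D)


class AULACapsSketch(nn.Module):
    """Combines frame-level and sequence-level evidence with a learned gate."""
    def __init__(self, n_aus=12, caps_dim=16, n_caps=32):
        super().__init__()
        self.frame_caps = FrameCapsules(caps_dim=caps_dim, n_caps=n_caps)
        # Sequence-level stream: temporal convolution over capsule activations.
        self.temporal = nn.Conv1d(n_caps * caps_dim, n_caps * caps_dim,
                                  kernel_size=3, padding=1)
        # Gate choosing spatial vs. spatio-temporal evidence per AU.
        self.gate = nn.Linear(2 * n_caps * caps_dim, n_aus)
        self.spatial_head = nn.Linear(n_caps * caps_dim, n_aus)
        self.temporal_head = nn.Linear(n_caps * caps_dim, n_aus)

    def forward(self, frames):                         # (B, T, C, H, W)
        caps = self.frame_caps(frames)                 # (B, T, N, D)
        b, t, n, d = caps.shape
        flat = caps.reshape(b, t, n * d)
        spatial = flat.mean(dim=1)                     # pooled per-frame evidence
        temporal = self.temporal(flat.transpose(1, 2)).transpose(1, 2).mean(dim=1)
        alpha = torch.sigmoid(self.gate(torch.cat([spatial, temporal], dim=-1)))
        logits = (alpha * self.spatial_head(spatial)
                  + (1 - alpha) * self.temporal_head(temporal))
        return logits                                  # per-AU activation logits


if __name__ == "__main__":
    model = AULACapsSketch()
    clip = torch.randn(2, 8, 3, 64, 64)                # batch of 8-frame clips
    print(model(clip).shape)                           # torch.Size([2, 12])
```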
Related papers
- Multi-Scale Spatial-Temporal Self-Attention Graph Convolutional Networks for Skeleton-based Action Recognition [0.0]
In this paper, we propose a self-attention GCN hybrid model, Multi-Scale Spatial-Temporal self-attention (MSST)-GCN.
We utilize a spatial self-attention module with adaptive topology to model intra-frame interactions among different body parts, and a temporal self-attention module to examine correlations of a node across frames.
arXiv Detail & Related papers (2024-04-03T10:25:45Z) - Spatio-Temporal Branching for Motion Prediction using Motion Increments [55.68088298632865]
Human motion prediction (HMP) has emerged as a popular research topic due to its diverse applications.
Traditional methods rely on hand-crafted features and machine learning techniques.
We propose a novel spatio-temporal branching network using incremental information for HMP.
arXiv Detail & Related papers (2023-08-02T12:04:28Z) - Spatial-Temporal Attention Network for Open-Set Fine-Grained Image
Recognition [14.450381668547259]
A vision transformer with the spatial self-attention mechanism could not learn accurate attention maps for distinguishing different categories of fine-grained images.
We propose a spatial-temporal attention network for learning fine-grained feature representations, called STAN.
The proposed STAN-OSFGR outperforms 9 state-of-the-art open-set recognition methods significantly in most cases.
arXiv Detail & Related papers (2022-11-25T07:46:42Z) - Spatio-Temporal Transformer for Dynamic Facial Expression Recognition in
the Wild [19.5702895176141]
We propose a method for capturing discriminative features within each frame.
We utilize the CNN to translate each frame into a visual feature sequence.
Experiments indicate that our method provides an effective way to make use of the spatial and temporal dependencies.
arXiv Detail & Related papers (2022-05-10T08:47:15Z) - A Spatio-Temporal Multilayer Perceptron for Gesture Recognition [70.34489104710366]
We propose a multilayer state-weighted perceptron for gesture recognition in the context of autonomous vehicles.
An evaluation on the TCG and Drive&Act datasets is provided to showcase the promising performance of our approach.
We deploy our model to our autonomous vehicle to show its real-time capability and stable execution.
arXiv Detail & Related papers (2022-04-25T08:42:47Z) - Spatial-Temporal Correlation and Topology Learning for Person
Re-Identification in Videos [78.45050529204701]
We propose a novel framework to pursue discriminative and robust representation by modeling cross-scale spatial-temporal correlation.
CTL utilizes a CNN backbone and a key-points estimator to extract semantic local features from the human body.
It explores a context-reinforced topology to construct multi-scale graphs by considering both global contextual information and the physical connections of the human body.
arXiv Detail & Related papers (2021-04-15T14:32:12Z) - DS-Net: Dynamic Spatiotemporal Network for Video Salient Object
Detection [78.04869214450963]
We propose a novel dynamic spatiotemporal network (DS-Net) for more effective fusion of spatial and temporal information.
We show that the proposed method achieves superior performance compared to state-of-the-art algorithms.
arXiv Detail & Related papers (2020-12-09T06:42:30Z) - CaSPR: Learning Canonical Spatiotemporal Point Cloud Representations [72.4716073597902]
We propose a method to learn Canonical Spatiotemporal Point Cloud Representations of dynamically moving objects.
We demonstrate the effectiveness of our method on several applications including shape reconstruction, camera pose estimation, continuous spatiotemporal sequence reconstruction, and correspondence estimation.
arXiv Detail & Related papers (2020-08-06T17:58:48Z) - Human Activity Recognition from Wearable Sensor Data Using
Self-Attention [2.9023633922848586]
We present a self-attention based neural network model for activity recognition from body-worn sensor data.
We performed experiments on four popular publicly available HAR datasets: PAMAP2, Opportunity, Skoda and USC-HAD.
Our model achieves significant performance improvements over recent state-of-the-art models in both benchmark test subject and leave-one-subject-out evaluations.
arXiv Detail & Related papers (2020-03-17T14:16:57Z) - A Spatial-Temporal Attentive Network with Spatial Continuity for
Trajectory Prediction [74.00750936752418]
We propose a novel model named spatial-temporal attentive network with spatial continuity (STAN-SC).
First, a spatial-temporal attention mechanism is presented to explore the most useful and important information.
Second, we construct a joint feature sequence based on sequence and instant state information to keep the generated trajectories spatially continuous.
arXiv Detail & Related papers (2020-03-13T04:35:50Z)