Short and Long Range Relation Based Spatio-Temporal Transformer for
Micro-Expression Recognition
- URL: http://arxiv.org/abs/2112.05851v2
- Date: Tue, 14 Dec 2021 13:26:25 GMT
- Title: Short and Long Range Relation Based Spatio-Temporal Transformer for
Micro-Expression Recognition
- Authors: Liangfei Zhang, Xiaopeng Hong, Ognjen Arandjelovic, Guoying Zhao
- Abstract summary: We propose a novel spatio-temporal transformer architecture -- to the best of our knowledge, the first purely transformer based approach for micro-expression recognition.
The architecture comprises a spatial encoder which learns spatial patterns, a temporal aggregator for temporal dimension analysis, and a classification head.
A comprehensive evaluation on three widely used spontaneous micro-expression data sets shows that the proposed approach consistently outperforms the state of the art.
- Score: 61.374467942519374
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Being spontaneous, micro-expressions are useful in the inference of a
person's true emotions even if an attempt is made to conceal them. Due to their
short duration and low intensity, the recognition of micro-expressions is a
difficult task in affective computing. The early work based on handcrafted
spatio-temporal features which showed some promise, has recently been
superseded by different deep learning approaches which now compete for the
state of the art performance. Nevertheless, the problem of capturing both local
and global spatio-temporal patterns remains challenging. To this end, herein we
propose a novel spatio-temporal transformer architecture -- to the best of our
knowledge, the first purely transformer based approach (i.e. void of any
convolutional network use) for micro-expression recognition. The architecture
comprises a spatial encoder which learns spatial patterns, a temporal
aggregator for temporal dimension analysis, and a classification head. A
comprehensive evaluation on three widely used spontaneous micro-expression data
sets, namely SMIC-HS, CASME II and SAMM, shows that the proposed approach
consistently outperforms the state of the art, and is the first framework in
the published literature on micro-expression recognition to achieve the
unweighted F1-score greater than 0.9 on any of the aforementioned data sets.
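The three-stage pipeline described in the abstract (spatial encoder over each frame, temporal aggregator across frames, classification head) can be sketched as follows. This is a minimal illustrative sketch only: all layer counts, embedding dimensions, patch sizes, and the mean-pooling choices are assumptions, not the authors' actual configuration.

```python
import torch
import torch.nn as nn

class STTransformerSketch(nn.Module):
    """Hypothetical sketch of a purely transformer-based micro-expression
    recognizer: spatial encoder -> temporal aggregator -> classification
    head. Hyper-parameters are illustrative, not the paper's."""

    def __init__(self, num_patches=16, patch_dim=48, embed_dim=64,
                 num_heads=4, num_classes=3):
        super().__init__()
        # Project flattened frame patches into the embedding space.
        self.patch_proj = nn.Linear(patch_dim, embed_dim)
        # Spatial encoder: self-attention over the patches of one frame.
        spatial_layer = nn.TransformerEncoderLayer(
            d_model=embed_dim, nhead=num_heads, batch_first=True)
        self.spatial_encoder = nn.TransformerEncoder(spatial_layer, num_layers=2)
        # Temporal aggregator: self-attention over per-frame embeddings.
        temporal_layer = nn.TransformerEncoderLayer(
            d_model=embed_dim, nhead=num_heads, batch_first=True)
        self.temporal_aggregator = nn.TransformerEncoder(temporal_layer, num_layers=2)
        # Classification head mapping the clip embedding to class logits.
        self.head = nn.Linear(embed_dim, num_classes)

    def forward(self, clips):
        # clips: (batch, frames, patches, patch_dim)
        b, t, p, d = clips.shape
        x = self.patch_proj(clips.reshape(b * t, p, d))
        x = self.spatial_encoder(x).mean(dim=1)      # one embedding per frame
        x = x.reshape(b, t, -1)
        x = self.temporal_aggregator(x).mean(dim=1)  # one embedding per clip
        return self.head(x)                          # class logits

model = STTransformerSketch()
logits = model(torch.randn(2, 8, 16, 48))  # 2 clips of 8 frames each
print(tuple(logits.shape))  # (2, 3)
```

Note that the sketch uses no convolutions anywhere, matching the paper's claim of a purely transformer based pipeline; the mean-pooling steps stand in for whatever aggregation the authors actually use.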
Related papers
- Hierarchical Temporal Context Learning for Camera-based Semantic Scene Completion [57.232688209606515]
We present HTCL, a novel Hierarchical Temporal Context Learning paradigm for improving camera-based semantic scene completion.
Our method ranks $1st$ on the Semantic KITTI benchmark and even surpasses LiDAR-based methods in terms of mIoU.
arXiv Detail & Related papers (2024-07-02T09:11:17Z) - Adaptive Temporal Motion Guided Graph Convolution Network for Micro-expression Recognition [48.21696443824074]
We propose a novel framework for micro-expression recognition, named the Adaptive Temporal Motion Guided Graph Convolution Network (ATM-GCN).
Our framework excels at capturing temporal dependencies between frames across the entire clip, thereby enhancing micro-expression recognition at the clip level.
arXiv Detail & Related papers (2024-06-13T10:57:24Z) - Spatial-temporal Memories Enhanced Graph Autoencoder for Anomaly Detection in Dynamic Graphs [52.956235109354175]
Anomaly detection in dynamic graphs presents a significant challenge due to the temporal evolution of graph structures and attributes.
We introduce a novel Spatial-Temporal memories-enhanced graph autoencoder (STRIPE).
STRIPE has demonstrated a superior capability to discern anomalies by effectively leveraging the distinct spatial and temporal dynamics of dynamic graphs.
arXiv Detail & Related papers (2024-03-14T02:26:10Z) - Efficient Neural Architecture Search for Emotion Recognition [10.944807967751277]
We propose to search for a highly efficient and robust neural architecture for both macro and micro-expression recognition.
We produce lightweight models with a gradient-based architecture search algorithm.
The proposed models outperform the existing state-of-the-art methods and perform very well in terms of speed and space complexity.
arXiv Detail & Related papers (2023-03-23T20:21:26Z) - Transferring Dual Stochastic Graph Convolutional Network for Facial
Micro-expression Recognition [7.62031665958404]
This paper presents a transferring dual stochastic Graph Convolutional Network (GCN) model.
We propose a graph construction method and dual graph convolutional network to extract more discriminative features from the micro-expression images.
Our proposed method achieves state-of-the-art performance on recently released MMEW benchmarks.
arXiv Detail & Related papers (2022-03-10T07:41:18Z) - Video-based Facial Micro-Expression Analysis: A Survey of Datasets,
Features and Algorithms [52.58031087639394]
Micro-expressions are involuntary and transient facial expressions.
They can provide important information in a broad range of applications such as lie detection, criminal detection, etc.
Since micro-expressions are transient and of low intensity, their detection and recognition is difficult and relies heavily on expert experiences.
arXiv Detail & Related papers (2022-01-30T05:14:13Z) - Progressive Spatio-Temporal Bilinear Network with Monte Carlo Dropout
for Landmark-based Facial Expression Recognition with Uncertainty Estimation [93.73198973454944]
The performance of our method is evaluated on three widely used datasets.
It is comparable to that of video-based state-of-the-art methods while it has much less complexity.
arXiv Detail & Related papers (2021-06-08T13:40:30Z) - Recognizing Micro-Expression in Video Clip with Adaptive Key-Frame
Mining [18.34213657996624]
In micro-expression, facial movement is transient and sparsely localized through time.
We propose a novel end-to-end deep learning architecture, referred to as the adaptive key-frame mining network (AKMNet).
AKMNet learns a discriminative spatio-temporal representation by combining spatial features of self-learned local key frames with their global temporal dynamics.
arXiv Detail & Related papers (2020-09-19T07:03:16Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information provided (including all listed papers) and is not responsible for any consequences arising from its use.