Related papers: Three-Stream Temporal-Shift Attention Network Based on Self-Knowledge Distillation for Micro-Expression Recognition

URL: http://arxiv.org/abs/2406.17538v2
Date: Mon, 29 Jul 2024 05:11:12 GMT
Title: Three-Stream Temporal-Shift Attention Network Based on Self-Knowledge Distillation for Micro-Expression Recognition
Authors: Guanghao Zhu, Lin Liu, Yuhao Hu, Haixin Sun, Fang Liu, Xiaohui Du, Ruqian Hao, Juanxiu Liu, Yong Liu, Hao Deng, Jing Zhang
Abstract summary: Micro-expression recognition is crucial in many fields, including criminal analysis and psychotherapy. A three-stream temporal-shift attention network based on self-knowledge distillation called SKD-TSTSAN is proposed in this paper.
Score: 21.675660978188617
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Micro-expressions are subtle facial movements that occur spontaneously when people try to conceal real emotions. Micro-expression recognition is crucial in many fields, including criminal analysis and psychotherapy. However, micro-expression recognition is challenging since micro-expressions have low intensity and public datasets are small in size. To this end, a three-stream temporal-shift attention network based on self-knowledge distillation called SKD-TSTSAN is proposed in this paper. Firstly, to address the low intensity of muscle movements, we utilize learning-based motion magnification modules to enhance the intensity of muscle movements. Secondly, we employ efficient channel attention modules in the local-spatial stream to make the network focus on facial regions that are highly relevant to micro-expressions. In addition, temporal shift modules are used in the dynamic-temporal stream, which enables temporal modeling with no additional parameters by mixing motion information from two different temporal domains. Furthermore, we introduce self-knowledge distillation into the micro-expression recognition task by introducing auxiliary classifiers and using the deepest section of the network for supervision, encouraging all blocks to fully explore the features of the training set. Finally, extensive experiments are conducted on four public datasets: CASME II, SAMM, MMEW, and CAS(ME)3. The experimental results demonstrate that our SKD-TSTSAN outperforms other existing methods and achieves new state-of-the-art performance. Our code will be available at https://github.com/GuanghaoZhu663/SKD-TSTSAN.
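The abstract notes that temporal shift modules enable temporal modeling with no additional parameters. As a rough illustration only, the sketch below shows a generic temporal shift over a short sequence of per-frame feature vectors. It is not taken from the SKD-TSTSAN code: the function name `temporal_shift`, the `fold_div` parameter, the zero-padding at the sequence ends, and the simplification to one flat channel vector per frame (no spatial dimensions, a single stream) are all assumptions made for this example.

```python
from typing import List

Frame = List[float]  # one frame's features: C channel values (spatial dims omitted)


def temporal_shift(frames: List[Frame], fold_div: int = 8) -> List[Frame]:
    """Generic temporal shift (hypothetical helper, not the paper's code).

    The first C // fold_div channels of each frame are taken from the next
    frame, the following C // fold_div channels from the previous frame, and
    the remaining channels are left in place. Positions with no source frame
    are zero-filled. No learnable parameters are involved.
    """
    num_frames = len(frames)
    num_channels = len(frames[0])
    fold = num_channels // fold_div

    out = [[0.0] * num_channels for _ in range(num_frames)]
    for t in range(num_frames):
        for ch in range(num_channels):
            if ch < fold:
                src = t + 1      # pull this channel from the following frame
            elif ch < 2 * fold:
                src = t - 1      # pull this channel from the preceding frame
            else:
                src = t          # unshifted channel
            if 0 <= src < num_frames:
                out[t][ch] = frames[src][ch]
    return out


if __name__ == "__main__":
    # Three frames, four channels each; fold_div=4 shifts one channel each way.
    clip = [[1, 1, 1, 1], [2, 2, 2, 2], [3, 3, 3, 3]]
    print(temporal_shift(clip, fold_div=4))
```

Because the operation only moves existing values between neighbouring frames, each output frame mixes information from adjacent time steps without adding any weights, which is the property the abstract highlights.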

Related papers

Temporal and Spatial Feature Fusion Framework for Dynamic Micro Expression Recognition [5.444324424467006]
Transient and highly localised micro-expressions pose a significant challenge to their accurate recognition. The accuracy rate of micro-expression recognition is as low as 50%, even for professionals. We propose a novel Temporal and Spatial feature Fusion framework for DMER (TSFmicro).
arXiv Detail & Related papers (2025-05-22T08:26:19Z)
Adaptive Temporal Motion Guided Graph Convolution Network for Micro-expression Recognition [48.21696443824074]
We propose a novel framework for micro-expression recognition, named the Adaptive Temporal Motion Guided Graph Convolution Network (ATM-GCN). Our framework excels at capturing temporal dependencies between frames across the entire clip, thereby enhancing micro-expression recognition at the clip level.
arXiv Detail & Related papers (2024-06-13T10:57:24Z)
Masked Motion Predictors are Strong 3D Action Representation Learners [143.9677635274393]
In 3D human action recognition, limited supervised data makes it challenging to fully tap into the modeling potential of powerful networks such as transformers. We show that instead of following the prevalent pretext to perform masked self-component reconstruction in human joints, explicit contextual motion modeling is key to the success of learning effective feature representation for 3D action recognition.
arXiv Detail & Related papers (2023-08-14T11:56:39Z)
Three-dimensional microstructure generation using generative adversarial neural networks in the context of continuum micromechanics [77.34726150561087]
This work proposes a generative adversarial network tailored towards three-dimensional microstructure generation. The lightweight algorithm is able to learn the underlying properties of the material from a single microCT-scan without the need of explicit descriptors.
arXiv Detail & Related papers (2022-05-31T13:26:51Z)
Micro-Expression Recognition Based on Attribute Information Embedding and Cross-modal Contrastive Learning [22.525295392858293]
We propose a micro-expression recognition method based on attribute information embedding and cross-modal contrastive learning. We conduct extensive experiments on the CASME II and MMEW databases, and the accuracy is 77.82% and 71.04%, respectively.
arXiv Detail & Related papers (2022-05-29T12:28:10Z)
Video-based Facial Micro-Expression Analysis: A Survey of Datasets, Features and Algorithms [52.58031087639394]
Micro-expressions are involuntary and transient facial expressions. They can provide important information in a broad range of applications such as lie detection and criminal detection. Since micro-expressions are transient and of low intensity, their detection and recognition are difficult and rely heavily on expert experience.
arXiv Detail & Related papers (2022-01-30T05:14:13Z)
MMNet: Muscle motion-guided network for micro-expression recognition [2.032432845751978]
We propose a robust micro-expression recognition framework, namely the muscle motion-guided network (MMNet). Specifically, a continuous attention (CA) block is introduced to focus on modeling local subtle muscle motion patterns with little identity information. Our approach outperforms state-of-the-art methods by a large margin.
arXiv Detail & Related papers (2022-01-14T04:05:49Z)
Short and Long Range Relation Based Spatio-Temporal Transformer for Micro-Expression Recognition [61.374467942519374]
We propose a novel spatio-temporal transformer architecture, to the best of our knowledge the first purely transformer-based approach for micro-expression recognition. The architecture comprises a spatial encoder which learns spatial patterns, a temporal aggregator for temporal dimension analysis, and a classification head. A comprehensive evaluation on three widely used spontaneous micro-expression data sets shows that the proposed approach consistently outperforms the state of the art.
arXiv Detail & Related papers (2021-12-10T22:10:31Z)
Progressive Spatio-Temporal Bilinear Network with Monte Carlo Dropout for Landmark-based Facial Expression Recognition with Uncertainty Estimation [93.73198973454944]
The performance of our method is evaluated on three widely used datasets. It is comparable to that of video-based state-of-the-art methods while it has much less complexity.
arXiv Detail & Related papers (2021-06-08T13:40:30Z)
MERANet: Facial Micro-Expression Recognition using 3D Residual Attention Network [14.285700243381537]
We propose a facial micro-expression recognition model using a 3D residual attention network, called MERANet. The proposed model encompasses both spatial and temporal information. Superior performance is observed compared to the state of the art for facial micro-expression recognition.
arXiv Detail & Related papers (2020-12-07T16:41:42Z)
Multi-Temporal Convolutions for Human Action Recognition in Videos [83.43682368129072]
We present a novel spatio-temporal convolution block that is capable of extracting spatio-temporal patterns at multiple temporal resolutions. The proposed blocks are lightweight and can be integrated into any 3D-CNN architecture.
arXiv Detail & Related papers (2020-11-08T10:40:26Z)
SMA-STN: Segmented Movement-Attending Spatiotemporal Network for Micro-Expression Recognition [20.166205708651194]
This paper proposes a segmented movement-attending spatiotemporal network (SMA-STN) to reveal subtle movement changes visually in an efficient way. Extensive experiments on three widely used benchmarks, i.e., CASME II, SAMM, and SMIC, show that the proposed SMA-STN achieves better MER performance than other state-of-the-art methods.
arXiv Detail & Related papers (2020-10-19T09:23:24Z)
Collaborative Distillation in the Parameter and Spectrum Domains for Video Action Recognition [79.60708268515293]
This paper explores how to train small and efficient networks for action recognition. We propose two distillation strategies in the frequency domain, namely feature spectrum distillation and parameter distribution distillation. Our method can achieve higher performance than state-of-the-art methods with the same backbone.
arXiv Detail & Related papers (2020-09-15T07:29:57Z)
Non-Linearities Improve OrigiNet based on Active Imaging for Micro Expression Recognition [8.112868317921853]
We introduce an active imaging concept to segregate active changes in expressive regions of a video into a single frame. We propose a shallow CNN network: hybrid local receptive field based augmented learning network (OrigiNet) that efficiently learns significant features of the micro-expressions in a video.
arXiv Detail & Related papers (2020-05-16T13:44:49Z)