MMNet: Muscle motion-guided network for micro-expression recognition
- URL: http://arxiv.org/abs/2201.05297v1
- Date: Fri, 14 Jan 2022 04:05:49 GMT
- Title: MMNet: Muscle motion-guided network for micro-expression recognition
- Authors: Hanting Li, Mingzhe Sui, Zhaoqing Zhu, Feng Zhao
- Abstract summary: We propose a robust micro-expression recognition framework, namely muscle motion-guided network (MMNet).
Specifically, a continuous attention (CA) block is introduced to focus on modeling local subtle muscle motion patterns with little identity information.
Our approach outperforms state-of-the-art methods by a large margin.
- Score: 2.032432845751978
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Facial micro-expressions (MEs) are involuntary facial motions revealing
people's real feelings and play an important role in the early intervention of
mental illness, national security, and many human-computer interaction
systems. However, existing micro-expression datasets are limited and usually
pose some challenges for training good classifiers. To model the subtle facial
muscle motions, we propose a robust micro-expression recognition (MER)
framework, namely muscle motion-guided network (MMNet). Specifically, a
continuous attention (CA) block is introduced to focus on modeling local subtle
muscle motion patterns with little identity information, which is different
from most previous methods that directly extract features from complete video
frames with much identity information. In addition, we design a position
calibration (PC) module based on the vision transformer. By adding the
position embeddings of the face generated by the PC module at the end of the
two branches, the PC module incorporates position information into the facial
muscle motion pattern features for MER. Extensive experiments on three public
micro-expression datasets demonstrate that our approach outperforms
state-of-the-art methods by a large margin.
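The abstract describes a two-branch design; below is a minimal, hedged PyTorch sketch of that idea, not the authors' implementation. The apex-minus-onset frame difference as the muscle-motion input, the exact form of the CA block, the convolutional stand-in for the ViT-based PC module, and all layer sizes and the three-class output are assumptions for illustration.

```python
import torch
import torch.nn as nn

class ContinuousAttentionBlock(nn.Module):
    """Assumed form of a CA block: spatial attention over motion features."""
    def __init__(self, channels):
        super().__init__()
        self.conv = nn.Conv2d(channels, channels, 3, padding=1)
        self.attn = nn.Sequential(nn.Conv2d(channels, 1, 1), nn.Sigmoid())

    def forward(self, x):
        x = torch.relu(self.conv(x))
        return x * self.attn(x)  # re-weight local muscle-motion patterns

class MMNetSketch(nn.Module):
    def __init__(self, channels=32, num_classes=3):
        super().__init__()
        self.stem = nn.Conv2d(3, channels, 7, stride=4, padding=3)
        self.motion_branch = nn.Sequential(
            ContinuousAttentionBlock(channels),
            ContinuousAttentionBlock(channels),
        )
        # Stand-in for the ViT-based position calibration (PC) module:
        # produces position embeddings of the face from the onset frame.
        self.pc = nn.Conv2d(3, channels, 7, stride=4, padding=3)
        self.head = nn.Sequential(nn.AdaptiveAvgPool2d(1), nn.Flatten(),
                                  nn.Linear(channels, num_classes))

    def forward(self, onset, apex):
        # The frame difference suppresses identity appearance, keeping motion.
        motion = self.motion_branch(self.stem(apex - onset))
        pos = self.pc(onset)            # position embeddings of the face
        return self.head(motion + pos)  # add position info at branch ends

onset = torch.randn(2, 3, 112, 112)
apex = torch.randn(2, 3, 112, 112)
print(MMNetSketch()(onset, apex).shape)  # torch.Size([2, 3])
```

The point the sketch tries to capture is that subtracting the onset frame cancels most identity appearance, while the position embeddings re-inject facial layout at the end of the two branches.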
Related papers
- Micro-Expression Recognition by Motion Feature Extraction based on Pre-training [6.015288149235598]
We propose a novel motion extraction strategy (MoExt) for the micro-expression recognition task.
In MoExt, shape features and texture features are first extracted separately from the onset and apex frames, and motion features related to MEs are then derived from the shape features of both frames.
The effectiveness of the proposed method is validated on three commonly used datasets.
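As a rough illustration of the summary above (separate shape/texture extraction from onset and apex frames, motion derived from the two shape features), here is a hedged PyTorch sketch; the encoders, dimensions, and the concatenation step are assumptions, not the MoExt design.

```python
import torch
import torch.nn as nn

def small_encoder():
    return nn.Sequential(nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
                         nn.AdaptiveAvgPool2d(4), nn.Flatten())

shape_enc = small_encoder()    # assumed shape-feature extractor
texture_enc = small_encoder()  # assumed texture-feature extractor
motion_head = nn.Linear(2 * 16 * 4 * 4, 128)  # motion from both shape features

onset = torch.randn(1, 3, 112, 112)
apex = torch.randn(1, 3, 112, 112)
s_on, s_ap = shape_enc(onset), shape_enc(apex)  # shape features per frame
t_on = texture_enc(onset)  # texture features (used elsewhere in the pipeline)
motion = motion_head(torch.cat([s_on, s_ap], dim=1))  # ME-related motion
print(motion.shape)  # torch.Size([1, 128])
```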
arXiv Detail & Related papers (2024-07-10T03:51:34Z)
- Three-Stream Temporal-Shift Attention Network Based on Self-Knowledge Distillation for Micro-Expression Recognition [21.675660978188617]
Micro-expression recognition is crucial in many fields, including criminal analysis and psychotherapy.
This paper proposes SKD-TSTSAN, a three-stream temporal-shift attention network based on self-knowledge distillation.
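The summary gives no architectural detail, but the "temporal-shift" in the name suggests a TSM-style operation; the sketch below shows that generic operation (shift a fraction of channels forward and backward along time so that 2D convolutions see temporal context). The fold fraction and tensor layout are assumptions.

```python
import torch

def temporal_shift(x, fold_div=8):
    """x: (batch, time, channels, h, w)."""
    b, t, c, h, w = x.shape
    fold = c // fold_div
    out = torch.zeros_like(x)
    out[:, 1:, :fold] = x[:, :-1, :fold]                  # shift forward in time
    out[:, :-1, fold:2 * fold] = x[:, 1:, fold:2 * fold]  # shift backward
    out[:, :, 2 * fold:] = x[:, :, 2 * fold:]             # rest unchanged
    return out

clip = torch.randn(2, 8, 32, 28, 28)
print(temporal_shift(clip).shape)  # torch.Size([2, 8, 32, 28, 28])
```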
arXiv Detail & Related papers (2024-06-25T13:22:22Z)
- Adaptive Temporal Motion Guided Graph Convolution Network for Micro-expression Recognition [48.21696443824074]
We propose a novel framework for micro-expression recognition, named the Adaptive Temporal Motion Guided Graph Convolution Network (ATM-GCN).
Our framework excels at capturing temporal dependencies between frames across the entire clip, thereby enhancing micro-expression recognition at the clip level.
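As a toy illustration of capturing clip-level temporal dependencies with graph convolution, here is a hedged sketch in which frames are graph nodes; the fixed random adjacency stands in for the adaptive adjacency the name implies, and all dimensions are assumptions.

```python
import torch
import torch.nn as nn

T, D = 8, 64                   # frames per clip, feature dim per frame
frames = torch.randn(2, T, D)  # per-frame features (batch, T, D)
# Stand-in for the learned/adaptive adjacency over frames.
adj = torch.softmax(torch.randn(T, T), dim=-1)
w = nn.Linear(D, D)
out = torch.relu(adj @ w(frames))  # each frame aggregates the whole clip
print(out.shape)                   # torch.Size([2, 8, 64])
```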
arXiv Detail & Related papers (2024-06-13T10:57:24Z)
- From Macro to Micro: Boosting micro-expression recognition via pre-training on macro-expression videos [9.472210792839023]
Micro-expression recognition (MER) has drawn increasing attention in recent years due to its potential applications in intelligent medical and lie detection.
We propose a generalized transfer learning paradigm, called MAcro-expression TO MIcro-expression (MA2MI).
Under our paradigm, networks can learn to represent subtle facial movements by reconstructing future frames.
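A hedged sketch of the pre-training signal described above, learning facial movement by reconstructing a future frame: the encoder/decoder, the single conditioning frame, and the L1 objective are assumptions, not the MA2MI recipe.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

encoder = nn.Conv2d(3, 16, 3, padding=1)  # assumed feature encoder
decoder = nn.Conv2d(16, 3, 3, padding=1)  # assumed frame decoder

# A single earlier frame stands in for the conditioning frames.
past = torch.randn(4, 3, 64, 64)
future = torch.randn(4, 3, 64, 64)
pred = decoder(torch.relu(encoder(past)))  # predict the future frame
loss = F.l1_loss(pred, future)             # reconstruction objective
loss.backward()                            # one pre-training step
```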
arXiv Detail & Related papers (2024-05-26T06:42:06Z)
- Facial Prior Based First Order Motion Model for Micro-expression Generation [11.27890186026442]
This paper tries to formulate a new task called micro-expression generation.
It combines the first order motion model with facial prior knowledge.
Given a target face, we intend to drive the face to generate micro-expression videos according to the motion patterns of source videos.
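First-order-motion approaches of this kind typically animate the target face by warping it with a predicted dense motion field; the toy sketch below shows only that warping step, with a random field standing in for the model's keypoint-derived motion prediction.

```python
import torch
import torch.nn.functional as F

target = torch.randn(1, 3, 64, 64)  # target face image
# Identity sampling grid in grid_sample's (x, y) convention.
ys, xs = torch.meshgrid(torch.linspace(-1, 1, 64),
                        torch.linspace(-1, 1, 64), indexing="ij")
identity = torch.stack([xs, ys], dim=-1).unsqueeze(0)  # (1, 64, 64, 2)
# Random field standing in for the keypoint-derived motion prediction.
flow = 0.02 * torch.randn_like(identity)
warped = F.grid_sample(target, identity + flow, align_corners=True)
print(warped.shape)  # torch.Size([1, 3, 64, 64])
```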
arXiv Detail & Related papers (2023-08-08T18:57:03Z)
- Multi-Stage Spatio-Temporal Aggregation Transformer for Video Person Re-identification [78.08536797239893]
We propose a novel Multi-Stage Spatial-Temporal Aggregation Transformer (MSTAT) with two novel designed proxy embedding modules.
MSTAT consists of three stages to encode the attribute-associated, the identity-associated, and the attribute-identity-associated information from the video clips.
We show that MSTAT can achieve state-of-the-art accuracies on various standard benchmarks.
arXiv Detail & Related papers (2023-01-02T05:17:31Z)
- Video-based Facial Micro-Expression Analysis: A Survey of Datasets, Features and Algorithms [52.58031087639394]
Micro-expressions are involuntary and transient facial expressions.
They can provide important information in a broad range of applications such as lie detection, criminal detection, etc.
Since micro-expressions are transient and of low intensity, their detection and recognition are difficult and rely heavily on expert experience.
arXiv Detail & Related papers (2022-01-30T05:14:13Z)
- Short and Long Range Relation Based Spatio-Temporal Transformer for Micro-Expression Recognition [61.374467942519374]
We propose a novel spatio-temporal transformer architecture -- to the best of our knowledge, the first purely transformer-based approach for micro-expression recognition.
The architecture comprises a spatial encoder which learns spatial patterns, a temporal aggregator for temporal-dimension analysis, and a classification head.
A comprehensive evaluation on three widely used spontaneous micro-expression datasets shows that the proposed approach consistently outperforms the state of the art.
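A minimal sketch matching the three-part description above (spatial encoder, temporal aggregator, classification head), built from standard transformer layers; token counts, pooling choices, and dimensions are assumptions rather than the paper's configuration.

```python
import torch
import torch.nn as nn

B, T, N, D = 2, 8, 49, 64  # batch, frames, patch tokens per frame, dim
spatial = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(d_model=D, nhead=4, batch_first=True),
    num_layers=1)  # attends within each frame
temporal = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(d_model=D, nhead=4, batch_first=True),
    num_layers=1)  # attends across frames
head = nn.Linear(D, 3)  # classification head (class count assumed)

tokens = torch.randn(B * T, N, D)                        # patch tokens per frame
frame_feats = spatial(tokens).mean(dim=1).view(B, T, D)  # spatial encode + pool
clip_feat = temporal(frame_feats).mean(dim=1)            # temporal aggregation
print(head(clip_feat).shape)                             # torch.Size([2, 3])
```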
arXiv Detail & Related papers (2021-12-10T22:10:31Z)
- Pose-Controllable Talking Face Generation by Implicitly Modularized Audio-Visual Representation [96.66010515343106]
We propose a clean yet effective framework to generate pose-controllable talking faces.
We operate on raw face images, using only a single photo as an identity reference.
Our model has multiple advanced capabilities including extreme view robustness and talking face frontalization.
arXiv Detail & Related papers (2021-04-22T15:10:26Z)
- Shape My Face: Registering 3D Face Scans by Surface-to-Surface Translation [75.59415852802958]
Shape-My-Face (SMF) is a powerful encoder-decoder architecture based on an improved point cloud encoder, a novel visual attention mechanism, graph convolutional decoders with skip connections, and a specialized mouth model.
Our model provides topologically-sound meshes with minimal supervision, offers faster training time, has orders of magnitude fewer trainable parameters, is more robust to noise, and can generalize to previously unseen datasets.
arXiv Detail & Related papers (2020-12-16T20:02:36Z)