MPT: Motion Prompt Tuning for Micro-Expression Recognition
- URL: http://arxiv.org/abs/2508.09446v1
- Date: Wed, 13 Aug 2025 02:57:43 GMT
- Title: MPT: Motion Prompt Tuning for Micro-Expression Recognition
- Authors: Jiateng Liu, Hengcan Shi, Feng Chen, Zhiwen Shao, Yaonan Wang, Jianfei Cai, Wenming Zheng
- Abstract summary: This paper introduces Motion Prompt Tuning (MPT) as a novel approach to adapting pre-trained models for micro-expression recognition (MER). MPT represents a pioneering method for subtle motion prompt tuning. In particular, we introduce motion prompt generation, including motion magnification and Gaussian tokenization, to extract subtle motions as prompts for LMs. Extensive experiments conducted on three widely used MER datasets demonstrate that our proposed MPT consistently surpasses state-of-the-art approaches.
- Score: 47.62949098749473
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Micro-expression recognition (MER) is crucial in the affective computing field due to its wide application in medical diagnosis, lie detection, and criminal investigation. Despite its significance, obtaining micro-expression (ME) annotations is challenging due to the expertise required from psychological professionals. Consequently, ME datasets often suffer from a scarcity of training samples, severely constraining the learning of MER models. While current large pre-trained models (LMs) offer general and discriminative representations, their direct application to MER is hindered by an inability to capture the transitory and subtle facial movements that are essential for effective MER. This paper introduces Motion Prompt Tuning (MPT) as a novel approach to adapting LMs for MER, representing a pioneering method for subtle motion prompt tuning. In particular, we introduce motion prompt generation, including motion magnification and Gaussian tokenization, to extract subtle motions as prompts for LMs. Additionally, a group adapter is carefully designed and inserted into the LM to adapt it to the target MER domain, facilitating a more nuanced distinction of ME representations. Extensive experiments conducted on three widely used MER datasets demonstrate that the proposed MPT consistently surpasses state-of-the-art approaches, verifying its effectiveness.
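The abstract does not spell out implementation details, so the following is only a minimal sketch of how motion prompt generation could look, assuming a simple linear magnification of onset-apex frame differences and Gaussian-weighted pooling into prompt tokens. The module name `GaussianTokenizer`, the `amplification` factor, the number of prompts, and the 14x14 grid are illustrative assumptions, not the authors' code.

```python
# Sketch of "motion prompt generation": magnify subtle frame-to-frame motion,
# then turn it into prompt tokens for a frozen pre-trained backbone.
# Everything below (names, shapes, hyperparameters) is a hypothetical stand-in.
import torch
import torch.nn as nn


class GaussianTokenizer(nn.Module):
    """Pools a motion map into K prompt tokens via fixed Gaussian spatial windows."""

    def __init__(self, num_prompts: int = 4, embed_dim: int = 768, grid: int = 14):
        super().__init__()
        self.proj = nn.Linear(grid * grid, embed_dim)
        # One Gaussian spatial weighting per prompt token (arbitrary centers).
        centers = torch.rand(num_prompts, 2)  # normalized (y, x) centers
        ys, xs = torch.meshgrid(
            torch.linspace(0, 1, grid), torch.linspace(0, 1, grid), indexing="ij"
        )
        dist2 = (ys[None] - centers[:, 0, None, None]) ** 2 + \
                (xs[None] - centers[:, 1, None, None]) ** 2
        self.register_buffer("weights", torch.exp(-dist2 / (2 * 0.15 ** 2)))

    def forward(self, motion: torch.Tensor) -> torch.Tensor:
        # motion: (B, grid, grid) magnified motion energy
        weighted = motion[:, None] * self.weights[None]   # (B, K, grid, grid)
        return self.proj(weighted.flatten(2))             # (B, K, embed_dim)


def motion_prompts(onset, apex, tokenizer, amplification: float = 8.0):
    """onset/apex: (B, 1, H, W) grayscale frames -> (B, K, embed_dim) prompts."""
    diff = apex - onset                                   # subtle motion signal
    magnified = amplification * diff                      # linear magnification stand-in
    pooled = torch.nn.functional.adaptive_avg_pool2d(magnified.abs(), 14)
    return tokenizer(pooled.squeeze(1))
```

In practice the resulting prompt tokens would be prepended to the LM's patch tokens; how the paper actually injects them (and how its group adapter is built) is not described in the abstract.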
Related papers
- Improving Micro-Expression Recognition with Phase-Aware Temporal Augmentation [0.0]
Micro-expressions (MEs) are brief, involuntary facial movements that reveal genuine emotions, typically lasting less than half a second. Deep learning has enabled significant advances in micro-expression recognition (MER), but its effectiveness is limited by the scarcity of annotated ME datasets. This paper proposes a phase-aware temporal augmentation method based on dynamic images.
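The abstract only names the dynamic-image basis of the augmentation; below is a short sketch of the standard dynamic-image computation (approximate rank pooling over running frame means, with weights 2t - T - 1). The phase-aware augmentation itself is not described in the abstract and is not reproduced here.

```python
# Standard dynamic-image computation that such an augmentation builds on;
# a sketch, not the paper's code.
import numpy as np


def dynamic_image(frames: np.ndarray) -> np.ndarray:
    """frames: (T, H, W) or (T, H, W, C) float array -> single dynamic image."""
    T = frames.shape[0]
    extra = (1,) * (frames.ndim - 1)
    # Running mean V_t of the first t frames.
    running_means = np.cumsum(frames, axis=0) / np.arange(1, T + 1).reshape((-1,) + extra)
    alphas = (2 * np.arange(1, T + 1) - T - 1).reshape((-1,) + extra)  # 2t - T - 1
    d = (alphas * running_means).sum(axis=0)
    # Rescale to [0, 255] for visualization / CNN input.
    d = 255 * (d - d.min()) / (d.max() - d.min() + 1e-8)
    return d.astype(np.uint8)
```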
arXiv Detail & Related papers (2025-10-17T09:20:51Z) - Multi-stage Prompt Refinement for Mitigating Hallucinations in Large Language Models [49.435669307386156]
Multi-stage Prompt Refinement (MPR) is a framework designed to systematically improve ill-formed prompts across multiple stages. MPR iteratively enhances the clarity of prompts with additional context and employs a self-reflection mechanism with ranking to prioritize the most relevant input. Results on hallucination benchmarks show that prompts refined by MPR achieve over an 85% win rate compared to their original forms.
arXiv Detail & Related papers (2025-10-14T00:31:36Z) - AU-LLM: Micro-Expression Action Unit Detection via Enhanced LLM-Based Feature Fusion [26.058143518505805]
This paper introduces AU-LLM, a novel framework that uses Large Language Models to detect micro-expression Action Units (AUs) in micro-expression datasets, which are characterized by subtle intensities and scarce data. We specifically address the critical vision-language semantic gap with the Enhanced Fusion Projector (EFP). The EFP employs a Multi-Layer Perceptron (MLP) to intelligently fuse mid-level (local texture) and high-level (global semantics) visual features from a specialized 3D-CNN backbone into a single, information-dense token.
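As a rough illustration of an MLP fusion projector in this spirit, the sketch below pools mid- and high-level 3D-CNN feature maps and projects their concatenation into a single token of the LLM's width. Feature dimensions, the pooling choice, and layer widths are assumptions, not the paper's configuration.

```python
# Minimal MLP fusion projector: two feature levels -> one visual token.
import torch
import torch.nn as nn


class FusionProjector(nn.Module):
    def __init__(self, mid_dim=256, high_dim=512, llm_dim=4096):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(mid_dim + high_dim, 1024),
            nn.GELU(),
            nn.Linear(1024, llm_dim),
        )

    def forward(self, mid_feat, high_feat):
        # mid_feat:  (B, mid_dim,  T, H, W)   local-texture features
        # high_feat: (B, high_dim, T', H', W') global-semantic features
        mid = mid_feat.mean(dim=(2, 3, 4))     # global average pooling
        high = high_feat.mean(dim=(2, 3, 4))
        token = self.mlp(torch.cat([mid, high], dim=-1))
        return token.unsqueeze(1)              # (B, 1, llm_dim): one visual token
```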
arXiv Detail & Related papers (2025-07-29T13:01:59Z) - MELLM: Exploring LLM-Powered Micro-Expression Understanding Enhanced by Subtle Motion Perception [47.80768014770871]
We propose a novel Micro-Expression Large Language Model (MELLM). It incorporates a subtle facial motion perception strategy with the strong inference capabilities of MLLMs. Our model exhibits superior robustness and generalization capabilities in micro-expression understanding (MEU).
arXiv Detail & Related papers (2025-05-11T15:08:23Z) - Sensitivity Meets Sparsity: The Impact of Extremely Sparse Parameter Patterns on Theory-of-Mind of Large Language Models [55.46269953415811]
We identify ToM-sensitive parameters and show that perturbing as little as 0.001% of these parameters significantly degrades ToM performance. Our results have implications for enhancing model alignment, mitigating biases, and improving AI systems designed for human interaction.
arXiv Detail & Related papers (2025-04-05T17:45:42Z) - AMMSM: Adaptive Motion Magnification and Sparse Mamba for Micro-Expression Recognition [7.084377962617903]
We propose a multi-task learning framework named Adaptive Motion Magnification and Sparse Mamba (AMMSM). The framework aims to capture micro-expressions more accurately through self-supervised subtle motion magnification. We employ evolutionary search to optimize the magnification factor and the sparsity ratios of spatial selection, followed by fine-tuning to improve performance further.
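The abstract mentions evolutionary search over the magnification factor and sparsity ratio but gives no algorithmic details; the toy loop below shows one plausible form of such a search. The `evaluate` callback (which would stand in for training and validating the actual model), the population size, mutation scale, and bounds are all hypothetical.

```python
# Toy evolutionary search over (magnification_factor, sparsity_ratio).
import random


def evolutionary_search(evaluate, generations=10, pop_size=8):
    """evaluate(candidate) -> validation score; candidate = (magnification, sparsity)."""
    population = [(random.uniform(2.0, 16.0), random.uniform(0.1, 0.9))
                  for _ in range(pop_size)]
    for _ in range(generations):
        scored = sorted(population, key=evaluate, reverse=True)
        parents = scored[: pop_size // 2]                 # keep the best half
        children = []
        for mag, ratio in parents:
            children.append((
                min(16.0, max(2.0, mag + random.gauss(0, 1.0))),     # mutate magnification
                min(0.9, max(0.1, ratio + random.gauss(0, 0.05))),   # mutate sparsity
            ))
        population = parents + children
    return max(population, key=evaluate)
```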
arXiv Detail & Related papers (2025-03-31T13:17:43Z) - Three-Stream Temporal-Shift Attention Network Based on Self-Knowledge Distillation for Micro-Expression Recognition [21.675660978188617]
Micro-expression recognition is crucial in many fields, including criminal analysis and psychotherapy. A three-stream temporal-shift attention network based on self-knowledge distillation is proposed in this paper.
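The abstract does not define the temporal-shift operation it builds on; the snippet below shows a generic temporal shift in the style of TSM (a fraction of channels shifted one step forward and backward in time), which is one plausible reading. The (B, T, C, H, W) layout and the 1/8 shift ratio are assumptions.

```python
# Generic parameter-free temporal shift over clip features; a sketch only.
import torch


def temporal_shift(x: torch.Tensor, shift_ratio: float = 0.125) -> torch.Tensor:
    """x: (B, T, C, H, W) clip features."""
    b, t, c, h, w = x.shape
    fold = int(c * shift_ratio)
    out = torch.zeros_like(x)
    out[:, 1:, :fold] = x[:, :-1, :fold]                   # shift forward in time
    out[:, :-1, fold:2 * fold] = x[:, 1:, fold:2 * fold]   # shift backward in time
    out[:, :, 2 * fold:] = x[:, :, 2 * fold:]              # remaining channels unchanged
    return out
```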
arXiv Detail & Related papers (2024-06-25T13:22:22Z) - Adaptive Temporal Motion Guided Graph Convolution Network for Micro-expression Recognition [48.21696443824074]
We propose a novel framework for micro-expression recognition, named the Adaptive Temporal Motion Guided Graph Convolution Network (ATM-GCN).
Our framework excels at capturing temporal dependencies between frames across the entire clip, thereby enhancing micro-expression recognition at the clip level.
arXiv Detail & Related papers (2024-06-13T10:57:24Z) - MMA-DFER: MultiModal Adaptation of unimodal models for Dynamic Facial Expression Recognition in-the-wild [81.32127423981426]
Multimodal emotion recognition based on audio and video data is important for real-world applications.
Recent methods have focused on exploiting advances of self-supervised learning (SSL) for pre-training of strong multimodal encoders.
We propose a different perspective on the problem and investigate the advancement of multimodal DFER performance by adapting SSL-pre-trained disjoint unimodal encoders.
arXiv Detail & Related papers (2024-04-13T13:39:26Z) - Feature Representation Learning with Adaptive Displacement Generation and Transformer Fusion for Micro-Expression Recognition [18.6490971645882]
Micro-expressions are spontaneous, rapid and subtle facial movements that can neither be forged nor suppressed.
We propose a novel framework, Feature Representation Learning with adaptive Displacement Generation and Transformer fusion (FRL-DGT).
Experiments under the leave-one-subject-out (LOSO) evaluation protocol demonstrate the superiority of the proposed FRL-DGT over state-of-the-art methods.
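For readers unfamiliar with the LOSO protocol referenced here, the sketch below shows how it is typically run: each subject is held out once while the model is trained on all others, and per-fold scores are averaged. `train_model` and `test_model` are hypothetical placeholders, not functions from the paper.

```python
# Leave-one-subject-out (LOSO) evaluation loop; a generic sketch.
from sklearn.model_selection import LeaveOneGroupOut


def loso_evaluate(samples, labels, subject_ids, train_model, test_model):
    logo = LeaveOneGroupOut()
    scores = []
    for train_idx, test_idx in logo.split(samples, labels, groups=subject_ids):
        model = train_model([samples[i] for i in train_idx],
                            [labels[i] for i in train_idx])
        scores.append(test_model(model,
                                 [samples[i] for i in test_idx],
                                 [labels[i] for i in test_idx]))
    return sum(scores) / len(scores)   # mean accuracy across held-out subjects
```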
arXiv Detail & Related papers (2023-04-10T07:03:36Z) - DFME: A New Benchmark for Dynamic Facial Micro-expression Recognition [51.26943074578153]
A micro-expression (ME) is a spontaneous, subtle, and transient facial expression that reveals a person's genuine emotion. ME data scarcity has severely hindered the development of advanced data-driven MER models. In this paper, we address the ME data scarcity problem by collecting and annotating a dynamic spontaneous ME database.
arXiv Detail & Related papers (2023-01-03T07:33:33Z)