Related papers: Micron-BERT: BERT-based Facial Micro-Expression Recognition

URL: http://arxiv.org/abs/2304.03195v1
Date: Thu, 6 Apr 2023 16:19:09 GMT
Title: Micron-BERT: BERT-based Facial Micro-Expression Recognition
Authors: Xuan-Bac Nguyen, Chi Nhan Duong, Xin Li, Susan Gauch, Han-Seok Seo, Khoa Luu
Abstract summary: Micron-BERT ($\mu$-BERT) is a novel approach to facial micro-expression recognition. The proposed method can automatically capture these movements in an unsupervised manner. $\mu$-BERT consistently outperforms state-of-the-art methods on four micro-expression benchmarks.
Score: 15.367299107839418
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Micro-expression recognition is one of the most challenging topics in affective computing. It aims to recognize tiny facial movements difficult for humans to perceive in a brief period, i.e., 0.25 to 0.5 seconds. Recent advances in pre-training deep Bidirectional Transformers (BERT) have significantly improved self-supervised learning tasks in computer vision. However, the standard BERT in vision problems is designed to learn only from full images or videos, and the architecture cannot accurately detect details of facial micro-expressions. This paper presents Micron-BERT ($\mu$-BERT), a novel approach to facial micro-expression recognition. The proposed method can automatically capture these movements in an unsupervised manner based on two key ideas. First, we employ Diagonal Micro-Attention (DMA) to detect tiny differences between two frames. Second, we introduce a new Patch of Interest (PoI) module to localize and highlight micro-expression interest regions and simultaneously reduce noisy backgrounds and distractions. By incorporating these components into an end-to-end deep network, the proposed $\mu$-BERT significantly outperforms all previous work in various micro-expression tasks. $\mu$-BERT can be trained on a large-scale unlabeled dataset, i.e., up to 8 million images, and achieves high accuracy on new unseen facial micro-expression datasets. Empirical experiments show $\mu$-BERT consistently outperforms state-of-the-art methods on four micro-expression benchmarks, including SAMM, CASME II, SMIC, and CASME3, by significant margins. Code will be available at https://github.com/uark-cviu/Micron-BERT
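The abstract's two ideas (comparing two frames for tiny differences, then keeping only the regions of interest) can be illustrated with a deliberately simple sketch. This is not the paper's Diagonal Micro-Attention or Patch of Interest module, which are learned components inside a transformer; it is a hand-written stand-in that scores non-overlapping patches by mean absolute pixel difference and keeps the top-k. All function names, the patch size, and the toy frames are assumptions made for illustration.

```python
# Illustrative sketch only: NOT the paper's DMA or PoI modules.
# It ranks image patches by how much they change between two frames.

def split_into_patches(frame, patch):
    """Split a 2D list (H x W grayscale frame) into non-overlapping patch x patch blocks."""
    height, width = len(frame), len(frame[0])
    patches = {}
    for top in range(0, height - patch + 1, patch):
        for left in range(0, width - patch + 1, patch):
            block = [frame[r][left:left + patch] for r in range(top, top + patch)]
            # Key each block by its (row, column) position in the patch grid.
            patches[(top // patch, left // patch)] = block
    return patches


def patch_change_scores(frame_a, frame_b, patch):
    """Mean absolute pixel difference per patch between two equally sized frames."""
    patches_a = split_into_patches(frame_a, patch)
    patches_b = split_into_patches(frame_b, patch)
    scores = {}
    for key, block_a in patches_a.items():
        block_b = patches_b[key]
        total = sum(
            abs(a - b)
            for row_a, row_b in zip(block_a, block_b)
            for a, b in zip(row_a, row_b)
        )
        scores[key] = total / (patch * patch)
    return scores


def top_changed_patches(frame_a, frame_b, patch, k):
    """Return the grid positions of the k patches with the largest change."""
    scores = patch_change_scores(frame_a, frame_b, patch)
    return sorted(scores, key=lambda key: scores[key], reverse=True)[:k]


# Toy data (hypothetical): two 4x4 frames that differ only in the lower-right patch.
frame_a = [[0.0] * 4 for _ in range(4)]
frame_b = [row[:] for row in frame_a]
frame_b[2][2] = 0.2
frame_b[3][3] = 0.2

most_changed = top_changed_patches(frame_a, frame_b, patch=2, k=1)
print(most_changed)
```

In this toy case only the lower-right patch receives a non-zero score, so it is the one selected. A learned model would replace the fixed pixel difference with attention over patch embeddings, but the select-the-changing-regions intuition is the same.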

Related papers

CAMERA: Multi-Matrix Joint Compression for MoE Models via Micro-Expert Redundancy Analysis [51.27304044745634]
Large Language Models with Mixture-of-Experts (MoE) suffer from substantial computational and storage overheads. We introduce micro-expert as a finer-grained compression unit that spans across matrices. We propose CAMERA-P, a structured micro-expert pruning framework, and CAMERA-Q, a mixed-precision quantization idea designed for micro-experts.
arXiv Detail & Related papers (2025-08-04T11:42:48Z)
MEGC2025: Micro-Expression Grand Challenge on Spot Then Recognize and Visual Question Answering [55.30507585676142]
Facial micro-expressions (MEs) are involuntary movements of the face that occur spontaneously when a person experiences an emotion. In recent years, substantial advancements have been made in the areas of ME recognition, spotting, and generation. The ME grand challenge (MEGC) 2025 introduces two tasks that reflect these evolving research directions.
arXiv Detail & Related papers (2025-06-18T09:29:51Z)
Adaptive Temporal Motion Guided Graph Convolution Network for Micro-expression Recognition [48.21696443824074]
We propose a novel framework for micro-expression recognition, named the Adaptive Temporal Motion Guided Graph Convolution Network (ATM-GCN). Our framework excels at capturing temporal dependencies between frames across the entire clip, thereby enhancing micro-expression recognition at the clip level.
arXiv Detail & Related papers (2024-06-13T10:57:24Z)
From Macro to Micro: Boosting micro-expression recognition via pre-training on macro-expression videos [9.472210792839023]
Micro-expression recognition (MER) has drawn increasing attention in recent years due to its potential applications in intelligent medical and lie detection. We propose a generalized transfer learning paradigm, called MAcro-expression TO MIcro-expression (MA2MI). Under our paradigm, networks can learn the ability to represent subtle facial movement by reconstructing future frames.
arXiv Detail & Related papers (2024-05-26T06:42:06Z)
Edge-MoE: Memory-Efficient Multi-Task Vision Transformer Architecture with Task-level Sparsity via Mixture-of-Experts [60.1586169973792]
M$^3$ViT is the latest multi-task ViT model that introduces mixture-of-experts (MoE). MoE achieves better accuracy and over 80% computation reduction but leaves challenges for efficient deployment on FPGA. Our work, dubbed Edge-MoE, solves the challenges to introduce the first end-to-end FPGA accelerator for multi-task ViT with a collection of architectural innovations.
arXiv Detail & Related papers (2023-05-30T02:24:03Z)
Micro-Expression Recognition Based on Attribute Information Embedding and Cross-modal Contrastive Learning [22.525295392858293]
We propose a micro-expression recognition method based on attribute information embedding and cross-modal contrastive learning. We conduct extensive experiments on the CASME II and MMEW databases, achieving accuracies of 77.82% and 71.04%, respectively.
arXiv Detail & Related papers (2022-05-29T12:28:10Z)
Video-based Facial Micro-Expression Analysis: A Survey of Datasets, Features and Algorithms [52.58031087639394]
Micro-expressions are involuntary and transient facial expressions. They can provide important information in a broad range of applications such as lie detection, criminal detection, etc. Since micro-expressions are transient and of low intensity, their detection and recognition are difficult and rely heavily on expert experience.
arXiv Detail & Related papers (2022-01-30T05:14:13Z)
Short and Long Range Relation Based Spatio-Temporal Transformer for Micro-Expression Recognition [61.374467942519374]
We propose a novel spatio-temporal transformer architecture -- to the best of our knowledge, the first purely transformer based approach for micro-expression recognition. The architecture comprises a spatial encoder which learns spatial patterns, a temporal aggregator for temporal dimension analysis, and a classification head. A comprehensive evaluation on three widely used spontaneous micro-expression data sets shows that the proposed approach consistently outperforms the state of the art.
arXiv Detail & Related papers (2021-12-10T22:10:31Z)
Action Units That Constitute Trainable Micro-expressions (and A Large-scale Synthetic Dataset) [20.866448615388876]
We aim to develop a protocol to automatically synthesize micro-expression training data on a large scale. Specifically, we discover three types of Action Units (AUs) that can well constitute trainable micro-expressions. With these AUs, our protocol employs large numbers of face images with various identities and an existing face generation method for micro-expression synthesis. Micro-expression recognition models are trained on the generated micro-expression datasets and evaluated on real-world test sets.
arXiv Detail & Related papers (2021-12-03T06:09:06Z)
Micro-expression spotting: A new benchmark [74.69928316848866]
Micro-expressions (MEs) are brief and involuntary facial expressions that occur when people are trying to hide their true feelings or conceal their emotions. In the computer vision field, the study of MEs can be divided into two main tasks, spotting and recognition. This paper introduces an extension of the SMIC-E database, namely the SMIC-E-Long database, which is a new challenging benchmark for ME spotting.
arXiv Detail & Related papers (2020-07-24T09:18:41Z)
Predicting the Popularity of Micro-videos with Multimodal Variational Encoder-Decoder Framework [54.194340961353944]
We propose a multimodal variational encoder-decoder (MMVED) framework for micro-video popularity prediction tasks. MMVED learns a prediction embedding of a micro-video that is informative to its popularity level. Experiments conducted on a public dataset and a dataset we collect from Xigua demonstrate the effectiveness of the proposed MMVED framework.
arXiv Detail & Related papers (2020-03-28T06:08:16Z)

This list is automatically generated from the titles and abstracts of the papers on this site.