Learning to Rank Onset-Occurring-Offset Representations for
Micro-Expression Recognition
- URL: http://arxiv.org/abs/2310.04664v1
- Date: Sat, 7 Oct 2023 03:09:53 GMT
- Title: Learning to Rank Onset-Occurring-Offset Representations for
Micro-Expression Recognition
- Authors: Jie Zhu, Yuan Zong, Jingang Shi, Cheng Lu, Hongli Chang, Wenming Zheng
- Abstract summary: This paper focuses on micro-expression recognition (MER).
It proposes a flexible and reliable deep learning method called learning to rank onset-occurring-offset representations (LTR3O).
- Score: 24.75382410411772
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This paper focuses on the research of micro-expression recognition (MER) and
proposes a flexible and reliable deep learning method called learning to rank
onset-occurring-offset representations (LTR3O). The LTR3O method introduces a
dynamic and reduced-size sequence structure known as 3O, which consists of
onset, occurring, and offset frames, for representing micro-expressions (MEs).
This structure facilitates the subsequent learning of ME-discriminative
features. A noteworthy advantage of the 3O structure is its flexibility, as the
occurring frame is randomly extracted from the original ME sequence without the
need for accurate frame spotting methods. Based on the 3O structures, LTR3O
generates multiple 3O representation candidates for each ME sample and
incorporates well-designed modules to measure and calibrate their emotional
expressiveness. This calibration process ensures that the distribution of these
candidates aligns with that of macro-expressions (MaMs) over time.
Consequently, the visibility of MEs can be implicitly enhanced, facilitating
the reliable learning of more discriminative features for MER. Extensive
experiments were conducted to evaluate the performance of LTR3O using three
widely-used ME databases: CASME II, SMIC, and SAMM. The experimental results
demonstrate the effectiveness and superior performance of LTR3O, particularly
in terms of its flexibility and reliability, when compared to recent
state-of-the-art MER methods.
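The 3O structure described in the abstract can be pictured as a triple of frame indices. The following is only a minimal sketch of how several 3O candidates might be drawn for one sample, not the authors' implementation. It assumes the clip is already trimmed so that its first frame is the onset and its last frame is the offset, and that the occurring frame is drawn uniformly from the frames strictly between them. The function name `sample_3o_candidates` and its parameters are hypothetical.

```python
import random


def sample_3o_candidates(num_frames, num_candidates, seed=None):
    """Return (onset, occurring, offset) index triples for one ME clip.

    Assumptions (not stated in the abstract): the clip is trimmed so that
    frame 0 is the onset and the last frame is the offset, and the occurring
    frame is sampled uniformly from the frames strictly in between.
    """
    if num_frames < 3:
        raise ValueError("a 3O structure needs at least 3 frames")
    rng = random.Random(seed)
    onset, offset = 0, num_frames - 1
    candidates = []
    for _ in range(num_candidates):
        # randint is inclusive on both ends, so this excludes onset and offset.
        occurring = rng.randint(onset + 1, offset - 1)
        candidates.append((onset, occurring, offset))
    return candidates


if __name__ == "__main__":
    # Draw five candidate 3O structures from a hypothetical 30-frame clip.
    for triple in sample_3o_candidates(num_frames=30, num_candidates=5, seed=0):
        print(triple)
```

Because no frame spotting is involved, each call only needs the clip length. Measuring and calibrating the emotional expressiveness of the candidates, as the abstract describes, would happen in later modules that this sketch does not cover.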
Related papers
- Multi-threshold Deep Metric Learning for Facial Expression Recognition [60.26967776920412]
We present the multi-threshold deep metric learning technique, which avoids the difficult threshold validation.
We find that each threshold of the triplet loss intrinsically determines a distinctive distribution of inter-class variations.
It makes the embedding layer, which is composed of a set of slices, a more informative and discriminative feature.
arXiv Detail & Related papers (2024-06-24T08:27:31Z) - Continual Referring Expression Comprehension via Dual Modular
Memorization [133.46886428655426]
Referring Expression Comprehension (REC) aims to localize an image region of a given object described by a natural-language expression.
Existing REC algorithms make a strong assumption that the training data fed into a model are given upfront, which degrades their practicality in real-world scenarios.
In this paper, we propose Continual Referring Expression Comprehension (CREC), a new setting for REC in which a model learns on a stream of incoming tasks.
To continuously improve the model on sequential tasks without forgetting previously learned knowledge and without repeatedly re-training from scratch, we propose an effective baseline method named Dual Modular Memorization.
arXiv Detail & Related papers (2023-11-25T02:58:51Z) - Extending Multi-modal Contrastive Representations [53.923340739349314]
Multimodal contrastive representation (MCR) of more than three modalities is critical in multi-modal learning.
Inspired by the recent C-MCR, this paper proposes Extending Multimodal Contrastive Representation (Ex-MCR).
Ex-MCR is a training-efficient and paired-data-free method to flexibly learn a unified contrastive representation space for more than three modalities.
arXiv Detail & Related papers (2023-10-13T06:34:23Z) - Feature Representation Learning with Adaptive Displacement Generation
and Transformer Fusion for Micro-Expression Recognition [18.6490971645882]
Micro-expressions are spontaneous, rapid and subtle facial movements that can neither be forged nor suppressed.
We propose a novel framework, Feature Representation Learning with adaptive Displacement Generation and Transformer fusion (FRL-DGT).
Experiments with solid leave-one-subject-out (LOSO) evaluation results have demonstrated the superiority of our proposed FRL-DGT to state-of-the-art methods.
arXiv Detail & Related papers (2023-04-10T07:03:36Z) - Effective and Stable Role-Based Multi-Agent Collaboration by Structural
Information Principles [24.49065333729887]
We propose a mathematical Structural Information principles-based Role Discovery method, namely SIRD, for role discovery.
We then present a SIRD optimizing Multi-Agent Reinforcement Learning framework, namely SR-MARL, for multi-agent collaboration.
Specifically, the SIRD consists of structuralization, sparsification, and optimization modules, where an optimal encoding tree is generated to perform abstracting to discover roles.
arXiv Detail & Related papers (2023-04-03T07:13:44Z) - MV-JAR: Masked Voxel Jigsaw and Reconstruction for LiDAR-Based
Self-Supervised Pre-Training [58.07391711548269]
Masked Voxel Jigsaw and Reconstruction (MV-JAR) method for LiDAR-based self-supervised pre-training.
arXiv Detail & Related papers (2023-03-23T17:59:02Z) - Short and Long Range Relation Based Spatio-Temporal Transformer for
Micro-Expression Recognition [61.374467942519374]
We propose a novel spatio-temporal transformer architecture which is, to the best of our knowledge, the first purely transformer-based approach for micro-expression recognition.
The architecture comprises a spatial encoder which learns spatial patterns, a temporal aggregator for temporal dimension analysis, and a classification head.
A comprehensive evaluation on three widely used spontaneous micro-expression data sets shows that the proposed approach consistently outperforms the state of the art.
arXiv Detail & Related papers (2021-12-10T22:10:31Z) - MMD-ReID: A Simple but Effective Solution for Visible-Thermal Person
ReID [20.08880264104061]
We propose a simple but effective framework, MMD-ReID, that reduces the modality gap by an explicit discrepancy reduction constraint.
We conduct extensive experiments to demonstrate both qualitatively and quantitatively the effectiveness of MMD-ReID.
The proposed framework significantly outperforms the state-of-the-art methods on SYSU-MM01 and RegDB datasets.
arXiv Detail & Related papers (2021-11-09T11:33:32Z) - Data-Driven Learning of 3-Point Correlation Functions as Microstructure
Representations [8.978973486638253]
We show that a variety of microstructures can be characterized by a concise subset of three-point correlations.
The proposed representation can directly be used to compute material properties based on the effective medium theory.
arXiv Detail & Related papers (2021-09-06T06:15:57Z) - Inter-class Discrepancy Alignment for Face Recognition [55.578063356210144]
We propose a unified framework called Inter-class Discrepancy Alignment (IDA).
IDA-DAO is used to align the similarity scores considering the discrepancy between each image and its neighbors.
IDA-SSE can provide convincing inter-class neighbors by introducing virtual candidate images generated with GAN.
arXiv Detail & Related papers (2021-03-02T08:20:08Z)
This list is automatically generated from the titles and abstracts of the papers in this site.