Related papers: PhysioSync: Temporal and Cross-Modal Contrastive Learning Inspired by Physiological Synchronization for EEG-Based Emotion Recognition

URL: http://arxiv.org/abs/2504.17163v1
Date: Thu, 24 Apr 2025 00:48:03 GMT
Title: PhysioSync: Temporal and Cross-Modal Contrastive Learning Inspired by Physiological Synchronization for EEG-Based Emotion Recognition
Authors: Kai Cui, Jia Li, Yu Liu, Xuesong Zhang, Zhenzhen Hu, Meng Wang
Abstract summary: We propose PhysioSync, a novel pre-training framework leveraging temporal and cross-modal contrastive learning. After pre-training, cross-resolution and cross-modal features are hierarchically fused and fine-tuned to enhance emotion recognition. Experiments on DEAP and DREAMER datasets demonstrate PhysioSync's advanced performance under uni-modal and cross-modal conditions.
Score: 26.384133051131133
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Electroencephalography (EEG) signals provide a promising and involuntary reflection of brain activity related to emotional states, offering significant advantages over behavioral cues like facial expressions. However, EEG signals are often noisy, affected by artifacts, and vary across individuals, complicating emotion recognition. While multimodal approaches have used Peripheral Physiological Signals (PPS) like GSR to complement EEG, they often overlook the dynamic synchronization and consistent semantics between the modalities. Additionally, the temporal dynamics of emotional fluctuations across different time resolutions in PPS remain underexplored. To address these challenges, we propose PhysioSync, a novel pre-training framework leveraging temporal and cross-modal contrastive learning, inspired by physiological synchronization phenomena. PhysioSync incorporates Cross-Modal Consistency Alignment (CM-CA) to model dynamic relationships between EEG and complementary PPS, enabling emotion-related synchronizations across modalities. Besides, it introduces Long- and Short-Term Temporal Contrastive Learning (LS-TCL) to capture emotional synchronization at different temporal resolutions within modalities. After pre-training, cross-resolution and cross-modal features are hierarchically fused and fine-tuned to enhance emotion recognition. Experiments on DEAP and DREAMER datasets demonstrate PhysioSync's advanced performance under uni-modal and cross-modal conditions, highlighting its effectiveness for EEG-centered emotion recognition.

Related papers

CAST-Phys: Contactless Affective States Through Physiological signals Database [74.28082880875368]
The lack of affective multi-modal datasets remains a major bottleneck in developing accurate emotion recognition systems. We present the Contactless Affective States Through Physiological Signals Database (CAST-Phys), a novel high-quality dataset capable of remote physiological emotion recognition. Our analysis highlights the crucial role of physiological signals in realistic scenarios where facial expressions alone may not provide sufficient emotional information.
arXiv Detail & Related papers (2025-07-08T15:20:24Z)
FreqDGT: Frequency-Adaptive Dynamic Graph Networks with Transformer for Cross-subject EEG Emotion Recognition [1.9198890060313585]
Cross-subject generalization is a challenge due to individual variability, cognitive traits, and emotional responses. We propose FreqDGT, a frequency-adaptive dynamic graph transformer that addresses these limitations through an integrated framework. FreqDGT significantly improves cross-subject emotion recognition accuracy, confirming the effectiveness of integrating frequency-adaptive, spatial-dynamic, and temporal-hierarchical modeling.
arXiv Detail & Related papers (2025-06-28T08:18:05Z)
PhysLLM: Harnessing Large Language Models for Cross-Modal Remote Physiological Sensing [49.243031514520794]
Large Language Models (LLMs) excel at capturing long-range signals due to their text-centric design. PhysLLM achieves state-of-the-art accuracy and robustness, demonstrating superior generalization across lighting variations and motion scenarios.
arXiv Detail & Related papers (2025-05-06T15:18:38Z)
Disentangle Identity, Cooperate Emotion: Correlation-Aware Emotional Talking Portrait Generation [63.94836524433559]
DICE-Talk is a framework for disentangling identity with emotion and cooperating emotions with similar characteristics. First, we develop a disentangled emotion embedder that jointly models audio-visual emotional cues through cross-modal attention. Second, we introduce a correlation-enhanced emotion conditioning module with learnable Emotion Banks. Third, we design an emotion discrimination objective that enforces affective consistency during the diffusion process.
arXiv Detail & Related papers (2025-04-25T05:28:21Z)
Smile upon the Face but Sadness in the Eyes: Emotion Recognition based on Facial Expressions and Eye Behaviors [63.194053817609024]
We introduce eye behaviors as an important emotional cue for the creation of a new Eye-behavior-aided Multimodal Emotion Recognition dataset. For the first time, we provide annotations for both Emotion Recognition (ER) and Facial Expression Recognition (FER) in the EMER dataset. We specifically design a new EMERT architecture to concurrently enhance performance in both ER and FER.
arXiv Detail & Related papers (2024-11-08T04:53:55Z)
PHemoNet: A Multimodal Network for Physiological Signals [9.54382727022316]
We introduce PHemoNet, a fully hypercomplex network for multimodal emotion recognition from physiological signals. The architecture comprises modality-specific encoders and a fusion module. The proposed method outperforms current state-of-the-art models on the MAHNOB-HCI dataset.
arXiv Detail & Related papers (2024-09-13T21:14:27Z)
Multimodal Physiological Signals Representation Learning via Multiscale Contrasting for Depression Recognition [18.65975882665568]
Depression recognition based on physiological signals such as functional near-infrared spectroscopy (NIRS) and electroencephalogram (EEG) has made considerable progress. In this paper, we introduce a multimodal physiological signals representation learning framework using architecture via multiscale contrasting for depression recognition (MRLM). To enhance the learning of semantic representation associated with stimulation tasks, a semantic contrast module is proposed.
arXiv Detail & Related papers (2024-06-22T09:28:02Z)
Multi-modal Mood Reader: Pre-trained Model Empowers Cross-Subject Emotion Recognition [23.505616142198487]
We develop a Pre-trained model based Multimodal Mood Reader for cross-subject emotion recognition. The model learns universal latent representations of EEG signals through pre-training on a large-scale dataset. Extensive experiments on public datasets demonstrate Mood Reader's superior performance in cross-subject emotion recognition tasks.
arXiv Detail & Related papers (2024-05-28T14:31:11Z)
Interpretable Spatio-Temporal Embedding for Brain Structural-Effective Network with Ordinary Differential Equation [56.34634121544929]
In this study, we first construct the brain-effective network via the dynamic causal model. We then introduce an interpretable graph learning framework termed Spatio-Temporal Embedding ODE (STE-ODE). This framework incorporates specifically designed directed node embedding layers, aiming at capturing the dynamic interplay between structural and effective networks.
arXiv Detail & Related papers (2024-05-21T20:37:07Z)
Persistent-Transient Duality: A Multi-mechanism Approach for Modeling Human-Object Interaction [58.67761673662716]
Humans are highly adaptable, swiftly switching between different modes to handle different tasks, situations and contexts. In Human-object interaction (HOI) activities, these modes can be attributed to two mechanisms: (1) the large-scale consistent plan for the whole activity and (2) the small-scale children interactive actions that start and end along the timeline. This work proposes to model two concurrent mechanisms that jointly control human motion.
arXiv Detail & Related papers (2023-07-24T12:21:33Z)
fMRI from EEG is only Deep Learning away: the use of interpretable DL to unravel EEG-fMRI relationships [68.8204255655161]
We present an interpretable domain grounded solution to recover the activity of several subcortical regions from multichannel EEG data. We recover individual spatial and time-frequency patterns of scalp EEG predictive of the hemodynamic signal in the subcortical nuclei.
arXiv Detail & Related papers (2022-10-23T15:11:37Z)
Progressive Graph Convolution Network for EEG Emotion Recognition [35.08010382523394]
Studies in the area of neuroscience have revealed the relationship between emotional patterns and brain functional regions. In EEG emotion recognition, we can observe that clearer boundaries exist between coarse-grained emotions than those between fine-grained emotions. We propose a progressive graph convolution network (PGCN) for capturing this inherent characteristic in EEG emotional signals.
arXiv Detail & Related papers (2021-12-14T03:30:13Z)
Contrastive Learning of Subject-Invariant EEG Representations for Cross-Subject Emotion Recognition [9.07006689672858]
We propose a Contrastive Learning method for Inter-Subject Alignment (ISA) for reliable cross-subject emotion recognition. ISA involves maximizing the similarity in EEG signals across subjects when they received the same stimuli in contrast to different ones. A convolutional neural network with depthwise spatial convolution and temporal convolution layers was applied to learn inter-subject representations from raw EEG signals.
arXiv Detail & Related papers (2021-09-20T14:13:45Z)
Video-based Remote Physiological Measurement via Cross-verified Feature Disentangling [121.50704279659253]
We propose a cross-verified feature disentangling strategy to disentangle the physiological features from non-physiological representations. We then use the distilled physiological features for robust multi-task physiological measurements. The disentangled features are finally used for the joint prediction of multiple physiological signals like average HR values and rPPG signals.
arXiv Detail & Related papers (2020-07-16T09:39:17Z)

This list is automatically generated from the titles and abstracts of the papers on this site.