Spatial-Temporal Transformer with Curriculum Learning for EEG-Based Emotion Recognition
- URL: http://arxiv.org/abs/2507.14698v1
- Date: Sat, 19 Jul 2025 17:23:38 GMT
- Title: Spatial-Temporal Transformer with Curriculum Learning for EEG-Based Emotion Recognition
- Authors: Xuetao Lin, Tianhao Peng, Peihong Dai, Yu Liang, Wenjun Wu
- Abstract summary: SST-CL is a novel framework integrating spatial-temporal transformers with curriculum learning. An intensity-aware curriculum learning strategy guides training from high-intensity to low-intensity emotional states. Experiments on three benchmark datasets demonstrate state-of-the-art performance across various emotional intensity levels.
- Score: 2.847161275680418
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: EEG-based emotion recognition plays an important role in developing adaptive brain-computer communication systems, yet it faces two fundamental challenges in practical implementations: (1) effective integration of non-stationary spatial-temporal neural patterns, and (2) robust adaptation to dynamic emotional intensity variations in real-world scenarios. This paper proposes SST-CL, a novel framework integrating spatial-temporal transformers with curriculum learning. Our method introduces two core components: a spatial encoder that models inter-channel relationships and a temporal encoder that captures multi-scale dependencies through windowed attention mechanisms, enabling simultaneous extraction of spatial correlations and temporal dynamics from EEG signals. Complementing this architecture, an intensity-aware curriculum learning strategy progressively guides training from high-intensity to low-intensity emotional states through dynamic sample scheduling based on a dual difficulty assessment. Comprehensive experiments on three benchmark datasets demonstrate state-of-the-art performance across various emotional intensity levels, with ablation studies confirming the necessity of both architectural components and the curriculum learning mechanism.
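To make the described architecture concrete, here is a minimal PyTorch sketch under stated assumptions: 62 channels, 256 time steps, 3 emotion classes, and a softmax-based intensity weighting standing in for the dual difficulty assessment. None of these specifics come from the abstract, and `SSTSketch` is a hypothetical name, not the authors' code.

```python
# Illustrative sketch only; layer sizes, windowing, and the curriculum rule
# are assumptions, not the paper's implementation.
import torch
import torch.nn as nn

class SSTSketch(nn.Module):
    def __init__(self, n_channels=62, n_times=256, d_model=64, window=32, n_classes=3):
        super().__init__()
        self.window = window
        layer = lambda: nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.chan_embed = nn.Linear(n_times, d_model)             # one token per channel
        self.spatial = nn.TransformerEncoder(layer(), num_layers=2)
        self.win_embed = nn.Linear(n_channels * window, d_model)  # one token per window
        self.temporal = nn.TransformerEncoder(layer(), num_layers=2)
        self.head = nn.Linear(2 * d_model, n_classes)

    def forward(self, x):                                  # x: (B, C, T)
        s = self.spatial(self.chan_embed(x)).mean(dim=1)   # inter-channel attention
        B, C, T = x.shape
        w = x.unfold(-1, self.window, self.window)         # (B, C, n_win, window)
        w = w.permute(0, 2, 1, 3).reshape(B, -1, C * self.window)
        t = self.temporal(self.win_embed(w)).mean(dim=1)   # windowed attention
        return self.head(torch.cat([s, t], dim=-1))

def curriculum_loss(logits, labels, intensity, epoch, max_epochs):
    # Weight per-sample losses toward high-intensity samples early in training,
    # relaxing toward uniform weights later -- a stand-in for the paper's
    # dual difficulty assessment, whose exact form the abstract does not give.
    per_sample = nn.functional.cross_entropy(logits, labels, reduction="none")
    sharpness = 1.0 - epoch / max_epochs                   # decays toward 0
    weights = torch.softmax(intensity * 5.0 * sharpness, dim=0)
    return (weights * per_sample).sum()
```

In a training loop, `curriculum_loss` would replace plain cross-entropy, taking each batch's per-sample intensity scores.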
Related papers
- Decomposing the Entropy-Performance Exchange: The Missing Keys to Unlocking Effective Reinforcement Learning [106.68304931854038]
Reinforcement learning with verifiable rewards (RLVR) has been widely used for enhancing the reasoning abilities of large language models (LLMs). We conduct a systematic empirical analysis of the entropy-performance exchange mechanism of RLVR across different levels of granularity. Our analysis reveals that, in the rising stage, entropy reduction in negative samples facilitates the learning of effective reasoning patterns. In the plateau stage, learning efficiency strongly correlates with high-entropy tokens present in low-perplexity samples and those located at the end of sequences.
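As a generic illustration of the quantities this analysis tracks (not the paper's code), per-token policy entropy and per-sample perplexity can be computed from the model's logits:

```python
# Generic helpers for the entropy/perplexity quantities discussed above.
import torch
import torch.nn.functional as F

def token_entropy(logits):                  # logits: (seq_len, vocab_size)
    logp = F.log_softmax(logits, dim=-1)
    return -(logp.exp() * logp).sum(dim=-1) # (seq_len,) entropy per token

def sample_perplexity(logits, tokens):      # tokens: (seq_len,) generated ids
    nll = F.cross_entropy(logits, tokens, reduction="mean")
    return nll.exp()                        # scalar perplexity of the sample
```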
arXiv Detail & Related papers (2025-08-04T10:08:10Z) - Confidence-driven Gradient Modulation for Multimodal Human Activity Recognition: A Dynamic Contrastive Dual-Path Learning Approach [3.0868241505670198]
We propose a novel framework called the Dynamic Contrastive Dual-Path Network (D-HAR). The framework comprises three key components. First, a dual-path feature extraction architecture is employed, where ResNet and DenseNet branches collaboratively process multimodal sensor data. Second, a multi-stage contrastive learning mechanism is introduced to achieve progressive alignment from local perception to semantic abstraction. Third, we present a confidence-driven gradient modulation strategy that dynamically monitors and adjusts the learning intensity of each modality branch during backpropagation.
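A hedged sketch of what such modulation could look like: after backpropagation, each branch's gradients are rescaled by a factor derived from its prediction confidence, so a dominant modality does not starve the weaker one. The scaling rule below is an assumption; the paper's exact formula may differ.

```python
# Illustrative confidence-driven gradient modulation (not the authors' code).
import torch.nn as nn

def modulate_gradients(branch: nn.Module, confidence: float, floor: float = 0.1):
    # Lower confidence -> larger gradient scale (up to 1.0), so the weaker
    # modality branch receives a stronger learning signal.
    scale = max(floor, 1.0 - confidence)
    for p in branch.parameters():
        if p.grad is not None:
            p.grad.mul_(scale)
```

It would be called between `loss.backward()` and `optimizer.step()`, once per modality branch.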
arXiv Detail & Related papers (2025-07-03T17:37:46Z) - Zero-Shot EEG-to-Gait Decoding via Phase-Aware Representation Learning [9.49131859415923]
We propose NeuroDyGait, a domain-generalizable EEG-to-motion decoding framework. It uses structured contrastive representation learning and relational domain modeling to achieve semantic alignment between EEG and motion embeddings. It achieves zero-shot motion prediction for unseen individuals without requiring adaptation, and superior performance in cross-subject gait decoding on benchmark datasets.
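One plausible instantiation of the EEG-motion semantic alignment is an InfoNCE objective over paired embeddings; this is an assumption for illustration, not necessarily NeuroDyGait's exact loss.

```python
# InfoNCE-style alignment between paired EEG and motion embeddings.
import torch
import torch.nn.functional as F

def infonce(eeg_z, motion_z, temperature=0.07):
    eeg_z = F.normalize(eeg_z, dim=-1)            # (B, d)
    motion_z = F.normalize(motion_z, dim=-1)      # (B, d)
    logits = eeg_z @ motion_z.t() / temperature   # (B, B); matches on diagonal
    labels = torch.arange(eeg_z.size(0), device=eeg_z.device)
    return F.cross_entropy(logits, labels)
```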
arXiv Detail & Related papers (2025-06-24T06:03:49Z) - CRIA: A Cross-View Interaction and Instance-Adapted Pre-training Framework for Generalizable EEG Representations [52.251569042852815]
CRIA is an adaptive framework that utilizes variable-length and variable-channel coding to achieve a unified representation of EEG data across different datasets. The model employs a cross-attention mechanism to fuse temporal, spectral, and spatial features effectively. Experimental results on the Temple University EEG corpus and the CHB-MIT dataset show that CRIA outperforms existing methods under the same pre-training conditions.
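The cross-attention fusion could be read as temporal tokens querying the spectral and spatial streams; the sketch below is one such reading, with all dimensions assumed.

```python
# Cross-view fusion: temporal tokens attend over spectral + spatial tokens.
import torch
import torch.nn as nn

class CrossViewFusion(nn.Module):
    def __init__(self, d_model=128, nhead=4):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, nhead, batch_first=True)

    def forward(self, temporal, spectral, spatial):   # each (B, L, d_model)
        context = torch.cat([spectral, spatial], dim=1)
        fused, _ = self.attn(query=temporal, key=context, value=context)
        return fused + temporal                       # residual connection
```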
arXiv Detail & Related papers (2025-06-19T06:31:08Z) - SITE: towards Spatial Intelligence Thorough Evaluation [121.1493852562597]
Spatial intelligence (SI) represents a cognitive ability encompassing the visualization, manipulation, and reasoning about spatial relationships. We introduce SITE, a benchmark dataset towards SI Thorough Evaluation. Our approach to curating the benchmark combines a bottom-up survey of 31 existing datasets and a top-down strategy drawing upon three classification systems in cognitive science.
arXiv Detail & Related papers (2025-05-08T17:45:44Z) - Technical Approach for the EMI Challenge in the 8th Affective Behavior Analysis in-the-Wild Competition [10.741278852581646]
Emotional Mimicry Intensity (EMI) estimation plays a pivotal role in understanding human social behavior and advancing human-computer interaction. This paper proposes a dual-stage cross-modal alignment framework to address the limitations of existing methods. Experiments on the Hume-Vidmimic2 dataset demonstrate superior performance, with an average Pearson correlation coefficient of 0.51 across six emotion dimensions.
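The reported metric, a Pearson correlation coefficient averaged over emotion dimensions, is straightforward to reproduce:

```python
# Mean Pearson correlation across emotion dimensions.
import numpy as np

def mean_pearson(pred, target):        # both arrays of shape (n_samples, n_dims)
    rs = [np.corrcoef(pred[:, d], target[:, d])[0, 1]
          for d in range(pred.shape[1])]
    return float(np.mean(rs))
```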
arXiv Detail & Related papers (2025-03-13T17:46:16Z) - Multi-modal Mood Reader: Pre-trained Model Empowers Cross-Subject Emotion Recognition [23.505616142198487]
We develop a pre-trained-model-based Multimodal Mood Reader for cross-subject emotion recognition.
The model learns universal latent representations of EEG signals through pre-training on a large-scale dataset.
Extensive experiments on public datasets demonstrate Mood Reader's superior performance in cross-subject emotion recognition tasks.
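The summary does not say which pre-training objective is used; a common self-supervised choice for learning latent EEG representations is masked reconstruction, sketched here purely as an assumption.

```python
# Masked-reconstruction pre-training sketch (objective assumed, not confirmed).
import torch

def masked_reconstruction_loss(encoder, decoder, x, mask_ratio=0.5):
    # x: (B, C, T) raw EEG; hide random time steps, then reconstruct them.
    B, C, T = x.shape
    mask = (torch.rand(B, 1, T, device=x.device) < mask_ratio).expand(B, C, T)
    z = encoder(x.masked_fill(mask, 0.0))     # latent representation
    x_hat = decoder(z)                        # decoder maps back to (B, C, T)
    err = (x_hat - x) ** 2
    return (err * mask).sum() / mask.sum().clamp(min=1)
```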
arXiv Detail & Related papers (2024-05-28T14:31:11Z) - Two in One Go: Single-stage Emotion Recognition with Decoupled Subject-context Transformer [78.35816158511523]
We present a single-stage emotion recognition approach, employing a Decoupled Subject-Context Transformer (DSCT) for simultaneous subject localization and emotion classification.
We evaluate our single-stage framework on two widely used context-aware emotion recognition datasets, CAER-S and EMOTIC.
arXiv Detail & Related papers (2024-04-26T07:30:32Z) - Joint Contrastive Learning with Feature Alignment for Cross-Corpus EEG-based Emotion Recognition [2.1645626994550664]
We propose a novel Joint Contrastive learning framework with Feature Alignment (JCFA) to address cross-corpus EEG-based emotion recognition.
In the pre-training stage, a joint domain contrastive learning strategy is introduced to characterize generalizable time-frequency representations of EEG signals.
In the fine-tuning stage, JCFA is refined in conjunction with downstream tasks, where the structural connections among brain electrodes are considered.
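One simple way to consider structural connections among brain electrodes, assumed here for illustration only, is a graph aggregation step over a normalized electrode adjacency matrix:

```python
# Mix per-electrode features across structurally connected electrodes.
import torch

def graph_aggregate(x, adj):
    # x: (B, C, d) per-electrode features; adj: (C, C) normalized adjacency.
    return torch.einsum("ij,bjd->bid", adj, x)
```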
arXiv Detail & Related papers (2024-04-15T08:21:17Z) - A Hierarchical Regression Chain Framework for Affective Vocal Burst Recognition [72.36055502078193]
We propose a hierarchical framework, based on chain regression models, for affective recognition from vocal bursts.
To address the challenge of data sparsity, we also use self-supervised learning (SSL) representations with layer-wise and temporal aggregation modules.
The proposed systems participated in the ACII Affective Vocal Burst (A-VB) Challenge 2022 and ranked first in the "TWO" and "CULTURE" tasks.
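A chain regression model can be pictured as each affect dimension being predicted in turn, with earlier predictions appended to the input of later regressors; the sketch below is one generic reading, not the paper's architecture.

```python
# Generic regression chain: later heads see earlier heads' outputs.
import torch
import torch.nn as nn

class RegressionChain(nn.Module):
    def __init__(self, d_in, n_targets):
        super().__init__()
        self.heads = nn.ModuleList(nn.Linear(d_in + i, 1) for i in range(n_targets))

    def forward(self, x):                      # x: (B, d_in) pooled SSL features
        outs = []
        for head in self.heads:
            outs.append(head(torch.cat([x] + outs, dim=-1)))
        return torch.cat(outs, dim=-1)         # (B, n_targets)
```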
arXiv Detail & Related papers (2023-03-14T16:08:45Z) - fMRI from EEG is only Deep Learning away: the use of interpretable DL to unravel EEG-fMRI relationships [68.8204255655161]
We present an interpretable, domain-grounded solution to recover the activity of several subcortical regions from multichannel EEG data.
We recover individual spatial and time-frequency patterns of scalp EEG predictive of the hemodynamic signal in the subcortical nuclei.
arXiv Detail & Related papers (2022-10-23T15:11:37Z) - Cross-individual Recognition of Emotions by a Dynamic Entropy based on Pattern Learning with EEG features [2.863100352151122]
We propose a deep-learning framework, dynamic entropy-based pattern learning (DEPL), to abstract informative indicators of the neurophysiological features shared among multiple individuals.
DEPL enhanced the representations generated by a deep convolutional neural network by modelling the interdependencies between the cortical locations of the dynamical entropy-based features.
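A common entropy-based EEG feature in this vein is differential entropy under a Gaussian assumption; whether DEPL uses exactly this definition is an assumption on our part.

```python
# Differential entropy of an (assumed Gaussian) EEG segment.
import numpy as np

def differential_entropy(segment):
    # DE of a Gaussian signal: 0.5 * log(2 * pi * e * variance)
    return 0.5 * np.log(2 * np.pi * np.e * np.var(segment))
```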
arXiv Detail & Related papers (2020-09-26T07:22:07Z)