One Model for All: Universal Pre-training for EEG based Emotion Recognition across Heterogeneous Datasets and Paradigms
- URL: http://arxiv.org/abs/2511.08444v1
- Date: Wed, 12 Nov 2025 01:58:59 GMT
- Title: One Model for All: Universal Pre-training for EEG based Emotion Recognition across Heterogeneous Datasets and Paradigms
- Authors: Xiang Li, You Li, Yazhou Zhang,
- Abstract summary: 'One Model for All' is a universal pre-training framework for EEG analysis across disparate datasets. Our framework achieves new SOTA performance on all within-subject benchmarks: SEED (99.27%), DEAP (93.69%), and DREAMER (93.93%). This work paves the way for more universal, scalable, and effective pre-trained models for diverse EEG analysis tasks.
- Score: 9.873322204941394
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: EEG-based emotion recognition is hampered by profound dataset heterogeneity (channel/subject variability), hindering generalizable models. Existing approaches struggle to transfer knowledge effectively. We propose 'One Model for All', a universal pre-training framework for EEG analysis across disparate datasets. Our paradigm decouples learning into two stages: (1) Univariate pre-training via self-supervised contrastive learning on individual channels, enabled by a Unified Channel Schema (UCS) that leverages the channel union (e.g., SEED-62ch, DEAP-32ch); (2) Multivariate fine-tuning with a novel 'ART' (Adaptive Resampling Transformer) and 'GAT' (Graph Attention Network) architecture to capture complex spatio-temporal dependencies. Experiments show universal pre-training is an essential stabilizer, preventing collapse on SEED (vs. scratch) and yielding substantial gains on DEAP (+7.65%) and DREAMER (+3.55%). Our framework achieves new SOTA performance on all within-subject benchmarks: SEED (99.27%), DEAP (93.69%), and DREAMER (93.93%). We also show SOTA cross-dataset transfer, achieving 94.08% (intersection) and 93.05% (UCS) on the unseen DREAMER dataset, with the former surpassing the within-domain pre-training benchmark. Ablation studies validate our architecture: the GAT module is critical, yielding a +22.19% gain over GCN on the high-noise DEAP dataset, and its removal causes a catastrophic -16.44% performance drop. This work paves the way for more universal, scalable, and effective pre-trained models for diverse EEG analysis tasks.
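The two-stage recipe in the abstract — per-channel contrastive pre-training over a Unified Channel Schema (UCS) built from the channel union, followed by multivariate fine-tuning — can be pictured with the rough Python sketch below. The channel lists, encoder design, and InfoNCE-style loss are illustrative assumptions for the pre-training stage only, not the authors' released code or their ART/GAT architecture.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Hypothetical (truncated) channel name lists; the real SEED montage has 62 channels, DEAP 32.
SEED_CHANNELS = ["FP1", "FPZ", "FP2", "F7", "F3"]
DEAP_CHANNELS = ["FP1", "AF3", "F3", "F7", "FC5"]

# Unified Channel Schema (UCS): the union of all channel names, giving every
# dataset's electrodes a slot in one shared index space.
UCS = sorted(set(SEED_CHANNELS) | set(DEAP_CHANNELS))
CH_TO_IDX = {ch: i for i, ch in enumerate(UCS)}

class ChannelEncoder(nn.Module):
    """Encodes one single-channel EEG segment; a learned channel embedding
    tags which UCS slot the segment came from (an assumed design choice)."""
    def __init__(self, dim=128, n_channels=len(UCS)):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv1d(1, 32, kernel_size=7, padding=3), nn.GELU(),
            nn.Conv1d(32, 64, kernel_size=7, padding=3), nn.GELU(),
            nn.AdaptiveAvgPool1d(1),
        )
        self.ch_emb = nn.Embedding(n_channels, 64)
        self.proj = nn.Linear(64, dim)

    def forward(self, x, ch_idx):
        # x: (batch, seg_len) single-channel segments; ch_idx: (batch,) UCS indices
        h = self.backbone(x.unsqueeze(1)).squeeze(-1)  # (batch, 64)
        h = h + self.ch_emb(ch_idx)
        return F.normalize(self.proj(h), dim=-1)

def info_nce(z1, z2, temperature=0.1):
    """InfoNCE loss between two augmented views of the same segments."""
    logits = z1 @ z2.t() / temperature
    targets = torch.arange(z1.size(0), device=z1.device)
    return F.cross_entropy(logits, targets)

# Toy pre-training step: two noisy views of 8 single-channel segments.
encoder = ChannelEncoder()
x = torch.randn(8, 256)
idx = torch.randint(0, len(UCS), (8,))
view1 = x + 0.01 * torch.randn_like(x)
view2 = x + 0.01 * torch.randn_like(x)
loss = info_nce(encoder(view1, idx), encoder(view2, idx))
loss.backward()
```

Fine-tuning would then assemble the per-channel embeddings for a dataset's actual montage into a spatio-temporal model (the ART and GAT modules named in the abstract); that stage is not sketched here.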
Related papers
- AQCat25: Unlocking spin-aware, high-fidelity machine learning potentials for heterogeneous catalysis [0.0]
We introduce AQCat25, a complementary dataset of 13.5 million density functional theory (DFT) single-point calculations. We investigate methodologies for integrating new datasets, such as AQCat25, with the broader Open Catalyst 2020 (OC20) dataset. We show that explicitly conditioning the model on system-specific metadata, for example by using Feature-wise Linear Modulation (FiLM), successfully addresses this challenge (see the FiLM sketch below).
arXiv Detail & Related papers (2025-10-27T02:47:20Z)
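The AQCat25 summary above credits Feature-wise Linear Modulation (FiLM) with making system-specific metadata usable as a conditioning signal. A minimal, generic FiLM layer looks roughly like this; the dimensions and names are illustrative assumptions rather than the AQCat25 implementation.

```python
import torch
import torch.nn as nn

class FiLM(nn.Module):
    """Feature-wise Linear Modulation: a conditioning vector predicts a
    per-feature scale (gamma) and shift (beta) applied to hidden features."""
    def __init__(self, cond_dim, feat_dim):
        super().__init__()
        self.to_gamma_beta = nn.Linear(cond_dim, 2 * feat_dim)

    def forward(self, features, condition):
        # features: (batch, feat_dim); condition: (batch, cond_dim),
        # e.g. an embedding of system-specific metadata (illustrative assumption).
        gamma, beta = self.to_gamma_beta(condition).chunk(2, dim=-1)
        return gamma * features + beta

# Toy usage: modulate 64-dim features with an 8-dim metadata embedding.
film = FiLM(cond_dim=8, feat_dim=64)
h = torch.randn(4, 64)
meta = torch.randn(4, 8)
h_conditioned = film(h, meta)  # same shape as h: (4, 64)
```

FiLM simply predicts a per-feature scale and shift from the conditioning vector, so the same backbone can behave differently for different metadata.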
- Multi-dataset Joint Pre-training of Emotional EEG Enables Generalizable Affective Computing [5.116264249622881]
Existing EEG models struggle with complex tasks like emotion recognition due to mismatches between task-specific features and broad pre-training approaches. This work aims to develop a task-specific multi-dataset joint pre-training framework for cross-dataset emotion recognition.
arXiv Detail & Related papers (2025-10-25T07:30:24Z)
- Efficient Federated Learning with Heterogeneous Data and Adaptive Dropout [62.73150122809138]
Federated Learning (FL) is a promising distributed machine learning approach that enables collaborative training of a global model using multiple edge devices. We propose the FedDHAD FL framework, which comes with two novel methods: Dynamic Heterogeneous model aggregation (FedDH) and Adaptive Dropout (FedAD). The combination of these two methods makes FedDHAD significantly outperform state-of-the-art solutions in terms of accuracy (up to 6.7% higher), efficiency (up to 2.02 times faster), and cost (up to 15.0% smaller).
arXiv Detail & Related papers (2025-07-14T16:19:00Z)
- A Novel Hybrid Deep Learning Technique for Speech Emotion Detection using Feature Engineering [0.0]
Our proposed DCRF-BiLSTM model is used to recognize seven emotions: neutral, happy, sad, angry, fear, disgust, and surprise. The model achieves high accuracy on individual datasets, including 97.83% on RAVDESS. For the combined (R+T+S) datasets, it achieves 98.82% accuracy, outperforming previously reported results.
arXiv Detail & Related papers (2025-07-09T17:07:45Z)
- Is Architectural Complexity Overrated? Competitive and Interpretable Knowledge Graph Completion with RelatE [6.959701672059059]
RelatE is an interpretable and modular method that efficiently integrates dual representations for entities and relations. It achieves competitive or superior performance on standard benchmarks. Perturbation studies demonstrate improved robustness, with the MRR degradation under perturbation reduced by up to 61% relative to TransE and by up to 19% compared to RotatE.
arXiv Detail & Related papers (2025-05-25T04:36:52Z)
- AWARE-NET: Adaptive Weighted Averaging for Robust Ensemble Network in Deepfake Detection [0.0]
We propose a novel two-tier ensemble framework for deepfake detection based on deep learning. Our framework employs a unique approach in which each architecture is instantiated three times. In experiments, the framework achieves state-of-the-art intra-dataset performance (see the sketch below).
arXiv Detail & Related papers (2025-05-01T05:14:50Z)
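The AWARE-NET summary above relies on adaptively weighted averaging over several instantiations of each backbone. The generic sketch below shows one common way to learn such combination weights; the three-model setup and softmax weighting are assumptions, not the paper's exact two-tier scheme.

```python
import torch
import torch.nn as nn

class WeightedAverageEnsemble(nn.Module):
    """Combines per-model probabilities with learnable, softmax-normalized
    weights (a generic stand-in for adaptive weighted averaging)."""
    def __init__(self, n_models):
        super().__init__()
        self.logits = nn.Parameter(torch.zeros(n_models))

    def forward(self, per_model_probs):
        # per_model_probs: (n_models, batch, n_classes)
        w = torch.softmax(self.logits, dim=0)            # (n_models,)
        return torch.einsum("m,mbc->bc", w, per_model_probs)

# Toy usage: three instantiations of a detector with a binary real/fake output.
ens = WeightedAverageEnsemble(n_models=3)
probs = torch.softmax(torch.randn(3, 4, 2), dim=-1)      # fake per-model outputs
combined = ens(probs)                                    # (4, 2)
```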
- Lory: Fully Differentiable Mixture-of-Experts for Autoregressive Language Model Pre-training [73.90260246781435]
We present Lory, the first approach that scales fully differentiable mixture-of-experts (MoE) architectures to autoregressive language model pre-training.
We show significant performance gains over parameter-matched dense models on both perplexity and a variety of downstream tasks.
Despite segment-level routing, Lory models achieve competitive performance compared to state-of-the-art MoE models with token-level routing.
arXiv Detail & Related papers (2024-05-06T03:06:33Z)
- One-for-All: Bridge the Gap Between Heterogeneous Architectures in Knowledge Distillation [69.65734716679925]
Knowledge distillation has proven to be a highly effective approach for enhancing model performance through a teacher-student training scheme.
Most existing distillation methods are designed under the assumption that the teacher and student models belong to the same model family.
We propose a simple yet effective one-for-all KD framework called OFA-KD, which significantly improves the distillation performance between heterogeneous architectures (see the sketch below).
arXiv Detail & Related papers (2023-10-30T11:13:02Z)
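The OFA-KD summary above builds on the standard teacher-student distillation objective. As a reference point only, a minimal version of the usual softened-logit KD loss is sketched below; temperature, weighting, and the heterogeneous-architecture adaptations specific to OFA-KD are not represented.

```python
import torch
import torch.nn.functional as F

def kd_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.5):
    """Classic knowledge-distillation objective: cross-entropy on hard labels
    plus KL divergence between temperature-softened teacher/student logits."""
    ce = F.cross_entropy(student_logits, labels)
    kl = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)
    return alpha * ce + (1 - alpha) * kl

# Toy usage with random logits for a 10-class problem.
s = torch.randn(8, 10, requires_grad=True)
t = torch.randn(8, 10)
y = torch.randint(0, 10, (8,))
loss = kd_loss(s, t, y)
loss.backward()
```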
- Scaling Data Generation in Vision-and-Language Navigation [116.95534559103788]
We propose an effective paradigm for generating large-scale data for learning.
We use 1200+ photo-realistic environments from the HM3D and Gibson datasets and synthesize 4.9 million instruction-trajectory pairs.
Thanks to our large-scale dataset, the performance of an existing agent can be pushed up (+11% absolute over the previous SoTA) to a new best of 80% single-run success rate on the R2R test split by simple imitation learning.
arXiv Detail & Related papers (2023-07-28T16:03:28Z)
- ContrasInver: Ultra-Sparse Label Semi-supervised Regression for Multi-dimensional Seismic Inversion [7.356328937024184]
ContrasInver is a method that achieves seismic inversion using as few as two or three well logs.
In experiments, ContrasInver achieved state-of-the-art performance on the SEAM I synthetic data.
It is the first data-driven approach to yield reliable results on the Netherlands F3 and Delft, using only three and two well logs, respectively.
arXiv Detail & Related papers (2023-02-13T15:19:51Z) - MSeg: A Composite Dataset for Multi-domain Semantic Segmentation [100.17755160696939]
We present MSeg, a composite dataset that unifies semantic segmentation datasets from different domains.
We reconcile the taxonomies and bring the pixel-level annotations into alignment by relabeling more than 220,000 object masks in more than 80,000 images.
A model trained on MSeg ranks first on the WildDash-v1 leaderboard for robust semantic segmentation, with no exposure to WildDash data during training.
arXiv Detail & Related papers (2021-12-27T16:16:35Z)
- Self-Supervised Pre-Training for Transformer-Based Person Re-Identification [54.55281692768765]
Transformer-based supervised pre-training achieves great performance in person re-identification (ReID).
Due to the domain gap between ImageNet and ReID datasets, it usually needs a larger pre-training dataset to boost the performance.
This work aims to mitigate the gap between the pre-training and ReID datasets from the perspective of data and model structure.
arXiv Detail & Related papers (2021-11-23T18:59:08Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.