One Model for All: Universal Pre-training for EEG based Emotion Recognition across Heterogeneous Datasets and Paradigms
- URL: http://arxiv.org/abs/2511.08444v1
- Date: Wed, 12 Nov 2025 01:58:59 GMT
- Title: One Model for All: Universal Pre-training for EEG based Emotion Recognition across Heterogeneous Datasets and Paradigms
- Authors: Xiang Li, You Li, Yazhou Zhang,
- Abstract summary: 'One Model for All' is a universal pre-training framework for EEG analysis across disparate datasets. Our framework achieves new SOTA performance on all within-subject benchmarks: SEED (99.27%), DEAP (93.69%), and DREAMER (93.93%). This work paves the way for more universal, scalable, and effective pre-trained models for diverse EEG analysis tasks.
- Score: 9.873322204941394
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: EEG-based emotion recognition is hampered by profound dataset heterogeneity (channel/subject variability), hindering generalizable models. Existing approaches struggle to transfer knowledge effectively. We propose 'One Model for All', a universal pre-training framework for EEG analysis across disparate datasets. Our paradigm decouples learning into two stages: (1) Univariate pre-training via self-supervised contrastive learning on individual channels, enabled by a Unified Channel Schema (UCS) that leverages the channel union (e.g., SEED-62ch, DEAP-32ch); (2) Multivariate fine-tuning with a novel 'ART' (Adaptive Resampling Transformer) and 'GAT' (Graph Attention Network) architecture to capture complex spatio-temporal dependencies. Experiments show universal pre-training is an essential stabilizer, preventing collapse on SEED (vs. scratch) and yielding substantial gains on DEAP (+7.65%) and DREAMER (+3.55%). Our framework achieves new SOTA performance on all within-subject benchmarks: SEED (99.27%), DEAP (93.69%), and DREAMER (93.93%). We also show SOTA cross-dataset transfer, achieving 94.08% (intersection) and 93.05% (UCS) on the unseen DREAMER dataset, with the former surpassing the within-domain pre-training benchmark. Ablation studies validate our architecture: the GAT module is critical, yielding a +22.19% gain over GCN on the high-noise DEAP dataset, and its removal causes a catastrophic -16.44% performance drop. This work paves the way for more universal, scalable, and effective pre-trained models for diverse EEG analysis tasks.
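The two-stage recipe in the abstract — per-channel contrastive pre-training over a Unified Channel Schema (UCS) built from the channel union, followed by multivariate fine-tuning — can be pictured with the rough Python sketch below. The channel lists, encoder design, and InfoNCE-style loss are illustrative assumptions for the pre-training stage only, not the authors' released code or their ART/GAT architecture.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Hypothetical (truncated) channel name lists; the real SEED montage has 62 channels, DEAP 32.
SEED_CHANNELS = ["FP1", "FPZ", "FP2", "F7", "F3"]
DEAP_CHANNELS = ["FP1", "AF3", "F3", "F7", "FC5"]

# Unified Channel Schema (UCS): the union of all channel names, giving every
# dataset's electrodes a slot in one shared index space.
UCS = sorted(set(SEED_CHANNELS) | set(DEAP_CHANNELS))
CH_TO_IDX = {ch: i for i, ch in enumerate(UCS)}

class ChannelEncoder(nn.Module):
    """Encodes one single-channel EEG segment; a learned channel embedding
    tags which UCS slot the segment came from (an assumed design choice)."""
    def __init__(self, dim=128, n_channels=len(UCS)):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv1d(1, 32, kernel_size=7, padding=3), nn.GELU(),
            nn.Conv1d(32, 64, kernel_size=7, padding=3), nn.GELU(),
            nn.AdaptiveAvgPool1d(1),
        )
        self.ch_emb = nn.Embedding(n_channels, 64)
        self.proj = nn.Linear(64, dim)

    def forward(self, x, ch_idx):
        # x: (batch, seg_len) single-channel segments; ch_idx: (batch,) UCS indices
        h = self.backbone(x.unsqueeze(1)).squeeze(-1)  # (batch, 64)
        h = h + self.ch_emb(ch_idx)
        return F.normalize(self.proj(h), dim=-1)

def info_nce(z1, z2, temperature=0.1):
    """InfoNCE loss between two augmented views of the same segments."""
    logits = z1 @ z2.t() / temperature
    targets = torch.arange(z1.size(0), device=z1.device)
    return F.cross_entropy(logits, targets)

# Toy pre-training step: two noisy views of 8 single-channel segments.
encoder = ChannelEncoder()
x = torch.randn(8, 256)
idx = torch.randint(0, len(UCS), (8,))
view1 = x + 0.01 * torch.randn_like(x)
view2 = x + 0.01 * torch.randn_like(x)
loss = info_nce(encoder(view1, idx), encoder(view2, idx))
loss.backward()
```

Fine-tuning would then assemble the per-channel embeddings for a dataset's actual montage into a spatio-temporal model (the ART and GAT modules named in the abstract); that stage is not sketched here.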
Related papers
- AQCat25: Unlocking spin-aware, high-fidelity machine learning potentials for heterogeneous catalysis [0.0]
We introduce AQCat25, a complementary dataset of 13.5 million density functional theory (DFT) single-point calculations. We investigate methodologies for integrating new datasets, such as AQCat25, with the broader Open Catalyst 2020 (OC20) dataset. We show that explicitly conditioning the model on system-specific metadata, for example by using Feature-wise Linear Modulation (FiLM), successfully addresses this challenge (see the FiLM sketch below).
arXiv Detail & Related papers (2025-10-27T02:47:20Z)
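The AQCat25 summary above credits Feature-wise Linear Modulation (FiLM) with making system-specific metadata usable as a conditioning signal. A minimal, generic FiLM layer looks roughly like this; the dimensions and names are illustrative assumptions rather than the AQCat25 implementation.

```python
import torch
import torch.nn as nn

class FiLM(nn.Module):
    """Feature-wise Linear Modulation: a conditioning vector predicts a
    per-feature scale (gamma) and shift (beta) applied to hidden features."""
    def __init__(self, cond_dim, feat_dim):
        super().__init__()
        self.to_gamma_beta = nn.Linear(cond_dim, 2 * feat_dim)

    def forward(self, features, condition):
        # features: (batch, feat_dim); condition: (batch, cond_dim),
        # e.g. an embedding of system-specific metadata (illustrative assumption).
        gamma, beta = self.to_gamma_beta(condition).chunk(2, dim=-1)
        return gamma * features + beta

# Toy usage: modulate 64-dim features with an 8-dim metadata embedding.
film = FiLM(cond_dim=8, feat_dim=64)
h = torch.randn(4, 64)
meta = torch.randn(4, 8)
h_conditioned = film(h, meta)  # same shape as h: (4, 64)
```

FiLM simply predicts a per-feature scale and shift from the conditioning vector, so the same backbone can behave differently for different metadata.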
- Multi-dataset Joint Pre-training of Emotional EEG Enables Generalizable Affective Computing [5.116264249622881]
Existing EEG models struggle with complex tasks like emotion recognition due to mismatches between task-specific features and broad pre-training approaches. This work aims to develop a task-specific multi-dataset joint pre-training framework for cross-dataset emotion recognition.
arXiv Detail & Related papers (2025-10-25T07:30:24Z)
- Efficient Federated Learning with Heterogeneous Data and Adaptive Dropout [62.73150122809138]
Federated Learning (FL) is a promising distributed machine learning approach that enables collaborative training of a global model using multiple edge devices. We propose the FedDHAD FL framework, which comes with two novel methods: Dynamic Heterogeneous model aggregation (FedDH) and Adaptive Dropout (FedAD). The combination of these two methods makes FedDHAD significantly outperform state-of-the-art solutions in terms of accuracy (up to 6.7% higher), efficiency (up to 2.02 times faster), and cost (up to 15.0% smaller).
arXiv Detail & Related papers (2025-07-14T16:19:00Z)
- A Novel Hybrid Deep Learning Technique for Speech Emotion Detection using Feature Engineering [0.0]
Our proposed DCRF-BiLSTM model is used to recognize seven emotions: neutral, happy, sad, angry, fear, disgust, and surprise. The model achieves high accuracy on individual datasets, including 97.83% on RAVDESS. For the combined (R+T+S) datasets, it achieves 98.82% accuracy, outperforming previously reported results.
arXiv Detail & Related papers (2025-07-09T17:07:45Z)
- Is Architectural Complexity Overrated? Competitive and Interpretable Knowledge Graph Completion with RelatE [6.959701672059059]
RelatE is an interpretable and modular method that efficiently integrates dual representations for entities and relations. It achieves competitive or superior performance on standard benchmarks. Perturbation studies demonstrate improved robustness, with the MRR degradation under perturbation reduced by up to 61% relative to TransE and by up to 19% compared to RotatE.
arXiv Detail & Related papers (2025-05-25T04:36:52Z)
- AWARE-NET: Adaptive Weighted Averaging for Robust Ensemble Network in Deepfake Detection [0.0]
We propose a novel two-tier ensemble framework for deepfake detection based on deep learning. Our framework employs a unique approach in which each architecture is instantiated three times. In experiments, the framework achieves state-of-the-art intra-dataset performance (see the sketch below).
arXiv Detail & Related papers (2025-05-01T05:14:50Z)
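The AWARE-NET summary above relies on adaptively weighted averaging over several instantiations of each backbone. The generic sketch below shows one common way to learn such combination weights; the three-model setup and softmax weighting are assumptions, not the paper's exact two-tier scheme.

```python
import torch
import torch.nn as nn

class WeightedAverageEnsemble(nn.Module):
    """Combines per-model probabilities with learnable, softmax-normalized
    weights (a generic stand-in for adaptive weighted averaging)."""
    def __init__(self, n_models):
        super().__init__()
        self.logits = nn.Parameter(torch.zeros(n_models))

    def forward(self, per_model_probs):
        # per_model_probs: (n_models, batch, n_classes)
        w = torch.softmax(self.logits, dim=0)            # (n_models,)
        return torch.einsum("m,mbc->bc", w, per_model_probs)

# Toy usage: three instantiations of a detector with a binary real/fake output.
ens = WeightedAverageEnsemble(n_models=3)
probs = torch.softmax(torch.randn(3, 4, 2), dim=-1)      # fake per-model outputs
combined = ens(probs)                                    # (4, 2)
```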
- Lory: Fully Differentiable Mixture-of-Experts for Autoregressive Language Model Pre-training [73.90260246781435]
We present Lory, the first approach that scales fully differentiable mixture-of-experts (MoE) architectures to autoregressive language model pre-training.
We show significant performance gains over parameter-matched dense models on both perplexity and a variety of downstream tasks.
Despite segment-level routing, Lory models achieve competitive performance compared to state-of-the-art MoE models with token-level routing.
arXiv Detail & Related papers (2024-05-06T03:06:33Z)
- One-for-All: Bridge the Gap Between Heterogeneous Architectures in Knowledge Distillation [69.65734716679925]
Knowledge distillation has proven to be a highly effective approach for enhancing model performance through a teacher-student training scheme.
Most existing distillation methods are designed under the assumption that the teacher and student models belong to the same model family.
We propose a simple yet effective one-for-all KD framework called OFA-KD, which significantly improves the distillation performance between heterogeneous architectures (see the sketch below).
arXiv Detail & Related papers (2023-10-30T11:13:02Z)
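The OFA-KD summary above builds on the standard teacher-student distillation objective. As a reference point only, a minimal version of the usual softened-logit KD loss is sketched below; temperature, weighting, and the heterogeneous-architecture adaptations specific to OFA-KD are not represented.

```python
import torch
import torch.nn.functional as F

def kd_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.5):
    """Classic knowledge-distillation objective: cross-entropy on hard labels
    plus KL divergence between temperature-softened teacher/student logits."""
    ce = F.cross_entropy(student_logits, labels)
    kl = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)
    return alpha * ce + (1 - alpha) * kl

# Toy usage with random logits for a 10-class problem.
s = torch.randn(8, 10, requires_grad=True)
t = torch.randn(8, 10)
y = torch.randint(0, 10, (8,))
loss = kd_loss(s, t, y)
loss.backward()
```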
- Scaling Data Generation in Vision-and-Language Navigation [116.95534559103788]
We propose an effective paradigm for generating large-scale data for learning.
We use 1200+ photo-realistic environments from the HM3D and Gibson datasets and synthesize 4.9 million instruction-trajectory pairs.
Thanks to our large-scale dataset, the performance of an existing agent can be pushed up (+11% absolute over the previous SoTA) to a new best of 80% single-run success rate on the R2R test split by simple imitation learning.
arXiv Detail & Related papers (2023-07-28T16:03:28Z)
- ContrasInver: Ultra-Sparse Label Semi-supervised Regression for Multi-dimensional Seismic Inversion [7.356328937024184]
ContrasInver is a method that achieves seismic inversion using as few as two or three well logs.
In experiments, ContrasInver achieved state-of-the-art performance on the SEAM I synthetic data.
It is the first data-driven approach to yield reliable results on the Netherlands F3 and Delft, using only three and two well logs, respectively.
arXiv Detail & Related papers (2023-02-13T15:19:51Z) - MSeg: A Composite Dataset for Multi-domain Semantic Segmentation [100.17755160696939]
We present MSeg, a composite dataset that unifies semantic segmentation datasets from different domains.
We reconcile the taxonomies and bring the pixel-level annotations into alignment by relabeling more than 220,000 object masks in more than 80,000 images.
A model trained on MSeg ranks first on the WildDash-v1 leaderboard for robust semantic segmentation, with no exposure to WildDash data during training.
arXiv Detail & Related papers (2021-12-27T16:16:35Z)
- Self-Supervised Pre-Training for Transformer-Based Person Re-Identification [54.55281692768765]
Transformer-based supervised pre-training achieves great performance in person re-identification (ReID).
Due to the domain gap between ImageNet and ReID datasets, it usually needs a larger pre-training dataset to boost the performance.
This work aims to mitigate the gap between the pre-training and ReID datasets from the perspective of data and model structure.
arXiv Detail & Related papers (2021-11-23T18:59:08Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.