EEG Foundation Models: Progresses, Benchmarking, and Open Problems
- URL: http://arxiv.org/abs/2601.17883v1
- Date: Sun, 25 Jan 2026 15:28:50 GMT
- Title: EEG Foundation Models: Progresses, Benchmarking, and Open Problems
- Authors: Dingkun Liu, Yuheng Chen, Zhu Chen, Zhenyao Cui, Yaozhi Wen, Jiayu An, Jingwei Luo, Dongrui Wu
- Abstract summary: We review 50 representative EEG foundation models and organize their design choices into a unified taxonomic framework. We evaluate 12 open-source foundation models and competitive specialist baselines across 13 EEG datasets spanning nine BCI paradigms.
- Score: 10.447009984769819
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Electroencephalography (EEG) foundation models have recently emerged as a promising paradigm for brain-computer interfaces (BCIs), aiming to learn transferable neural representations from large-scale heterogeneous recordings. Despite rapid progress, fair and comprehensive comparisons of existing EEG foundation models are lacking, due to inconsistent pre-training objectives, preprocessing choices, and downstream evaluation protocols. This paper fills this gap. We first review 50 representative models and organize their design choices into a unified taxonomic framework covering data standardization, model architectures, and self-supervised pre-training strategies. We then evaluate 12 open-source foundation models and competitive specialist baselines across 13 EEG datasets spanning nine BCI paradigms. Emphasizing real-world deployment, we consider both cross-subject generalization under a leave-one-subject-out protocol and rapid calibration under a within-subject few-shot setting. We further compare full-parameter fine-tuning with linear probing to assess the transferability of pre-trained representations, and examine the relationship between model scale and downstream performance. Our results indicate that: 1) linear probing is frequently insufficient; 2) specialist models trained from scratch remain competitive across many tasks; and 3) larger foundation models do not necessarily yield better generalization under current data regimes and training practices.
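The leave-one-subject-out (LOSO) protocol and linear probing mentioned in the abstract can be sketched in a few lines. The sketch below uses synthetic stand-in embeddings and a nearest-centroid classifier as a minimal linear probe; the subject count, feature dimensionality, and class-signal strength are illustrative assumptions, not the paper's actual setup.

```python
import numpy as np

# Synthetic stand-in for frozen pre-trained EEG embeddings:
# 4 subjects x 50 trials each, 32-dim features, binary labels.
rng = np.random.default_rng(0)
n_subj, trials, dim = 4, 50, 32
X = rng.normal(size=(n_subj * trials, dim))
y = rng.integers(0, 2, size=n_subj * trials)
X[y == 1] += 0.5  # inject a weak class signal
subjects = np.repeat(np.arange(n_subj), trials)

def nearest_centroid_acc(Xtr, ytr, Xte, yte):
    # Minimal "linear probe" stand-in: classify by nearest class centroid.
    c0, c1 = Xtr[ytr == 0].mean(0), Xtr[ytr == 1].mean(0)
    pred = (np.linalg.norm(Xte - c1, axis=1)
            < np.linalg.norm(Xte - c0, axis=1)).astype(int)
    return float((pred == yte).mean())

# LOSO: each fold trains on all subjects but one and tests on the held-out one,
# so the reported accuracy reflects cross-subject generalization.
accs = [nearest_centroid_acc(X[subjects != s], y[subjects != s],
                             X[subjects == s], y[subjects == s])
        for s in range(n_subj)]
print(f"LOSO mean accuracy: {np.mean(accs):.3f}")
```

Swapping the nearest-centroid probe for logistic regression, or unfreezing the backbone, turns this same loop into the linear-probing vs. full fine-tuning comparison the paper describes.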
Related papers
- Did Models Sufficient Learn? Attribution-Guided Training via Subset-Selected Counterfactual Augmentation [61.248535801314375]
Subset-Selected Counterfactual Augmentation (SS-CA). We develop Counterfactual LIMA to identify minimal spatial region sets whose removal can selectively alter model predictions. Experiments show that SS-CA improves generalization on in-distribution (ID) test data and achieves superior performance on out-of-distribution (OOD) benchmarks.
arXiv Detail & Related papers (2025-11-15T08:39:22Z) - NeuroTTT: Bridging Pretraining-Downstream Task Misalignment in EEG Foundation Models via Test-Time Training [6.030518150035875]
This paper introduces a two-stage alignment strategy for EEG foundation models. First, we propose NeuroTTT, a domain-specific self-supervised fine-tuning paradigm. Second, we perform self-supervised test-time training on individual unlabeled test samples. Our approach is the first to unify domain-tuned self-supervision with test-time training in large-scale EEG foundation models.
arXiv Detail & Related papers (2025-09-30T14:14:46Z) - Sequential Data Augmentation for Generative Recommendation [54.765568804267645]
Generative recommendation plays a crucial role in personalized systems, predicting users' future interactions from their historical behavior sequences. Data augmentation, the process of constructing training data from user interaction histories, is a critical yet underexplored factor in training these models. We propose GenPAS, a principled framework that models augmentation as a sampling process and enables flexible control of the resulting training distribution. Our experiments on benchmark and industrial datasets demonstrate that GenPAS yields superior accuracy, data efficiency, and parameter efficiency compared to existing strategies.
arXiv Detail & Related papers (2025-09-17T02:53:25Z) - EEG-FM-Bench: A Comprehensive Benchmark for the Systematic Evaluation of EEG Foundation Models [16.433809341013113]
EEG-FM-Bench is the first comprehensive benchmark for the systematic and standardized evaluation of EEG foundation models (EEG-FMs). Our contributions are threefold: (1) we curate a diverse suite of downstream tasks and datasets from canonical EEG paradigms, implementing standardized processing and evaluation protocols within a unified open-source framework; (2) we benchmark prominent state-of-the-art foundation models to establish comprehensive baseline results for a clear comparison of the current landscape; (3) we perform qualitative analyses to provide insights into model behavior and inform future architectural design.
arXiv Detail & Related papers (2025-08-25T07:34:33Z) - Analysis of Transferability Estimation Metrics for Surgical Phase Recognition [3.3285108719932555]
Fine-tuning pre-trained models has become a cornerstone of modern machine learning, allowing practitioners to achieve high performance with limited labeled data. In surgical video analysis, where expert annotations are especially time-consuming and costly, identifying the most suitable pre-trained model for a downstream task is both critical and challenging. We provide the first comprehensive benchmark of three representative metrics, LogME, H-Score, and TransRate, on two diverse datasets.
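Of the three transferability metrics named above, H-Score admits a particularly compact sketch: it is the trace of the feature covariance's pseudo-inverse times the covariance of the class-conditional feature means, so higher values indicate features whose class means are well separated relative to their overall spread. A minimal NumPy version on synthetic features (illustrative only, not any paper's reference implementation):

```python
import numpy as np

def h_score(features, labels):
    """H-Score: tr(pinv(cov(f)) @ cov(E[f|y])). Higher suggests better transfer."""
    cov_f = np.cov(features, rowvar=False)
    # Replace each feature vector by its class-conditional mean.
    class_means = np.zeros_like(features)
    for c in np.unique(labels):
        class_means[labels == c] = features[labels == c].mean(axis=0)
    cov_z = np.cov(class_means, rowvar=False)
    # Pseudo-inverse guards against a singular feature covariance.
    return float(np.trace(np.linalg.pinv(cov_f) @ cov_z))

# Well-separated classes should score higher than randomly shuffled labels.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 1.0, (50, 5)), rng.normal(3.0, 1.0, (50, 5))])
y = np.repeat([0, 1], 50)
print(h_score(X, y), h_score(X, rng.permutation(y)))
```

LogME and TransRate follow the same pattern (score frozen features against labels without training a head) but use an evidence maximization and a coding-rate criterion, respectively.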
arXiv Detail & Related papers (2025-08-22T18:05:33Z) - A Large-scale Benchmark on Geological Fault Delineation Models: Domain Shift, Training Dynamics, Generalizability, Evaluation and Inferential Behavior [11.859145373647474]
We present the first large-scale benchmarking study designed to provide guidelines for domain shift strategies in seismic interpretation. Our benchmark spans over 200 combinations of model architectures, datasets, and training strategies, across three datasets. Our analysis shows that common fine-tuning practices can lead to catastrophic forgetting when source and target datasets are disjoint.
arXiv Detail & Related papers (2025-05-13T13:56:43Z) - Intrinsic Bias is Predicted by Pretraining Data and Correlates with Downstream Performance in Vision-Language Encoders [13.474737752636608]
We present the largest comprehensive analysis to date of how the upstream pre-training factors and downstream performance of CLIP models relate to intrinsic biases. We study 131 unique CLIP models, trained on 26 datasets, using 55 architectures, in a variety of sizes. We find that the choice of pre-training dataset is the most significant upstream predictor of bias, whereas architectural variations have minimal impact.
arXiv Detail & Related papers (2025-02-11T21:11:47Z) - Beyond Scaling: Measuring and Predicting the Upper Bound of Knowledge Retention in Language Model Pre-Training [68.94373533768501]
We model knowledge retention, the capacity of a pre-trained language model to memorize factual information from its corpus, and introduce a principled method to estimate it prior to training. We propose Size-dependent Mutual Information (SMI), an information-theoretic predictor that integrates knowledge frequency, knowledge specificity, and model size to forecast closed-book question answering (QA) accuracy.
arXiv Detail & Related papers (2025-02-06T13:23:53Z) - SMPLest-X: Ultimate Scaling for Expressive Human Pose and Shape Estimation [81.36747103102459]
Expressive human pose and shape estimation (EHPS) unifies body, hands, and face motion capture, with numerous applications. Current state-of-the-art methods focus on training innovative architectural designs on confined datasets. We investigate the impact of scaling up EHPS towards a family of generalist foundation models.
arXiv Detail & Related papers (2025-01-16T18:59:46Z) - Universal Domain Adaptation from Foundation Models: A Baseline Study [58.51162198585434]
We make empirical studies of state-of-the-art UniDA methods using foundation models.
We introduce CLIP distillation, a parameter-free method specifically designed to distill target knowledge from CLIP models.
Although simple, our method outperforms previous approaches in most benchmark tasks.
arXiv Detail & Related papers (2023-05-18T16:28:29Z) - TWINS: A Fine-Tuning Framework for Improved Transferability of Adversarial Robustness and Generalization [89.54947228958494]
This paper focuses on the fine-tuning of an adversarially pre-trained model in various classification tasks.
We propose a novel statistics-based approach, the Two-WIng NormliSation (TWINS) fine-tuning framework.
TWINS is shown to be effective on a wide range of image classification datasets in terms of both generalization and robustness.
arXiv Detail & Related papers (2023-03-20T14:12:55Z) - General Greedy De-bias Learning [163.65789778416172]
We propose a General Greedy De-bias learning framework (GGD), which greedily trains the biased models and the base model like gradient descent in functional space.
GGD can learn a more robust base model under the settings of both task-specific biased models with prior knowledge and self-ensemble biased model without prior knowledge.
arXiv Detail & Related papers (2021-12-20T14:47:32Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences arising from its use.