CodeBrain: Bridging Decoupled Tokenizer and Multi-Scale Architecture for EEG Foundation Model
- URL: http://arxiv.org/abs/2506.09110v1
- Date: Tue, 10 Jun 2025 17:20:39 GMT
- Title: CodeBrain: Bridging Decoupled Tokenizer and Multi-Scale Architecture for EEG Foundation Model
- Authors: Jingying Ma, Feng Wu, Qika Lin, Yucheng Xing, Chenyu Liu, Ziyu Jia, Mengling Feng
- Abstract summary: EEG foundation models struggle with limited heterogeneous representation capacity and inefficiency in capturing multi-scale brain dependencies. We propose CodeBrain, an efficient EFM structurally aligned with brain organization, trained in two stages. EEGSSM combines a structured global convolution architecture and a sliding window attention mechanism to jointly model sparse long-range and local dependencies.
- Score: 33.550819280074826
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Electroencephalography (EEG) provides real-time insights into brain activity and is widely used in neuroscience. However, variations in channel configurations, sequence lengths, and task objectives limit the transferability of traditional task-specific models. Although recent EEG foundation models (EFMs) aim to learn generalizable representations, they struggle with limited heterogeneous representation capacity and inefficiency in capturing multi-scale brain dependencies. To address these challenges, we propose CodeBrain, an efficient EFM structurally aligned with brain organization, trained in two stages. (1) We introduce a TFDual-Tokenizer that independently tokenizes heterogeneous temporal and frequency components, enabling a quadratic expansion of the discrete representation space. This also offers a degree of interpretability through cross-domain token analysis. (2) We propose the EEGSSM, which combines a structured global convolution architecture and a sliding window attention mechanism to jointly model sparse long-range and local dependencies. Unlike fully connected Transformer models, EEGSSM better reflects the brain's small-world topology and efficiently captures EEG's inherent multi-scale structure. EEGSSM is trained with a masked self-supervised learning objective to predict token indices obtained from the TFDual-Tokenizer. Comprehensive experiments on 10 public EEG datasets demonstrate the generalizability of CodeBrain with linear probing. By offering biologically informed and interpretable EEG modeling, CodeBrain lays the foundation for future neuroscience research. Both code and pretraining weights will be released in a future version.
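To ground the TFDual-Tokenizer idea, below is a minimal sketch of decoupled temporal/frequency quantization, assuming two independent codebooks of sizes K_t and K_f whose product forms the joint discrete space; all names (DualCodebookQuantizer, k_t, k_f) are illustrative, not CodeBrain's released code.

```python
import torch
import torch.nn as nn

class DualCodebookQuantizer(nn.Module):
    """Quantize temporal and frequency features against separate codebooks,
    so the discrete joint space grows quadratically (K_t * K_f entries
    expressible while storing only K_t + K_f codewords)."""

    def __init__(self, dim: int, k_t: int = 512, k_f: int = 512):
        super().__init__()
        self.temporal_codebook = nn.Embedding(k_t, dim)  # K_t temporal codewords
        self.freq_codebook = nn.Embedding(k_f, dim)      # K_f frequency codewords
        self.k_f = k_f

    @staticmethod
    def _nearest(z: torch.Tensor, codebook: nn.Embedding) -> torch.Tensor:
        # Standard VQ assignment: Euclidean nearest codeword per input row.
        return torch.cdist(z, codebook.weight).argmin(dim=-1)

    def forward(self, z_temporal: torch.Tensor, z_freq: torch.Tensor):
        t_idx = self._nearest(z_temporal, self.temporal_codebook)
        f_idx = self._nearest(z_freq, self.freq_codebook)
        joint_idx = t_idx * self.k_f + f_idx  # index into the K_t*K_f product space
        return t_idx, f_idx, joint_idx

# Example: 512 * 512 = 262,144 joint codes from two 512-entry codebooks.
quantizer = DualCodebookQuantizer(dim=64)
t_idx, f_idx, joint_idx = quantizer(torch.randn(8, 64), torch.randn(8, 64))
```

The masked pretraining objective described above then amounts to predicting these token indices (temporal and frequency, or the joint index) at masked positions.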
Related papers
- Brain-OF: An Omnifunctional Foundation Model for fMRI, EEG and MEG [2.783700146328046]
We propose Brain-OF, the first brain foundation model jointly pretrained on fMRI, MEG and EEG inputs within a unified framework. Brain-OF is pretrained on a large-scale corpus comprising around 40 datasets and demonstrates superior performance across diverse downstream tasks.
arXiv Detail & Related papers (2026-02-26T15:47:13Z) - One Brain, Omni Modalities: Towards Unified Non-Invasive Brain Decoding with Large Language Models [42.83819917665563]
We introduce NOBEL, a neuro-omni-modal brain-encoding large language model (LLM). Our architecture integrates a unified encoder for EEG and MEG with a novel dual-path strategy for fMRI, aligning non-invasive brain signals and external sensory stimuli into a shared token space.
arXiv Detail & Related papers (2026-02-25T03:24:54Z) - Brain4FMs: A Benchmark of Foundation Models for Electrical Brain Signal [7.208815613117472]
Brain Foundation Models (BFMs) are transforming neuroscience by enabling scalable and transferable learning from neural signals. We introduce Brain4FMs, an open evaluation platform with plug-and-play interfaces that integrates 15 representative BFMs and 18 public datasets.
arXiv Detail & Related papers (2026-02-12T04:25:39Z) - DeeperBrain: A Neuro-Grounded EEG Foundation Model Towards Universal BCI [23.430788212164686]
DeeperBrain is a neuro-grounded foundation model that integrates domain-specific inductive biases into its model design and learning objectives. It achieves state-of-the-art or highly competitive performance under end-to-end fine-tuning. DeeperBrain maintains superior efficacy under a rigorous frozen-probing protocol.
arXiv Detail & Related papers (2026-01-05T05:31:45Z) - Brain-Gen: Towards Interpreting Neural Signals for Stimulus Reconstruction Using Transformers and Latent Diffusion Models [1.479639149658596]
We propose a framework to extract spatial-temporal representations associated with observed visual stimuli from EEG recordings. Our work marks a significant step towards generalizable semantic interpretation of EEG signals.
arXiv Detail & Related papers (2025-12-21T18:20:21Z) - NeuroRVQ: Multi-Scale EEG Tokenization for Generative Large Brainwave Models [66.91449452840318]
We introduce NeuroRVQ, a scalable Large Brainwave Model (LBM) centered on a codebook-based tokenizer. Our tokenizer integrates: (i) multi-scale feature extraction modules that capture the full frequency neural spectrum; (ii) hierarchical residual vector quantization (RVQ) codebooks for high-resolution encoding; and (iii) an EEG signal phase- and amplitude-aware loss function for efficient training. Our empirical results demonstrate that NeuroRVQ achieves lower reconstruction error and outperforms existing LBMs on a variety of downstream tasks.
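For context on item (ii), the following is a generic sketch of hierarchical residual vector quantization, in which each stage encodes the residual left by the previous one; codebook size and stage count are arbitrary, and this is not NeuroRVQ's actual implementation.

```python
import torch
import torch.nn as nn

class ResidualVQ(nn.Module):
    """Generic residual vector quantizer: each stage quantizes the residual
    left by the previous stage, refining the reconstruction level by level."""

    def __init__(self, dim: int, codebook_size: int = 256, num_stages: int = 4):
        super().__init__()
        self.codebooks = nn.ModuleList(
            nn.Embedding(codebook_size, dim) for _ in range(num_stages))

    def forward(self, z: torch.Tensor):
        residual, quantized, indices = z, torch.zeros_like(z), []
        for codebook in self.codebooks:
            idx = torch.cdist(residual, codebook.weight).argmin(dim=-1)
            codeword = codebook(idx)              # (batch, dim) selected codewords
            quantized = quantized + codeword
            residual = residual - codeword        # hand the error to the next stage
            indices.append(idx)
        return quantized, torch.stack(indices, dim=-1)  # codes: (batch, num_stages)

rvq = ResidualVQ(dim=64)
recon, codes = rvq(torch.randn(8, 64))  # 4 code indices per input vector
```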
arXiv Detail & Related papers (2025-10-15T01:26:52Z) - WaveMind: Towards a Conversational EEG Foundation Model Aligned to Textual and Visual Modalities [55.00677513249723]
EEG signals simultaneously encode both cognitive processes and intrinsic neural states. We map EEG signals and their corresponding modalities into a unified semantic space to achieve generalized interpretation. The resulting model demonstrates robust classification accuracy while supporting flexible, open-ended conversations.
arXiv Detail & Related papers (2025-09-26T06:21:51Z) - CSBrain: A Cross-scale Spatiotemporal Brain Foundation Model for EEG Decoding [57.90382885533593]
We propose a Cross-scale Spatiotemporal Brain foundation model for generalized EEG decoding. We show that CSBrain consistently outperforms task-specific and foundation model baselines. These results establish cross-scale modeling as a key inductive bias and position CSBrain as a robust backbone for future brain-AI research.
arXiv Detail & Related papers (2025-06-29T03:29:34Z) - BrainOmni: A Brain Foundation Model for Unified EEG and MEG Signals [50.76802709706976]
This paper proposes BrainOmni, the first brain foundation model that generalises across heterogeneous EEG and MEG recordings. To unify diverse data sources, we introduce BrainTokenizer, the first tokenizer that quantises neural brain activity into discrete representations. A total of 1,997 hours of EEG and 656 hours of MEG data are curated and standardised from publicly available sources for pretraining.
arXiv Detail & Related papers (2025-05-18T14:07:14Z) - Large Cognition Model: Towards Pretrained EEG Foundation Model [0.0]
We propose a transformer-based foundation model designed to generalize across diverse EEG datasets and downstream tasks. Our findings highlight the potential of pretrained EEG foundation models to accelerate advancements in neuroscience, personalized medicine, and BCI technology.
arXiv Detail & Related papers (2025-02-11T04:28:10Z) - NeuroLM: A Universal Multi-task Foundation Model for Bridging the Gap between Language and EEG Signals [21.363722751437066]
We propose NeuroLM, the first multi-task foundation model that leverages the capabilities of Large Language Models (LLMs) by regarding EEG signals as a foreign language. Our approach begins with learning a text-aligned neural tokenizer through vector-quantized temporal-frequency prediction, which encodes EEG signals into discrete neural tokens. We are the first to demonstrate that, by specific incorporation with LLMs, NeuroLM unifies diverse EEG tasks within a single model through instruction tuning.
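A common way to realize "EEG as a foreign language" is to offset discrete neural token ids past the text vocabulary so both share one embedding table and one decoder; the sketch below illustrates that general pattern with hypothetical sizes and names, and is not NeuroLM's implementation.

```python
import torch
import torch.nn as nn

# Hypothetical sizes: text vocabulary plus a block of discrete EEG token ids.
TEXT_VOCAB, EEG_VOCAB, DIM = 32_000, 8_192, 768

class JointTextEEGEmbedding(nn.Module):
    """EEG token ids are shifted past the text vocabulary so one embedding
    table (and hence one decoder) covers mixed text/EEG sequences."""

    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(TEXT_VOCAB + EEG_VOCAB, DIM)

    def forward(self, text_ids: torch.Tensor, eeg_token_ids: torch.Tensor):
        shifted = eeg_token_ids + TEXT_VOCAB   # move EEG ids into their own range
        mixed = torch.cat([shifted, text_ids], dim=-1)
        return self.embed(mixed)               # (batch, seq, DIM) input to the LLM

embedder = JointTextEEGEmbedding()
out = embedder(torch.randint(0, TEXT_VOCAB, (2, 16)),
               torch.randint(0, EEG_VOCAB, (2, 32)))
```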
arXiv Detail & Related papers (2024-08-27T12:07:09Z) - MindFormer: Semantic Alignment of Multi-Subject fMRI for Brain Decoding [50.55024115943266]
We introduce MindFormer, a novel semantic alignment method for multi-subject fMRI signals.
This model is specifically designed to generate fMRI-conditioned feature vectors that can be used for conditioning the Stable Diffusion model for fMRI-to-image generation or a large language model (LLM) for fMRI-to-text generation.
Our experimental results demonstrate that MindFormer generates semantically consistent images and text across different subjects.
arXiv Detail & Related papers (2024-05-28T00:36:25Z) - TokenUnify: Scaling Up Autoregressive Pretraining for Neuron Segmentation [65.65530016765615]
We propose a hierarchical predictive coding framework that captures multi-scale dependencies through three complementary learning objectives. TokenUnify integrates random token prediction, next-token prediction, and next-all token prediction to create a comprehensive representational space. We also introduce a large-scale EM dataset with 1.2 billion annotated voxels, offering ideal long-sequence visual data with spatial continuity.
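The three objectives can be sketched generically as follows, with a model assumed to map token ids to per-position logits; the mask id, prefix split, and multi-label reading of "next-all" prediction are illustrative assumptions, not the paper's exact losses.

```python
import torch
import torch.nn.functional as F

def combined_token_objectives(model, tokens: torch.Tensor, mask_id: int = 0,
                              mask_ratio: float = 0.15) -> torch.Tensor:
    """Mix three token-level objectives on one (batch, seq) id tensor.
    `model(ids)` is assumed to return (batch, seq, vocab) logits."""
    B, T = tokens.shape

    # (1) Random token prediction: mask random positions, recover originals.
    mask = torch.rand(B, T) < mask_ratio
    logits = model(tokens.masked_fill(mask, mask_id))
    loss_random = F.cross_entropy(logits[mask], tokens[mask])

    # (2) Next-token prediction: causal LM loss, targets shifted by one.
    logits = model(tokens[:, :-1])
    loss_next = F.cross_entropy(logits.reshape(-1, logits.size(-1)),
                                tokens[:, 1:].reshape(-1))

    # (3) "Next-all" prediction, read here as multi-label: from a prefix,
    #     score every token id that occurs anywhere in the remainder.
    logits = model(tokens[:, : T // 2])[:, -1]      # (batch, vocab)
    future = torch.zeros_like(logits)
    future.scatter_(1, tokens[:, T // 2:], 1.0)     # multi-hot over future ids
    loss_all = F.binary_cross_entropy_with_logits(logits, future)

    return loss_random + loss_next + loss_all
```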
arXiv Detail & Related papers (2024-05-27T05:45:51Z) - Interpretable Spatio-Temporal Embedding for Brain Structural-Effective Network with Ordinary Differential Equation [56.34634121544929]
In this study, we first construct the brain-effective network via the dynamic causal model.
We then introduce an interpretable graph learning framework termed Spatio-Temporal Embedding ODE (STE-ODE).
This framework incorporates specifically designed directed node embedding layers, aiming at capturing the dynamic interplay between structural and effective networks.
arXiv Detail & Related papers (2024-05-21T20:37:07Z) - EEG-Deformer: A Dense Convolutional Transformer for Brain-computer Interfaces [17.524441950422627]
We introduce EEG-Deformer, which incorporates two novel components into a CNN-Transformer architecture.
EEG-Deformer learns from neurophysiologically meaningful brain regions for the corresponding cognitive tasks.
arXiv Detail & Related papers (2024-04-25T18:00:46Z) - MindBridge: A Cross-Subject Brain Decoding Framework [60.58552697067837]
Brain decoding aims to reconstruct stimuli from acquired brain signals.
Currently, brain decoding is confined to a per-subject-per-model paradigm.
We present MindBridge, which achieves cross-subject brain decoding with a single model.
arXiv Detail & Related papers (2024-04-11T15:46:42Z) - A Knowledge-Driven Cross-view Contrastive Learning for EEG Representation [48.85731427874065]
This paper proposes a knowledge-driven cross-view contrastive learning framework (KDC2) to extract effective representations from EEG with limited labels.
The KDC2 method creates scalp and neural views of EEG signals, simulating the internal and external representation of brain activity.
By modeling prior neural knowledge based on neural information consistency theory, the proposed method extracts invariant and complementary neural knowledge to generate combined representations.
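Cross-view objectives of this kind are typically instantiated as a symmetric InfoNCE loss between paired view embeddings; the sketch below shows that standard form, with matching scalp/neural pairs as positives, and is not KDC2's exact knowledge-driven objective.

```python
import torch
import torch.nn.functional as F

def cross_view_info_nce(z_scalp: torch.Tensor, z_neural: torch.Tensor,
                        temperature: float = 0.1) -> torch.Tensor:
    """InfoNCE between two views of the same EEG segments: matching
    (scalp, neural) pairs are positives, all other pairings negatives."""
    z1 = F.normalize(z_scalp, dim=-1)    # (batch, dim), unit norm
    z2 = F.normalize(z_neural, dim=-1)
    logits = z1 @ z2.t() / temperature   # (batch, batch) similarity matrix
    targets = torch.arange(z1.size(0))   # positives lie on the diagonal
    # Symmetrize over scalp->neural and neural->scalp directions.
    return 0.5 * (F.cross_entropy(logits, targets) +
                  F.cross_entropy(logits.t(), targets))
```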
arXiv Detail & Related papers (2023-09-21T08:53:51Z) - fMRI from EEG is only Deep Learning away: the use of interpretable DL to unravel EEG-fMRI relationships [68.8204255655161]
We present an interpretable, domain-grounded solution to recover the activity of several subcortical regions from multichannel EEG data.
We recover individual spatial and time-frequency patterns of scalp EEG predictive of the hemodynamic signal in the subcortical nuclei.
arXiv Detail & Related papers (2022-10-23T15:11:37Z)