ALFEE: Adaptive Large Foundation Model for EEG Representation
- URL: http://arxiv.org/abs/2505.06291v1
- Date: Wed, 07 May 2025 13:32:31 GMT
- Title: ALFEE: Adaptive Large Foundation Model for EEG Representation
- Authors: Wei Xiong, Junming Lin, Jiangtong Li, Jie Li, Changjun Jiang
- Abstract summary: We propose the Adaptive Large Foundation model for EEG signal representation (ALFEE) framework. ALFEE is a novel hybrid transformer architecture with two learning stages for robust EEG representation learning. After 25,000 hours of pretraining, extensive experimental results on six downstream EEG tasks demonstrate the superior performance of ALFEE over existing models.
- Score: 17.166788472910806
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: While foundation models excel in text, image, and video domains, critical biological signals, particularly electroencephalography (EEG), remain underexplored. EEG benefits neurological research with its high temporal resolution, operational practicality, and safety profile. However, low signal-to-noise ratio, inter-subject variability, and cross-paradigm differences hinder the generalization of current models. Existing methods often employ simplified strategies, such as a single loss function or a channel-temporal joint representation module, and suffer from a domain gap between pretraining and evaluation tasks that compromises efficiency and adaptability. To address these limitations, we propose the Adaptive Large Foundation model for EEG signal representation (ALFEE) framework, a novel hybrid transformer architecture with two learning stages for robust EEG representation learning. ALFEE employs a hybrid attention mechanism that separates channel-wise feature aggregation from temporal dynamics modeling, enabling robust EEG representation with variable channel configurations. A channel encoder adaptively compresses variable channel information, a temporal encoder captures task-guided evolution, and a hybrid decoder reconstructs signals in both temporal and frequency domains. During pretraining, ALFEE optimizes task prediction, channel and temporal mask reconstruction, and temporal forecasting to enhance multi-scale and multi-channel representation. During fine-tuning, full-model adaptation with a task-specific token dictionary and a cross-attention layer boosts performance across multiple tasks. After 25,000 hours of pretraining, extensive experimental results on six downstream EEG tasks demonstrate the superior performance of ALFEE over existing models. Our ALFEE framework establishes a scalable foundation for biological signal analysis, with an implementation at https://github.com/xw1216/ALFEE.
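As a rough illustration of the channel/temporal factorization described in the abstract, here is a minimal PyTorch sketch; the module names, the learned-query compression, and all sizes are assumptions for illustration, not the authors' implementation.

```python
# Sketch only: names, the learned-query compression, and all sizes are
# illustrative assumptions, not the ALFEE implementation.
import torch
import torch.nn as nn

class HybridEEGEncoder(nn.Module):
    def __init__(self, d_model=256, n_heads=8, n_queries=32):
        super().__init__()
        # Learned queries compress a variable number of channels
        # into a fixed-size representation per time step.
        self.channel_queries = nn.Parameter(torch.randn(n_queries, d_model))
        self.channel_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        layer = nn.TransformerEncoderLayer(
            d_model, n_heads, dim_feedforward=4 * d_model, batch_first=True)
        self.temporal_encoder = nn.TransformerEncoder(layer, num_layers=4)

    def forward(self, x):
        # x: (batch, channels, time, d_model) per-channel patch embeddings
        b, c, t, d = x.shape
        # Channel-wise aggregation: attend over channels at each time step.
        kv = x.permute(0, 2, 1, 3).reshape(b * t, c, d)
        q = self.channel_queries.unsqueeze(0).expand(b * t, -1, -1)
        z, _ = self.channel_attn(q, kv, kv)      # (b*t, n_queries, d)
        z = z.mean(dim=1).reshape(b, t, d)       # fixed-size per time step
        # Temporal dynamics modeling over the compressed sequence.
        return self.temporal_encoder(z)          # (b, t, d)
```

A channel encoder built this way accepts any number of input channels, since attention pools them into a fixed set of query tokens before the temporal encoder runs.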
Related papers
- CRIA: A Cross-View Interaction and Instance-Adapted Pre-training Framework for Generalizable EEG Representations [52.251569042852815]
CRIA is an adaptive framework that utilizes variable-length and variable-channel coding to achieve a unified representation of EEG data across different datasets. The model employs a cross-attention mechanism to fuse temporal, spectral, and spatial features effectively. Experimental results on the Temple University EEG corpus and the CHB-MIT dataset show that CRIA outperforms existing methods under the same pre-training conditions.
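A hedged sketch of cross-attention fusion over three feature views (temporal, spectral, spatial), loosely following the summary above; shapes and module names are assumptions.

```python
# Illustrative cross-view fusion: temporal tokens query the other views.
# Not the CRIA implementation; all names and shapes are assumptions.
import torch
import torch.nn as nn

class CrossViewFusion(nn.Module):
    def __init__(self, d_model=128, n_heads=4):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.norm = nn.LayerNorm(d_model)

    def forward(self, temporal, spectral, spatial):
        # Each view: (batch, tokens, d_model). Temporal tokens attend to the
        # concatenated spectral and spatial tokens.
        context = torch.cat([spectral, spatial], dim=1)
        fused, _ = self.attn(temporal, context, context)
        return self.norm(temporal + fused)  # residual fusion
```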
arXiv Detail & Related papers (2025-06-19T06:31:08Z)
- PhysioWave: A Multi-Scale Wavelet-Transformer for Physiological Signal Representation [18.978031999678507]
A novel wavelet-based approach for physiological signal analysis is presented, aiming to capture multi-scale time-frequency features in various physiological signals. Two large-scale pretrained models specific to EMG and ECG are introduced for the first time, achieving superior performance and setting new baselines in downstream tasks. A unified multi-modal framework is constructed by integrating a pretrained EEG model, where each modality is guided through its dedicated branch and fused via learnable weighted fusion.
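For intuition, a minimal multi-scale wavelet feature extractor using PyWavelets; the wavelet choice (db4), level count, and log-energy summary are illustrative assumptions, not the paper's configuration.

```python
# Multi-scale wavelet decomposition of a 1-D physiological signal.
# Wavelet, level, and the per-band summary are illustrative choices.
import numpy as np
import pywt

def multiscale_features(signal: np.ndarray, wavelet: str = "db4", level: int = 4):
    # wavedec returns [approx_L, detail_L, ..., detail_1]: coarse-to-fine bands.
    coeffs = pywt.wavedec(signal, wavelet, level=level)
    # One simple per-band summary: log-energy at each scale.
    return np.array([np.log1p(np.sum(c ** 2)) for c in coeffs])

feats = multiscale_features(np.random.randn(1024))
print(feats.shape)  # (level + 1,) = (5,)
```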
arXiv Detail & Related papers (2025-06-12T05:11:41Z)
- Multivariate Long-term Time Series Forecasting with Fourier Neural Filter [55.09326865401653]
We introduce FNF as the backbone and DBD as the architecture, providing excellent learning capabilities and optimal learning pathways for spatial-temporal modeling. We show that FNF unifies local time-domain and global frequency-domain information processing within a single backbone that extends naturally to spatial modeling.
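A minimal sketch of the general idea, assuming a learnable frequency-domain filter for the global path and a depthwise convolution for the local path; this is an illustration, not the FNF implementation.

```python
# Global frequency-domain mixing plus a local time-domain path.
# An assumption-laden illustration, not the paper's FNF layer.
import torch
import torch.nn as nn

class FourierFilterBlock(nn.Module):
    def __init__(self, seq_len=256, d_model=64):
        super().__init__()
        n_freq = seq_len // 2 + 1
        # Learnable complex filter applied in the frequency domain.
        self.filter = nn.Parameter(torch.randn(n_freq, d_model, 2) * 0.02)
        # Local time-domain path: depthwise convolution.
        self.local = nn.Conv1d(d_model, d_model, kernel_size=3,
                               padding=1, groups=d_model)

    def forward(self, x):
        # x: (batch, seq_len, d_model)
        spec = torch.fft.rfft(x, dim=1)                      # global view
        spec = spec * torch.view_as_complex(self.filter)
        global_path = torch.fft.irfft(spec, n=x.size(1), dim=1)
        local_path = self.local(x.transpose(1, 2)).transpose(1, 2)
        return x + global_path + local_path
```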
arXiv Detail & Related papers (2025-06-10T18:40:20Z)
- PSDNorm: Test-Time Temporal Normalization for Deep Learning on EEG Signals [63.05435596565677]
PSDNorm is a layer that leverages Monge mapping and temporal context to normalize feature maps in deep learning models. PSDNorm achieves state-of-the-art performance at test time on datasets not seen during training. PSDNorm provides a significant improvement in robustness, achieving markedly higher F1 scores for the 20% hardest subjects.
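As a crude stand-in for the Monge-mapping normalization, the sketch below rescales each frequency bin toward a reference power spectrum; it illustrates PSD matching only and omits the temporal-context machinery of PSDNorm.

```python
# Simplified PSD-based normalization: match each channel's power spectrum
# to a reference spectrum. A crude stand-in, not the PSDNorm layer.
import torch

def psd_normalize(x: torch.Tensor, ref_psd: torch.Tensor, eps: float = 1e-8):
    # x: (batch, channels, time); ref_psd: (channels, time // 2 + 1)
    spec = torch.fft.rfft(x, dim=-1)
    psd = spec.abs() ** 2
    # Rescale each frequency bin so its power matches the reference.
    gain = torch.sqrt(ref_psd / (psd + eps))
    return torch.fft.irfft(spec * gain, n=x.size(-1), dim=-1)
```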
arXiv Detail & Related papers (2025-03-06T16:20:25Z)
- BHViT: Binarized Hybrid Vision Transformer [53.38894971164072]
Model binarization has made significant progress in enabling real-time and energy-efficient computation for convolutional neural networks (CNNs). We propose BHViT, a binarization-friendly hybrid ViT architecture, and its fully binarized model, guided by three important observations. Our proposed algorithm achieves SOTA performance among binary ViT methods.
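For reference, a standard sign-based weight binarization with a straight-through estimator, which is the generic mechanism binary ViT layers build on; this is not BHViT's specific method.

```python
# Generic straight-through-estimator (STE) weight binarization.
# Illustrates the common mechanism, not BHViT's design.
import torch
import torch.nn as nn

class BinaryLinear(nn.Linear):
    def forward(self, x):
        # Forward uses sign(W) scaled by the mean absolute weight;
        # backward passes gradients straight through to the real weights.
        scale = self.weight.abs().mean()
        w_bin = torch.sign(self.weight) * scale
        w = self.weight + (w_bin - self.weight).detach()  # STE trick
        return nn.functional.linear(x, w, self.bias)
```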
arXiv Detail & Related papers (2025-03-04T08:35:01Z)
- CEReBrO: Compact Encoder for Representations of Brain Oscillations Using Efficient Alternating Attention [53.539020807256904]
We introduce a Compact Encoder for Representations of Brain Oscillations using alternating attention (CEReBrO). Our tokenization scheme represents EEG signals with per-channel patches. We propose an alternating attention mechanism that jointly models intra-channel temporal dynamics and inter-channel spatial correlations, achieving a 2x speed improvement with 6x less memory compared to standard self-attention.
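A minimal sketch of the alternating-attention idea over a (channels x patches) EEG token grid: one attention pass within each channel over time, then one across channels at each time step. All shapes and sizes are assumptions.

```python
# Alternating attention sketch: intra-channel temporal attention followed
# by inter-channel spatial attention. Illustrative, not CEReBrO's code.
import torch
import torch.nn as nn

class AlternatingAttention(nn.Module):
    def __init__(self, d_model=128, n_heads=4):
        super().__init__()
        self.temporal = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.spatial = nn.MultiheadAttention(d_model, n_heads, batch_first=True)

    def forward(self, x):
        # x: (batch, channels, patches, d_model)
        b, c, p, d = x.shape
        t = x.reshape(b * c, p, d)                # within each channel, over time
        t = t + self.temporal(t, t, t)[0]
        s = t.reshape(b, c, p, d).permute(0, 2, 1, 3).reshape(b * p, c, d)
        s = s + self.spatial(s, s, s)[0]          # across channels, per time step
        return s.reshape(b, p, c, d).permute(0, 2, 1, 3)
```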
arXiv Detail & Related papers (2025-01-18T21:44:38Z)
- FoME: A Foundation Model for EEG using Adaptive Temporal-Lateral Attention Scaling [19.85701025524892]
FoME (Foundation Model for EEG) is a novel approach using adaptive temporal-lateral attention scaling.
FoME is pre-trained on a diverse 1.7TB dataset of scalp and intracranial EEG recordings, and comprises 745M parameters trained for 1,096k steps.
arXiv Detail & Related papers (2024-09-19T04:22:40Z)
- Spatial Adaptation Layer: Interpretable Domain Adaptation For Biosignal Sensor Array Applications [0.7499722271664147]
We propose the Spatial Adaptation Layer (SAL), which can be applied to any biosignal array model. We also introduce learnable baseline normalization (LBN) to reduce baseline fluctuations. Tested on two HD-sEMG gesture recognition datasets, SAL and LBN outperformed standard fine-tuning on regular arrays.
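A hypothetical sketch of the two components named above: a learnable channel-mixing matrix as the spatial adaptation and a learnable per-channel shift/scale as the baseline normalization. Names and initializations are illustrative, not the paper's code.

```python
# Hypothetical spatial adaptation for an electrode array: a learnable
# channel-mixing matrix plus learnable baseline shift/scale per channel.
import torch
import torch.nn as nn

class SpatialAdaptationLayer(nn.Module):
    def __init__(self, n_channels: int):
        super().__init__()
        # Identity initialization: no spatial remapping before adaptation.
        self.mix = nn.Parameter(torch.eye(n_channels))
        self.baseline = nn.Parameter(torch.zeros(n_channels, 1))
        self.scale = nn.Parameter(torch.ones(n_channels, 1))

    def forward(self, x):
        # x: (batch, channels, time)
        x = torch.einsum("oc,bct->bot", self.mix, x)  # spatial remap
        return (x - self.baseline) * self.scale        # baseline normalization
```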
arXiv Detail & Related papers (2024-09-12T14:06:12Z)
- EEGMamba: Bidirectional State Space Model with Mixture of Experts for EEG Multi-task Classification [1.4004287903552533]
We introduce EEGMamba, the first universal EEG classification network to truly implement multi-task learning for EEG applications.
EEGMamba seamlessly integrates the Spatio-Temporal-Adaptive (ST-Adaptive) module, bidirectional Mamba, and Mixture of Experts (MoE) into a unified framework.
We evaluate our model on eight publicly available EEG datasets, and the experimental results demonstrate its superior performance in four types of tasks.
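Of the three components, the MoE routing is the easiest to sketch compactly; below is a toy top-1 mixture-of-experts layer (expert count and sizes are arbitrary assumptions, and the Mamba and ST-Adaptive parts are omitted).

```python
# Toy top-1 mixture-of-experts routing; illustrative only.
import torch
import torch.nn as nn

class TopOneMoE(nn.Module):
    def __init__(self, d_model=128, n_experts=4):
        super().__init__()
        self.gate = nn.Linear(d_model, n_experts)
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, 4 * d_model), nn.GELU(),
                          nn.Linear(4 * d_model, d_model))
            for _ in range(n_experts))

    def forward(self, x):
        # x: (tokens, d_model); route each token to its highest-scoring expert.
        scores = self.gate(x).softmax(dim=-1)
        idx = scores.argmax(dim=-1)
        out = torch.zeros_like(x)
        for e, expert in enumerate(self.experts):
            mask = idx == e
            if mask.any():
                out[mask] = expert(x[mask]) * scores[mask, e].unsqueeze(-1)
        return out
```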
arXiv Detail & Related papers (2024-07-20T11:15:47Z)
- MMA-DFER: MultiModal Adaptation of unimodal models for Dynamic Facial Expression Recognition in-the-wild [81.32127423981426]
Multimodal emotion recognition based on audio and video data is important for real-world applications.
Recent methods have focused on exploiting advances of self-supervised learning (SSL) for pre-training of strong multimodal encoders.
We propose a different perspective on the problem and investigate the advancement of multimodal DFER performance by adapting SSL-pre-trained disjoint unimodal encoders.
arXiv Detail & Related papers (2024-04-13T13:39:26Z)
- DiffiT: Diffusion Vision Transformers for Image Generation [88.08529836125399]
Vision Transformer (ViT) has demonstrated strong modeling capabilities and scalability, especially for recognition tasks.
We study the effectiveness of ViTs in diffusion-based generative learning and propose a new model denoted as Diffusion Vision Transformers (DiffiT).
DiffiT is surprisingly effective in generating high-fidelity images with significantly better parameter efficiency.
arXiv Detail & Related papers (2023-12-04T18:57:01Z)
- ESTformer: Transformer Utilizing Spatiotemporal Dependencies for Electroencephalogram Super-resolution [13.037623259514323]
We develop a transformer-based EEG super-resolution (SR) framework that exploits spatiotemporal dependencies among acquisitions. The results demonstrate the versatility of the Transformer for EEG SR tasks. The superiority of the SR data was verified in EEG-based person identification and emotion recognition tasks.
arXiv Detail & Related papers (2023-12-03T12:26:32Z)
- Affine-Consistent Transformer for Multi-Class Cell Nuclei Detection [76.11864242047074]
We propose a novel Affine-Consistent Transformer (AC-Former), which directly yields a sequence of nucleus positions.
We introduce an Adaptive Affine Transformer (AAT) module, which can automatically learn the key spatial transformations to warp original images for local network training.
Experimental results demonstrate that the proposed method significantly outperforms existing state-of-the-art algorithms on various benchmarks.
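A generic spatial-transformer-style sketch of a module that predicts an affine transform and warps its input, as a stand-in for the AAT idea above; this is not the authors' design.

```python
# Predict an affine transform from the image, then warp with it.
# Standard spatial-transformer pattern, not the AC-Former AAT module.
import torch
import torch.nn as nn
import torch.nn.functional as F

class LearnedAffineWarp(nn.Module):
    def __init__(self, in_ch=3):
        super().__init__()
        self.loc = nn.Sequential(
            nn.Conv2d(in_ch, 8, 7, stride=2, padding=3), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(8, 6))
        # Start from the identity transform.
        self.loc[-1].weight.data.zero_()
        self.loc[-1].bias.data.copy_(
            torch.tensor([1, 0, 0, 0, 1, 0], dtype=torch.float))

    def forward(self, x):
        theta = self.loc(x).view(-1, 2, 3)
        grid = F.affine_grid(theta, x.size(), align_corners=False)
        return F.grid_sample(x, grid, align_corners=False)
```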
arXiv Detail & Related papers (2023-10-22T02:27:02Z)
- Learning Multiscale Consistency for Self-supervised Electron Microscopy Instance Segmentation [48.267001230607306]
We propose a pretraining framework that enhances multiscale consistency in EM volumes.
Our approach leverages a Siamese network architecture, integrating strong and weak data augmentations.
It effectively captures voxel and feature consistency, showing promise for learning transferable representations for EM analysis.
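A minimal Siamese consistency objective with weak/strong augmented views, in the spirit of the summary above; the stop-gradient and cosine loss are generic stand-ins, not the paper's exact objective.

```python
# Siamese consistency between two augmented views of the same volume.
# Generic objective for illustration, not the paper's exact loss.
import torch
import torch.nn.functional as F

def consistency_loss(encoder, weak_view: torch.Tensor, strong_view: torch.Tensor):
    # Encode both views with shared weights.
    z_weak = encoder(weak_view)
    z_strong = encoder(strong_view)
    # Stop-gradient on the weak branch; pull the strong view toward it.
    return 1 - F.cosine_similarity(z_strong, z_weak.detach(), dim=-1).mean()
```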
arXiv Detail & Related papers (2023-08-19T05:49:13Z)
This list is automatically generated from the titles and abstracts of the papers on this site.