From Video to EEG: Adapting Joint Embedding Predictive Architecture to Uncover Visual Concepts in Brain Signal Analysis
- URL: http://arxiv.org/abs/2507.03633v4
- Date: Fri, 11 Jul 2025 18:45:36 GMT
- Title: From Video to EEG: Adapting Joint Embedding Predictive Architecture to Uncover Visual Concepts in Brain Signal Analysis
- Authors: Amirabbas Hojjati, Lu Li, Ibrahim Hameed, Anis Yazidi, Pedro G. Lind, Rabindra Khadka
- Abstract summary: EEG signals capture brain activity with high temporal and low spatial resolution, supporting applications such as neurological diagnosis, cognitive monitoring, and brain-computer interfaces. We propose EEG-VJEPA, a novel adaptation of the Video Joint Embedding Predictive Architecture (V-JEPA) for EEG classification. By treating EEG as video-like sequences, EEG-VJEPA learns semantically meaningful representations using joint embeddings and adaptive masking.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: EEG signals capture brain activity with high temporal and low spatial resolution, supporting applications such as neurological diagnosis, cognitive monitoring, and brain-computer interfaces. However, effective analysis is hindered by limited labeled data, high dimensionality, and the absence of scalable models that fully capture spatiotemporal dependencies. Existing self-supervised learning (SSL) methods often focus on either spatial or temporal features, leading to suboptimal representations. To address this, we propose EEG-VJEPA, a novel adaptation of the Video Joint Embedding Predictive Architecture (V-JEPA) for EEG classification. By treating EEG as video-like sequences, EEG-VJEPA learns semantically meaningful spatiotemporal representations using joint embeddings and adaptive masking. To our knowledge, this is the first work that exploits V-JEPA for EEG classification and explores the visual concepts learned by the model. Evaluations on the publicly available Temple University Hospital (TUH) Abnormal EEG dataset show that EEG-VJEPA outperforms existing state-of-the-art models in classification accuracy. Beyond classification accuracy, EEG-VJEPA captures physiologically relevant spatial and temporal signal patterns, offering interpretable embeddings that may support human-AI collaboration in diagnostic workflows. These findings position EEG-VJEPA as a promising framework for scalable, trustworthy EEG analysis in real-world clinical settings.
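The core idea, treating a multichannel EEG recording as a video-like token sequence and predicting masked tokens in latent space, can be sketched compactly. Following the JEPA recipe, the target encoder is an exponential-moving-average (EMA) copy of the context encoder and receives no gradients; the sizes, the uniform mask layout, and the mask-token shortcut below are illustrative assumptions, not the paper's published configuration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Illustrative sizes only; the paper's exact configuration is not reproduced here.
C, T, FRAME, DIM = 21, 3000, 100, 128   # channels, samples, samples per "frame", width
N_TOKENS = T // FRAME                   # one token per (C x FRAME) slice

def encoder():
    layer = nn.TransformerEncoderLayer(d_model=DIM, nhead=4, batch_first=True)
    return nn.TransformerEncoder(layer, num_layers=2)

class EEGJepa(nn.Module):
    def __init__(self):
        super().__init__()
        self.embed = nn.Linear(C * FRAME, DIM)
        self.pos = nn.Parameter(torch.randn(1, N_TOKENS, DIM) * 0.02)
        self.mask_token = nn.Parameter(torch.zeros(DIM))
        self.context_enc, self.target_enc, self.predictor = encoder(), encoder(), encoder()
        for p in self.target_enc.parameters():   # target branch: EMA only, no gradients
            p.requires_grad = False

    def tokenize(self, x):                       # x: (batch, C, T) raw EEG
        b = x.shape[0]
        frames = x.unfold(-1, FRAME, FRAME)      # (batch, C, N_TOKENS, FRAME)
        frames = frames.permute(0, 2, 1, 3).reshape(b, N_TOKENS, C * FRAME)
        return self.embed(frames) + self.pos     # video-like token sequence

    def loss(self, x, mask_ratio=0.5):
        tokens = self.tokenize(x)
        tgt = torch.randperm(N_TOKENS)[:int(N_TOKENS * mask_ratio)]
        with torch.no_grad():                    # latent targets from the EMA encoder
            target = self.target_enc(tokens)[:, tgt]
        inp = tokens.clone()
        inp[:, tgt] = self.mask_token            # hide the regions to be predicted
        pred = self.predictor(self.context_enc(inp))[:, tgt]
        return F.mse_loss(pred, target)          # prediction happens in embedding space

@torch.no_grad()
def ema_update(model, tau=0.996):
    for pt, pc in zip(model.target_enc.parameters(), model.context_enc.parameters()):
        pt.mul_(tau).add_(pc, alpha=1 - tau)

model = EEGJepa()
loss = model.loss(torch.randn(4, C, T))
loss.backward()
ema_update(model)
```

In the full method the masking is structured over space and time rather than uniformly random, and the predictor receives positional information about which tokens to predict; this sketch conveys only the latent-prediction objective.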
Related papers
- BrainOmni: A Brain Foundation Model for Unified EEG and MEG Signals
This paper proposes BrainOmni, the first brain foundation model that generalises across heterogeneous EEG and MEG recordings. To unify diverse data sources, we introduce BrainTokenizer, the first tokenizer that quantises neural brain activity into discrete representations. A total of 1,997 hours of EEG and 656 hours of MEG data are curated and standardised from publicly available sources for pretraining. (A generic quantization sketch follows this entry.)
arXiv Detail & Related papers (2025-05-18T14:07:14Z)
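The summary only names BrainTokenizer; as a generic illustration of quantising continuous neural activity into discrete tokens, the sketch below applies a VQ-VAE-style codebook lookup with straight-through gradients. Every dimension and name here is an assumption for illustration, not BrainOmni's actual tokenizer.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class VQTokenizer(nn.Module):
    """VQ-VAE-style quantizer: snap each window embedding to its nearest
    codebook entry, yielding one discrete token id per EEG window."""
    def __init__(self, in_dim=512, code_dim=64, num_codes=1024):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(in_dim, 256), nn.GELU(),
                                     nn.Linear(256, code_dim))
        self.codebook = nn.Embedding(num_codes, code_dim)

    def forward(self, x):                              # x: (batch, windows, in_dim)
        z = self.encoder(x)                            # continuous latents
        dist = torch.cdist(z, self.codebook.weight.unsqueeze(0))
        ids = dist.argmin(dim=-1)                      # discrete token ids
        q = self.codebook(ids)                         # quantized latents
        q_st = z + (q - z).detach()                    # straight-through gradients
        vq_loss = F.mse_loss(z, q.detach()) + F.mse_loss(q, z.detach())
        return ids, q_st, vq_loss

tok = VQTokenizer()
ids, quantized, vq_loss = tok(torch.randn(2, 30, 512))  # e.g., 30 windows per record
print(ids.shape)                                        # torch.Size([2, 30])
```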
- BioSerenity-E1: a self-supervised EEG model for medical applications
BioSerenity-E1 is a family of self-supervised foundation models for clinical EEG applications. It combines spectral tokenization with masked prediction to achieve state-of-the-art performance across relevant diagnostic tasks. (See the sketch after this entry.)
arXiv Detail & Related papers (2025-03-13T13:42:46Z)
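"Spectral tokenization with masked prediction" can be pictured as: turn each STFT frame into a token, hide a fraction of frames, and train a Transformer to reconstruct the hidden spectra. The sketch below follows that generic recipe; BioSerenity-E1's real tokenizer, losses, and scales are not specified in this summary and are assumed here.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def spectral_tokens(x, n_fft=64, hop=32):
    """x: (batch, channels, time) -> (batch, frames, channels * freq_bins) log-power tokens."""
    b, c, t = x.shape
    win = torch.hann_window(n_fft)
    spec = torch.stft(x.reshape(b * c, t), n_fft=n_fft, hop_length=hop,
                      window=win, return_complex=True).abs()
    spec = torch.log1p(spec).reshape(b, c, *spec.shape[-2:])
    return spec.permute(0, 3, 1, 2).flatten(2)          # one token per STFT frame

class MaskedSpectralModel(nn.Module):
    def __init__(self, token_dim, dim=128):
        super().__init__()
        self.inp = nn.Linear(token_dim, dim)
        self.mask = nn.Parameter(torch.zeros(dim))       # learnable mask embedding
        layer = nn.TransformerEncoderLayer(dim, 4, batch_first=True)
        self.backbone = nn.TransformerEncoder(layer, 2)
        self.out = nn.Linear(dim, token_dim)

    def loss(self, tokens, ratio=0.3):
        h = self.inp(tokens)
        masked = torch.rand(h.shape[:2], device=h.device) < ratio
        h[masked] = self.mask                            # hide a fraction of frames
        pred = self.out(self.backbone(h))
        return F.mse_loss(pred[masked], tokens[masked])  # score only hidden frames

x = torch.randn(4, 21, 2000)                             # 4 clips, 21 EEG channels
tok = spectral_tokens(x)
model = MaskedSpectralModel(token_dim=tok.shape[-1])
print(model.loss(tok))
```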
- Large Cognition Model: Towards Pretrained EEG Foundation Model
We propose a transformer-based foundation model designed to generalize across diverse EEG datasets and downstream tasks. Our findings highlight the potential of pretrained EEG foundation models to accelerate advancements in neuroscience, personalized medicine, and BCI technology.
arXiv Detail & Related papers (2025-02-11T04:28:10Z)
- CognitionCapturer: Decoding Visual Stimuli From Human EEG Signal With Multimodal Information
We propose CognitionCapturer, a unified framework that fully leverages multimodal data to represent EEG signals. Specifically, CognitionCapturer trains a Modality Expert for each modality to extract cross-modal information from the EEG modality. The framework does not require any fine-tuning of the generative models and can be extended to incorporate more modalities. (A contrastive-alignment sketch follows this entry.)
arXiv Detail & Related papers (2024-12-13T16:27:54Z)
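A "Modality Expert" can be read as a head that projects EEG features into a frozen embedding space of another modality (for example, image embeddings) and is trained contrastively. The sketch below shows that generic pattern with a symmetric InfoNCE loss; the dimensions and the upstream encoders are placeholders, not CognitionCapturer's actual components.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ModalityExpert(nn.Module):
    """Maps EEG features into the embedding space of another (frozen) modality."""
    def __init__(self, eeg_dim=256, target_dim=512):
        super().__init__()
        self.head = nn.Sequential(nn.Linear(eeg_dim, 512), nn.GELU(),
                                  nn.Linear(512, target_dim))

    def forward(self, eeg_feat):
        return F.normalize(self.head(eeg_feat), dim=-1)

def clip_style_loss(eeg_emb, target_emb, temperature=0.07):
    """Symmetric InfoNCE: matching EEG/target pairs share an index in the batch."""
    logits = eeg_emb @ target_emb.t() / temperature
    labels = torch.arange(len(logits), device=logits.device)
    return (F.cross_entropy(logits, labels) +
            F.cross_entropy(logits.t(), labels)) / 2

eeg_feat = torch.randn(16, 256)                        # features from some EEG encoder
img_emb = F.normalize(torch.randn(16, 512), dim=-1)    # frozen image embeddings
expert = ModalityExpert()
loss = clip_style_loss(expert(eeg_feat), img_emb)
```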
- FoME: A Foundation Model for EEG using Adaptive Temporal-Lateral Attention Scaling
FoME (Foundation Model for EEG) is a novel approach using adaptive temporal-lateral attention scaling.
FoME is pre-trained on a diverse 1.7TB dataset of scalp and intracranial EEG recordings; the 745M-parameter model is trained for 1,096k steps.
arXiv Detail & Related papers (2024-09-19T04:22:40Z)
- REST: Efficient and Accelerated EEG Seizure Analysis through Residual State Updates
This paper introduces a novel graph-based residual state update mechanism (REST) for real-time EEG signal analysis.
By leveraging a combination of graph neural networks and recurrent structures, REST efficiently captures both non-Euclidean geometry and temporal dependencies within EEG data.
Our model demonstrates high accuracy in both seizure detection and classification tasks. (A minimal sketch of the residual state-update idea follows this entry.)
arXiv Detail & Related papers (2024-06-03T16:30:19Z)
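A graph-plus-recurrence design of this kind can be sketched as a per-time-step state update: electrode features are mixed through a (here learnable) adjacency matrix and added residually to a hidden state. This is one minimal reading of "residual state updates", not REST's published architecture.

```python
import torch
import torch.nn as nn

class ResidualGraphState(nn.Module):
    """Streams EEG one step at a time; the electrode graph is a learnable
    adjacency matrix and the hidden state is updated residually."""
    def __init__(self, n_nodes, in_dim, hidden):
        super().__init__()
        self.adj = nn.Parameter(torch.eye(n_nodes))        # learnable electrode graph
        self.inp = nn.Linear(in_dim, hidden)
        self.mix = nn.Linear(hidden, hidden)

    def forward(self, x, h):
        """x: (batch, nodes, in_dim) one time step; h: (batch, nodes, hidden)."""
        msg = torch.einsum('ij,bjd->bid', torch.softmax(self.adj, -1),
                           self.mix(h))                    # neighbour aggregation
        return h + torch.tanh(self.inp(x) + msg)           # residual state update

model = ResidualGraphState(n_nodes=19, in_dim=8, hidden=32)
h = torch.zeros(4, 19, 32)
for t in range(10):                                        # stream 10 EEG steps
    h = model(torch.randn(4, 19, 8), h)
logits = h.mean(dim=1)                                     # pooled state for a classifier head
```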
- Joint Contrastive Learning with Feature Alignment for Cross-Corpus EEG-based Emotion Recognition
We propose a novel Joint Contrastive learning framework with Feature Alignment (JCFA) to address cross-corpus EEG-based emotion recognition.
In the pre-training stage, a joint domain contrastive learning strategy is introduced to characterize generalizable time-frequency representations of EEG signals.
In the fine-tuning stage, JCFA is refined in conjunction with downstream tasks, taking the structural connections among brain electrodes into account. (A sketch of the contrastive core follows this entry.)
arXiv Detail & Related papers (2024-04-15T08:21:17Z)
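The joint contrastive stage can be illustrated with a standard two-view NT-Xent objective, where two augmentations of the same EEG segment form a positive pair. JCFA's specific augmentations, time-frequency encoder, and feature-alignment term are omitted; this is the generic contrastive core only.

```python
import torch
import torch.nn.functional as F

def nt_xent(z1, z2, temperature=0.5):
    """NT-Xent: each embedding's positive is the other view of the same segment."""
    z = F.normalize(torch.cat([z1, z2]), dim=-1)           # 2N embeddings
    sim = z @ z.t() / temperature
    sim.fill_diagonal_(float('-inf'))                      # exclude self-pairs
    n = len(z1)
    targets = torch.cat([torch.arange(n, 2 * n), torch.arange(0, n)])
    return F.cross_entropy(sim, targets)

view1 = torch.randn(32, 128)   # embeddings of augmentation 1 (e.g., time masking)
view2 = torch.randn(32, 128)   # embeddings of augmentation 2 (e.g., frequency masking)
loss = nt_xent(view1, view2)
```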
- DGSD: Dynamical Graph Self-Distillation for EEG-Based Auditory Spatial Attention Detection
Auditory Attention Detection (AAD) aims to detect the target speaker from brain signals in a multi-speaker environment.
Current approaches primarily rely on traditional convolutional neural networks designed for processing Euclidean data like images.
This paper proposes a dynamical graph self-distillation (DGSD) approach for AAD, which does not require speech stimuli as input. (A generic self-distillation sketch follows this entry.)
arXiv Detail & Related papers (2023-09-07T13:43:46Z)
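Self-distillation, in the generic sense, trains a shallow branch of a network against the softened predictions of its own deeper branch, with no external teacher. The graph layers below are stand-ins (plain linear maps mixed through an adjacency matrix); DGSD's dynamical graph construction is not reproduced here.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SelfDistilledGraphNet(nn.Module):
    """Deep 'teacher' path supervises a shallow 'student' path of the same net."""
    def __init__(self, n_nodes=64, dim=32, n_classes=2):
        super().__init__()
        self.gc1 = nn.Linear(dim, dim)        # stand-ins for graph conv layers
        self.gc2 = nn.Linear(dim, dim)
        self.head_student = nn.Linear(dim, n_classes)
        self.head_teacher = nn.Linear(dim, n_classes)

    def forward(self, x, adj):                # x: (batch, nodes, dim)
        h1 = torch.relu(adj @ self.gc1(x))    # shallow features
        h2 = torch.relu(adj @ self.gc2(h1))   # deep features
        return self.head_student(h1.mean(1)), self.head_teacher(h2.mean(1))

def self_distill_loss(student_logits, teacher_logits, labels, T=2.0, alpha=0.5):
    hard = F.cross_entropy(teacher_logits, labels)
    soft = F.kl_div(F.log_softmax(student_logits / T, -1),
                    F.softmax(teacher_logits.detach() / T, -1),
                    reduction='batchmean') * T * T         # soften both distributions
    return hard + alpha * soft

net = SelfDistilledGraphNet()
adj = torch.softmax(torch.randn(64, 64), -1)               # toy electrode graph
s, t = net(torch.randn(8, 64, 32), adj)
loss = self_distill_loss(s, t, torch.randint(0, 2, (8,)))
```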
- fMRI from EEG is only Deep Learning away: the use of interpretable DL to unravel EEG-fMRI relationships
We present an interpretable, domain-grounded solution to recover the activity of several subcortical regions from multichannel EEG data.
We recover individual spatial and time-frequency patterns of scalp EEG predictive of the hemodynamic signal in the subcortical nuclei.
arXiv Detail & Related papers (2022-10-23T15:11:37Z)
- Uncovering the structure of clinical EEG signals with self-supervised learning
Supervised learning paradigms are often limited by the amount of labeled data that is available.
This phenomenon is particularly problematic in clinically relevant data, such as electroencephalography (EEG).
By extracting information from unlabeled data, it might be possible to reach competitive performance with deep neural networks.
arXiv Detail & Related papers (2020-07-31T14:34:47Z)
- Machine-Learning-Based Diagnostics of EEG Pathology
We develop a feature-based EEG analysis framework and compare it to state-of-the-art end-to-end methods.
We find accuracies for both approaches in an astonishingly narrow range of 81-86%.
We argue that the accuracies of current binary EEG pathology decoders could saturate near 90% due to the imperfect inter-rater agreement of the clinical labels. (A toy feature-based pipeline follows this entry.)
arXiv Detail & Related papers (2020-02-11T17:12:24Z)
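A feature-based decoder of the kind compared in this paper can be as simple as per-channel band powers plus a linear classifier. The sketch below uses Welch band power and logistic regression on synthetic data purely to show the pipeline shape; the paper's actual feature set is far richer.

```python
import numpy as np
from scipy.signal import welch
from sklearn.linear_model import LogisticRegression

BANDS = {'delta': (1, 4), 'theta': (4, 8), 'alpha': (8, 13), 'beta': (13, 30)}

def band_powers(eeg, fs=250):
    """eeg: (channels, samples) -> flat vector of per-channel log band powers."""
    freqs, psd = welch(eeg, fs=fs, nperseg=fs * 2, axis=-1)
    feats = [psd[:, (freqs >= lo) & (freqs < hi)].mean(axis=-1)
             for lo, hi in BANDS.values()]
    return np.log(np.stack(feats, axis=-1)).ravel()   # log power is closer to Gaussian

rng = np.random.default_rng(0)
X = np.stack([band_powers(rng.standard_normal((21, 2500))) for _ in range(40)])
y = rng.integers(0, 2, size=40)                       # toy normal/abnormal labels
clf = LogisticRegression(max_iter=1000).fit(X, y)
print(clf.score(X, y))
```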