Generalizability Under Sensor Failure: Tokenization + Transformers Enable More Robust Latent Spaces
- URL: http://arxiv.org/abs/2402.18546v3
- Date: Tue, 19 Mar 2024 21:54:05 GMT
- Title: Generalizability Under Sensor Failure: Tokenization + Transformers Enable More Robust Latent Spaces
- Authors: Geeling Chau, Yujin An, Ahamed Raffey Iqbal, Soon-Jo Chung, Yisong Yue, Sabera Talukder
- Abstract summary: A major goal in neuroscience is to discover neural data representations that generalize.
Recent work has begun to address generalization across sessions and subjects, but few study robustness to sensor failure.
We first collect our own electroencephalography dataset with numerous sessions, subjects, and sensors, then study two time series models: EEGNet and TOTEM.
- Score: 24.52935957415906
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: A major goal in neuroscience is to discover neural data representations that generalize. This goal is challenged by variability along recording sessions (e.g. environment), subjects (e.g. varying neural structures), and sensors (e.g. sensor noise), among others. Recent work has begun to address generalization across sessions and subjects, but few study robustness to sensor failure, which is highly prevalent in neuroscience experiments. In order to address these generalizability dimensions, we first collect our own electroencephalography dataset with numerous sessions, subjects, and sensors, then study two time series models: EEGNet (Lawhern et al., 2018) and TOTEM (Talukder et al., 2024). EEGNet is a widely used convolutional neural network, while TOTEM is a discrete time series tokenizer and transformer model. We find that TOTEM outperforms or matches EEGNet across all generalizability cases. Finally, through analysis of TOTEM's latent codebook, we observe that tokenization enables generalization.
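To make the tokenization idea concrete, the sketch below maps windows of a continuous signal to their nearest codebook entries and feeds the resulting tokens to a transformer. This illustrates discrete time series tokenization in general, not TOTEM's actual implementation; the window length, codebook size, and model dimensions are arbitrary assumptions.

```python
import torch

# Hypothetical sizes; TOTEM's actual hyperparameters may differ.
CODEBOOK_SIZE, WINDOW = 256, 16

torch.manual_seed(0)
codebook = torch.randn(CODEBOOK_SIZE, WINDOW)  # learned in practice (VQ-VAE style)

def tokenize(signal: torch.Tensor) -> torch.Tensor:
    """Map a 1-D signal to discrete tokens by nearest-codebook lookup."""
    windows = signal.unfold(0, WINDOW, WINDOW)   # (n_windows, WINDOW)
    dists = torch.cdist(windows, codebook)       # distance to every code
    return dists.argmin(dim=1)                   # one token id per window

signal = torch.randn(1024)          # stand-in for one EEG channel
tokens = tokenize(signal)           # (64,) integer tokens

# A standard transformer encoder then consumes the embedded tokens.
embed = torch.nn.Embedding(CODEBOOK_SIZE, 64)
encoder = torch.nn.TransformerEncoder(
    torch.nn.TransformerEncoderLayer(d_model=64, nhead=4, batch_first=True),
    num_layers=2,
)
latent = encoder(embed(tokens)[None])   # (1, 64, 64) latent representation
print(tokens.shape, latent.shape)
```

Because each window collapses to one of a small set of discrete codes, noise on a sensor only perturbs the tokens it touches, one intuition consistent with the paper's codebook analysis.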
Related papers
- Scaling Wearable Foundation Models [54.93979158708164]
We investigate the scaling properties of sensor foundation models across compute, data, and model size.
Using a dataset of up to 40 million hours of in-situ heart rate, heart rate variability, electrodermal activity, accelerometer, skin temperature, and altimeter per-minute data from over 165,000 people, we create LSM.
Our results establish the scaling laws of LSM for tasks such as imputation and extrapolation, both across time and sensor modalities.
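Scaling laws of this kind are conventionally summarized as a power law, loss ≈ a · N^(−b), in data, compute, or model size N. A minimal fitting sketch on synthetic numbers (not the paper's measurements):

```python
import numpy as np

# Synthetic (model size, validation loss) pairs -- illustrative only,
# NOT measurements from the paper.
n_params = np.array([1e6, 1e7, 1e8, 1e9])
loss = 5.0 * n_params ** -0.12 + np.random.default_rng(0).normal(0, 0.002, 4)

# Fit log(loss) = intercept + slope * log(N), i.e. loss ~= a * N**slope.
slope, intercept = np.polyfit(np.log(n_params), np.log(loss), 1)
print(f"loss ≈ {np.exp(intercept):.3f} * N^({slope:.3f})")
```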
arXiv Detail & Related papers (2024-10-17T15:08:21Z)
- Large Brain Model for Learning Generic Representations with Tremendous EEG Data in BCI [6.926908480247951]
We propose a unified foundation model for EEG called the Large Brain Model (LaBraM).
LaBraM enables cross-dataset learning by segmenting the EEG signals into EEG channel patches.
We then pre-train neural Transformers by predicting the original neural codes for the masked EEG channel patches.
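A hedged sketch of this masked-code pretraining objective, using a stand-in nearest-neighbor tokenizer; the patch length, vocabulary size, mask ratio, and layer sizes are assumptions, not LaBraM's configuration:

```python
import torch

PATCH, VOCAB, D = 32, 512, 64
torch.manual_seed(0)

eeg = torch.randn(8, 2048)                                  # (channels, time) toy EEG
patches = eeg.unfold(1, PATCH, PATCH).reshape(-1, PATCH)    # per-channel patches

# Stand-in tokenizer: in LaBraM the codes come from a learned neural tokenizer.
codebook = torch.randn(VOCAB, PATCH)
codes = torch.cdist(patches, codebook).argmin(1)            # target "neural codes"

# Mask a random subset of patches and train a transformer to recover their codes.
mask = torch.rand(len(codes)) < 0.5
ids = torch.where(mask, torch.full_like(codes, VOCAB), codes)  # VOCAB = [MASK] id
inputs = torch.nn.Embedding(VOCAB + 1, D)(ids)
encoder = torch.nn.TransformerEncoder(
    torch.nn.TransformerEncoderLayer(D, nhead=4, batch_first=True), num_layers=2)
logits = torch.nn.Linear(D, VOCAB)(encoder(inputs[None]))[0]

loss = torch.nn.functional.cross_entropy(logits[mask], codes[mask])
print(loss.item())  # minimized during pre-training
```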
arXiv Detail & Related papers (2024-05-29T05:08:16Z)
- Artificial Neural Networks-based Real-time Classification of ENG Signals for Implanted Nerve Interfaces [7.335832236913667]
We explore four types of artificial neural networks (ANNs) to extract sensory stimuli from the electroneurographic (ENG) signal measured in the sciatic nerve of rats.
Different sizes of the data sets are considered to analyze the feasibility of the investigated ANNs for real-time classification.
Our results show that some ANNs are more suitable for real-time applications, capable of achieving accuracies over 90% for signal windows of 100 and 200 ms with a processing time low enough to be effective for pathology recovery.
arXiv Detail & Related papers (2024-03-29T15:23:30Z)
- Continuous time recurrent neural networks: overview and application to forecasting blood glucose in the intensive care unit [56.801856519460465]
Continuous time autoregressive recurrent neural networks (CTRNNs) are deep learning models that account for irregular observations.
We demonstrate the application of these models to probabilistic forecasting of blood glucose in a critical care setting.
arXiv Detail & Related papers (2023-04-14T09:39:06Z)
- Spiking neural network for nonlinear regression [68.8204255655161]
Spiking neural networks carry the potential for a massive reduction in memory and energy consumption.
They introduce temporal and neuronal sparsity, which can be exploited by next-generation neuromorphic hardware.
A framework for regression using spiking neural networks is proposed.
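For intuition, the leaky integrate-and-fire (LIF) neuron that such frameworks typically build on can be simulated in a few lines; the time constant and threshold below are arbitrary illustrative values, not the paper's settings.

```python
import numpy as np

def lif_spikes(current, dt=1e-3, tau=0.02, v_thresh=1.0, v_reset=0.0):
    """Simulate a leaky integrate-and-fire neuron; returns a binary spike train."""
    v, spikes = 0.0, np.zeros_like(current)
    for t, i_t in enumerate(current):
        v += dt / tau * (-v + i_t)      # leaky integration of the input current
        if v >= v_thresh:               # threshold crossing emits a spike
            spikes[t], v = 1.0, v_reset
    return spikes

rate = lif_spikes(np.full(1000, 1.5)).mean() / 1e-3   # spikes per second
print(f"{rate:.0f} Hz")   # the firing rate encodes the continuous input magnitude
```

Encoding continuous targets in spike rates like this is one common route to regression with spiking networks.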
arXiv Detail & Related papers (2022-10-06T13:04:45Z)
- Neuro-BERT: Rethinking Masked Autoencoding for Self-supervised Neurological Pretraining [24.641328814546842]
We present Neuro-BERT, a self-supervised pre-training framework of neurological signals based on masked autoencoding in the Fourier domain.
We propose a novel pre-training task dubbed Fourier Inversion Prediction (FIP), which randomly masks out a portion of the input signal and then predicts the missing information.
By evaluating our method on several benchmark datasets, we show that Neuro-BERT improves downstream neurological-related tasks by a large margin.
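One plausible rendering of this objective (mask size and placement are assumptions; the paper's exact FIP formulation may differ): mask part of the signal and treat the Fourier coefficients of the original signal as the regression target.

```python
import numpy as np

rng = np.random.default_rng(0)
signal = rng.standard_normal(256)          # toy neurological signal

# Randomly mask a contiguous portion of the input (mask ratio is an assumption).
masked = signal.copy()
start = rng.integers(0, 256 - 64)
masked[start:start + 64] = 0.0

# FIP-style target: the Fourier coefficients of the *unmasked* signal.
# A network f(masked) would be trained to regress these; here we only
# measure how much spectral information the mask destroyed.
target = np.fft.rfft(signal)
spectrum_of_masked = np.fft.rfft(masked)
err = np.abs(target - spectrum_of_masked).mean()
print(f"mean spectral gap to recover: {err:.3f}")
```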
arXiv Detail & Related papers (2022-04-20T16:48:18Z)
- EEG-ITNet: An Explainable Inception Temporal Convolutional Network for Motor Imagery Classification [0.5616884466478884]
We propose an end-to-end deep learning architecture called EEG-ITNet.
Our model can extract rich spectral, spatial, and temporal information from multi-channel EEG signals.
EEG-ITNet shows up to 5.9% improvement in classification accuracy across different scenarios.
arXiv Detail & Related papers (2022-04-14T13:18:43Z)
- Transformer-based Spatial-Temporal Feature Learning for EEG Decoding [4.8276709243429]
We propose a novel EEG decoding method that mainly relies on the attention mechanism.
We reach state-of-the-art performance in multi-class EEG classification with fewer parameters.
The method has good potential to promote the practicality of brain-computer interfaces (BCIs).
arXiv Detail & Related papers (2021-06-11T00:48:18Z)
- TELESTO: A Graph Neural Network Model for Anomaly Classification in Cloud Services [77.454688257702]
Machine learning (ML) and artificial intelligence (AI) are applied to IT system operation and maintenance.
One direction aims at recognizing recurring anomaly types to enable automated remediation.
We propose a method that is invariant to dimensionality changes of the given data.
arXiv Detail & Related papers (2021-02-25T14:24:49Z)
- Binary Graph Neural Networks [69.51765073772226]
Graph Neural Networks (GNNs) have emerged as a powerful and flexible framework for representation learning on irregular data.
In this paper, we present and evaluate different strategies for the binarization of graph neural networks.
We show that through careful design of the models, and control of the training process, binary graph neural networks can be trained at only a moderate cost in accuracy on challenging benchmarks.
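The standard binarization trick behind such models (a generic sketch, not the paper's specific strategies) replaces float weights with their sign in the forward pass and routes gradients through a clipped straight-through estimator:

```python
import torch

class BinarizeSTE(torch.autograd.Function):
    """sign() forward; straight-through (clipped identity) backward."""
    @staticmethod
    def forward(ctx, w):
        ctx.save_for_backward(w)
        return torch.sign(w)

    @staticmethod
    def backward(ctx, grad_out):
        (w,) = ctx.saved_tensors
        return grad_out * (w.abs() <= 1).float()   # clip gradient outside [-1, 1]

# One binarized graph-convolution step: adjacency-normalized aggregation
# followed by a linear map with binary weights.
w = torch.randn(16, 16, requires_grad=True)
x = torch.randn(5, 16)                      # 5 nodes, 16 features each
adj = torch.eye(5)                          # stand-in normalized adjacency
out = adj @ x @ BinarizeSTE.apply(w)
out.sum().backward()                        # gradients flow to the float weights
print(w.grad.abs().sum() > 0)
```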
arXiv Detail & Related papers (2020-12-31T18:48:58Z)
- Artificial Neural Variability for Deep Learning: On Overfitting, Noise Memorization, and Catastrophic Forgetting [135.0863818867184]
Artificial neural variability (ANV) helps artificial neural networks learn some advantages from "natural" neural networks.
ANV plays as an implicit regularizer of the mutual information between the training data and the learned model.
It can effectively relieve overfitting, label noise memorization, and catastrophic forgetting at negligible costs.
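Mechanically, this kind of variability can be approximated by perturbing the weights with small Gaussian noise at each training step; the sketch below is a simplification, and the noise scale is an arbitrary choice rather than the paper's formulation.

```python
import torch

model = torch.nn.Linear(10, 1)
opt = torch.optim.SGD(model.parameters(), lr=0.1)
x, y = torch.randn(32, 10), torch.randn(32, 1)

for _ in range(100):
    # Inject Gaussian variability into the weights before the forward pass;
    # the 0.01 noise scale is an illustrative assumption.
    with torch.no_grad():
        noise = [0.01 * torch.randn_like(p) for p in model.parameters()]
        for p, n in zip(model.parameters(), noise):
            p.add_(n)
    loss = torch.nn.functional.mse_loss(model(x), y)
    opt.zero_grad()
    loss.backward()
    # Remove the noise so the update applies to the underlying weights.
    with torch.no_grad():
        for p, n in zip(model.parameters(), noise):
            p.sub_(n)
    opt.step()
print(loss.item())
```

Averaged over steps, the noisy gradients act like an implicit regularizer on how sharply the model fits any single weight configuration.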
arXiv Detail & Related papers (2020-11-12T06:06:33Z)