Related papers: Large EEG-U-Transformer for Time-Step Level Detection Without Pre-Training

URL: http://arxiv.org/abs/2504.00336v3
Date: Sat, 04 Oct 2025 00:33:48 GMT
Title: Large EEG-U-Transformer for Time-Step Level Detection Without Pre-Training
Authors: Kerui Wu, Ziyue Zhao, Bülent Yener
Abstract summary: We propose a simple U-shaped model to efficiently learn representations by capturing both local and global features. Compared to other window-level classification models, our method directly outputs predictions at the time-step level. Our model won 1st place in the 2025 "seizure detection challenge" organized in the International Conference on Artificial Intelligence in Epilepsy and Other Neurological Disorders.
Score: 1.3254304182988286
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Electroencephalography (EEG) reflects the brain's functional state, making it a crucial tool for diverse detection applications like seizure detection and sleep stage classification. While deep learning-based approaches have recently shown promise for automated detection, traditional models are often constrained by limited learnable parameters and only achieve modest performance. In contrast, large foundation models showed improved capabilities by scaling up the model size, but required extensive time-consuming pre-training. Moreover, both types of existing methods require complex and redundant post-processing pipelines to convert discrete labels to continuous annotations. In this work, based on the multi-scale nature of EEG events, we propose a simple U-shaped model to efficiently learn representations by capturing both local and global features using convolution and self-attentive modules for sequence-to-sequence modeling. Compared to other window-level classification models, our method directly outputs predictions at the time-step level, eliminating redundant overlapping inferences. Beyond sequence-to-sequence modeling, the architecture naturally extends to window-level classification by incorporating an attention-pooling layer. Such a paradigm shift and model design demonstrated promising efficiency improvement, cross-subject generalization, and state-of-the-art performance in various time-step and window-level classification tasks in the experiment. More impressively, our model showed the capability to be scaled up to the same level as existing large foundation models that have been extensively pre-trained over diverse datasets and outperforms them by solely using the downstream fine-tuning dataset. Our model won 1st place in the 2025 "seizure detection challenge" organized in the International Conference on Artificial Intelligence in Epilepsy and Other Neurological Disorders.
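The abstract mentions that the sequence-to-sequence model extends to window-level classification through an attention-pooling layer. As a rough illustration only, the sketch below shows one common form of attention pooling: a query vector scores each time step, the scores are softmax-normalized, and the time-step features are averaged with those weights. The function names, the scaled dot-product scoring, and the toy numbers are assumptions made for this example; the paper's actual layer may differ.

```python
import math
from typing import List, Sequence


def softmax(scores: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a 1-D sequence of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(features: Sequence[Sequence[float]],
                   query: Sequence[float]) -> List[float]:
    """Collapse (T, D) time-step features into a single D-dimensional vector.

    `query` is a hypothetical stand-in for a learnable parameter; in a trained
    model it would be optimized, here it is simply passed in.
    """
    dim = len(query)
    scale = math.sqrt(dim)
    # One scalar score per time step: scaled dot product with the query.
    scores = [sum(q * h for q, h in zip(query, step)) / scale
              for step in features]
    weights = softmax(scores)
    # Weighted average of the time-step features, one output per dimension.
    return [sum(w * step[d] for w, step in zip(weights, features))
            for d in range(dim)]


# Toy input: 4 time steps with 2 features each (made-up numbers).
steps = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, 0.5]]

# A zero query gives uniform weights, so pooling reduces to a plain mean.
uniform_pooled = attention_pool(steps, query=[0.0, 0.0])

# A query aligned with the first feature emphasizes steps where it is large.
focused_pooled = attention_pool(steps, query=[10.0, 0.0])
```

With a zero query the result is the ordinary mean over time steps; a non-zero query shifts weight toward the time steps whose features align with it, which is what lets a window-level head focus on the informative part of the window.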

Related papers

Model Inversion with Layer-Specific Modeling and Alignment for Data-Free Continual Learning [19.12792297140574]
Continual learning aims to incrementally train a model on a sequence of tasks while retaining performance on prior ones. Storing and replaying data is often infeasible due to privacy or security constraints. We propose Per-layer Model Inversion (PMI), inspired by faster convergence in single-layer optimization.
arXiv Detail & Related papers (2025-10-30T09:58:48Z)
Artificial Intelligence-Based Multiscale Temporal Modeling for Anomaly Detection in Cloud Services [10.421371572062595]
This study proposes an anomaly detection method based on the Transformer architecture with integrated multiscale feature perception. The proposed method outperforms mainstream baseline models in key metrics, including precision, recall, AUC, and F1-score.
arXiv Detail & Related papers (2025-08-20T07:52:36Z)
EEG-Based Inter-Patient Epileptic Seizure Detection Combining Domain Adversarial Training with CNN-BiLSTM Network [1.9662978733004604]
We propose a detection framework combining domain adversarial training with a convolutional neural network (CNN) and a bidirectional long short-term memory (BiLSTM). Evaluation using EEG recordings from 20 patients with focal epilepsy demonstrated superior performance over non-adversarial methods. The integration of adversarial training with temporal modeling enables robust cross-patient seizure detection.
arXiv Detail & Related papers (2025-05-21T07:27:55Z)
The use of Multi-domain Electroencephalogram Representations in the building of Models based on Convolutional and Recurrent Neural Networks for Epilepsy Detection [1.4785447770765987]
Epilepsy affects approximately 50 million people globally and remains challenging to treat. EEG data is prone to variability between experts, emphasizing the need for automated solutions. This work systematically compares deep neural networks trained on EEG data in time, frequency, and time-frequency domains. Results demonstrate that frequency-domain data achieves detection metrics exceeding 97%, providing a robust foundation for more accurate and reliable seizure detection systems.
arXiv Detail & Related papers (2025-04-24T19:50:48Z)
SMPLest-X: Ultimate Scaling for Expressive Human Pose and Shape Estimation [81.36747103102459]
Expressive human pose and shape estimation (EHPS) unifies body, hands, and face motion capture with numerous applications. Current state-of-the-art methods focus on training innovative architectural designs on confined datasets. We investigate the impact of scaling up EHPS towards a family of generalist foundation models.
arXiv Detail & Related papers (2025-01-16T18:59:46Z)
Self-Supervised Radio Pre-training: Toward Foundational Models for Spectrogram Learning [6.1339395157466425]
Foundational deep learning (DL) models are general models, trained on diverse and unlabelled datasets. We introduce Masked Spectrogram Modeling, a novel self-supervised learning approach for pretraining foundational DL models on radio signals.
arXiv Detail & Related papers (2024-11-14T23:56:57Z)
SincVAE: a New Approach to Improve Anomaly Detection on EEG Data Using SincNet and Variational Autoencoder [0.0]
This work proposes a semi-supervised approach for detecting epileptic seizures from EEG data, utilizing a novel Deep Learning-based method called SincVAE. Results indicate that SincVAE improves seizure detection in EEG data and is capable of identifying early seizures during the preictal stage as well as monitoring patients throughout the postictal stage.
arXiv Detail & Related papers (2024-06-25T13:21:01Z)
REST: Efficient and Accelerated EEG Seizure Analysis through Residual State Updates [54.96885726053036]
This paper introduces a novel graph-based residual state update mechanism (REST) for real-time EEG signal analysis. By leveraging a combination of graph neural networks and recurrent structures, REST efficiently captures both non-Euclidean geometry and temporal dependencies within EEG data. Our model demonstrates high accuracy in both seizure detection and classification tasks.
arXiv Detail & Related papers (2024-06-03T16:30:19Z)
Orchid: Flexible and Data-Dependent Convolution for Sequence Modeling [4.190836962132713]
This paper introduces Orchid, a novel architecture designed to address the quadratic complexity of traditional attention mechanisms. At the core of this architecture lies a new data-dependent global convolution layer, which contextually adapts its kernel conditioned on the input sequence. We evaluate the proposed model across multiple domains, including language modeling and image classification, to highlight its performance and generality.
arXiv Detail & Related papers (2024-02-28T17:36:45Z)
Diffusion-Based Neural Network Weights Generation [80.89706112736353]
D2NWG is a diffusion-based neural network weights generation technique that efficiently produces high-performing weights for transfer learning. Our method extends generative hyper-representation learning to recast the latent diffusion paradigm for neural network weights generation. Our approach is scalable to large architectures such as large language models (LLMs), overcoming the limitations of current parameter generation techniques.
arXiv Detail & Related papers (2024-02-28T08:34:23Z)
Skeleton2vec: A Self-supervised Learning Framework with Contextualized Target Representations for Skeleton Sequence [56.092059713922744]
We show that using high-level contextualized features as prediction targets can achieve superior performance. Specifically, we propose Skeleton2vec, a simple and efficient self-supervised 3D action representation learning framework. Our proposed Skeleton2vec outperforms previous methods and achieves state-of-the-art results.
arXiv Detail & Related papers (2024-01-01T12:08:35Z)
VSViG: Real-time Video-based Seizure Detection via Skeleton-based Spatiotemporal ViG [8.100646331930953]
Accurate and efficient epileptic seizure onset detection can significantly benefit patients. Traditional diagnostic methods, primarily relying on electroencephalograms (EEGs), often result in cumbersome and non-portable solutions. We propose a novel Video-based Seizure detection model via a skeleton-based spatiotemporal Vision Graph neural network.
arXiv Detail & Related papers (2023-11-24T15:07:29Z)
Lightweight Convolution Transformer for Cross-patient Seizure Detection in Multi-channel EEG Signals [0.0]
This study proposes a novel deep learning architecture based on a lightweight convolution transformer (LCT). The transformer is able to learn spatial and temporal correlated information simultaneously from the multi-channel electroencephalogram (EEG) signal to detect seizures at smaller segment lengths.
arXiv Detail & Related papers (2023-05-07T16:43:52Z)
Unsupervised Multivariate Time-Series Transformers for Seizure Identification on EEG [9.338549413542948]
Epileptic seizures are commonly monitored through electroencephalogram (EEG) recordings. We present an unsupervised transformer-based model for seizure identification on raw EEG. We train an autoencoder involving a transformer encoder via an unsupervised loss function, incorporating a novel masking strategy.
arXiv Detail & Related papers (2023-01-03T15:57:13Z)
Are we certain it's anomalous? [57.729669157989235]
Anomaly detection in time series is a complex task since anomalies are rare due to highly non-linear temporal correlations. Here we propose the novel use of Hyperbolic uncertainty for Anomaly Detection (HypAD). HypAD learns self-supervisedly to reconstruct the input signal.
arXiv Detail & Related papers (2022-11-16T21:31:39Z)
A Robust and Explainable Data-Driven Anomaly Detection Approach For Power Electronics [56.86150790999639]
We present two anomaly detection and classification approaches, namely the Matrix Profile algorithm and anomaly transformer. The Matrix Profile algorithm is shown to be well suited as a generalizable approach for detecting real-time anomalies in streaming time-series data. A series of custom filters is created and added to the detector to tune its sensitivity, recall, and detection accuracy.
arXiv Detail & Related papers (2022-09-23T06:09:35Z)
Task-oriented Self-supervised Learning for Anomaly Detection in Electroencephalography [51.45515911920534]
A task-oriented self-supervised learning approach is proposed to train a more effective anomaly detector. A specific two-branch convolutional neural network with larger kernels is designed as the feature extractor. The effectively designed and trained feature extractor has been shown to extract better feature representations from EEGs.
arXiv Detail & Related papers (2022-07-04T13:15:08Z)
StRegA: Unsupervised Anomaly Detection in Brain MRIs using a Compact Context-encoding Variational Autoencoder [48.2010192865749]
Unsupervised anomaly detection (UAD) can learn a data distribution from an unlabelled dataset of healthy subjects and then be applied to detect out-of-distribution samples. This research proposes a compact version of the "context-encoding" VAE (ceVAE) model, combined with pre- and post-processing steps, creating a UAD pipeline (StRegA). The proposed pipeline achieved a Dice score of 0.642$\pm$0.101 while detecting tumours in T2w images of the BraTS dataset and 0.859$\pm$0.112 while detecting artificially induced anomalies.
arXiv Detail & Related papers (2022-01-31T14:27:35Z)
Adaptive Memory Networks with Self-supervised Learning for Unsupervised Anomaly Detection [54.76993389109327]
Unsupervised anomaly detection aims to build models to detect unseen anomalies by only training on the normal data. We propose a novel approach called Adaptive Memory Network with Self-supervised Learning (AMSL) to address these challenges. AMSL incorporates a self-supervised learning module to learn general normal patterns and an adaptive memory fusion module to learn rich feature representations.
arXiv Detail & Related papers (2022-01-03T03:40:21Z)
Multi-Centroid Hyperdimensional Computing Approach for Epileptic Seizure Detection [4.249341912358848]
We propose a novel semi-supervised learning approach based on multi-centroid HD computing. The multi-centroid approach allows several prototype vectors to represent the seizure and non-seizure states. Up to 14% improvement is achieved on an unbalanced test set with 10 times more non-seizure than seizure data.
arXiv Detail & Related papers (2021-11-16T13:30:47Z)
SOUL: An Energy-Efficient Unsupervised Online Learning Seizure Detection Classifier [68.8204255655161]
Implantable devices that record neural activity and detect seizures have been adopted to issue warnings or trigger neurostimulation to suppress seizures. For an implantable seizure detection system, a low power, at-the-edge, online learning algorithm can be employed to dynamically adapt to neural signal drifts. SOUL was fabricated in TSMC's 28 nm process occupying 0.1 mm² and achieves 1.5 nJ/classification energy efficiency, which is at least 24x more efficient than state-of-the-art.
arXiv Detail & Related papers (2021-10-01T23:01:20Z)
STAR: Sparse Transformer-based Action Recognition [61.490243467748314]
This work proposes a novel skeleton-based human action recognition model with sparse attention on the spatial dimension and segmented linear attention on the temporal dimension of data. Experiments show that our model can achieve comparable performance while using far fewer trainable parameters and achieving high speed in training and inference.
arXiv Detail & Related papers (2021-07-15T02:53:11Z)
A multi-stage machine learning model on diagnosis of esophageal manometry [50.591267188664666]
The framework includes deep-learning models at the swallow-level stage and feature-based machine learning models at the study-level stage. This is the first artificial-intelligence-style model to automatically predict CC diagnosis of HRM study from raw multi-swallow data.
arXiv Detail & Related papers (2021-06-25T20:09:23Z)
Deep Neural Dynamic Bayesian Networks applied to EEG sleep spindles modeling [0.0]
We propose a generative model for single-channel EEG that incorporates the constraints experts actively enforce during visual scoring. We derive algorithms for exact, tractable inference as a special case of Generalized Expectation Maximization. We validate the model on three public datasets and provide support that more complex models are able to surpass state-of-the-art detectors.
arXiv Detail & Related papers (2020-10-16T21:48:29Z)

This list is automatically generated from the titles and abstracts of the papers on this site.