ViT2EEG: Leveraging Hybrid Pretrained Vision Transformers for EEG Data
- URL: http://arxiv.org/abs/2308.00454v1
- Date: Tue, 1 Aug 2023 11:10:33 GMT
- Title: ViT2EEG: Leveraging Hybrid Pretrained Vision Transformers for EEG Data
- Authors: Ruiqi Yang, Eric Modesitt
- Abstract summary: We demonstrate the application of a hybrid Vision Transformer (ViT) model, pretrained on ImageNet, on an electroencephalogram (EEG) regression task.
This model shows a notable increase in performance compared to other models, including an identical architecture ViT trained without the ImageNet weights.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In this study, we demonstrate the application of a hybrid Vision Transformer
(ViT) model, pretrained on ImageNet, on an electroencephalogram (EEG)
regression task. Despite being originally trained for image classification
tasks, when fine-tuned on EEG data, this model shows a notable increase in
performance compared to other models, including an identical architecture ViT
trained without the ImageNet weights. This discovery challenges the traditional
understanding of model generalization, suggesting that Transformer models
pretrained on seemingly unrelated image data can provide valuable priors for
EEG regression tasks with an appropriate fine-tuning pipeline.
The success of this approach suggests that the features extracted by ViT
models in the context of visual tasks can be readily transformed for the
purpose of EEG predictive modeling. We recommend utilizing this methodology not
only in neuroscience and related fields, but generally for any task where data
collection is limited by practical, financial, or ethical constraints. Our
results illuminate the potential of pretrained models on tasks that are clearly
distinct from their original purpose.
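As a concrete illustration of the pipeline described above, the sketch below fine-tunes an ImageNet-pretrained hybrid ViT (ResNet stem plus Transformer) from the timm library on a toy EEG regression target. The model name, the reshaping of the electrode-by-time array into a single-channel pseudo-image, and the two-dimensional output are illustrative assumptions, not the authors' released code.

```python
# Minimal sketch (assumptions, not the authors' released pipeline): fine-tune an
# ImageNet-pretrained hybrid ViT from timm on a toy EEG regression target.
import torch
import torch.nn as nn
import torch.nn.functional as F
import timm

# Hypothetical EEG batch: (batch, electrodes, time samples).
eeg = torch.randn(8, 129, 500)

# Treat each trial as a single-channel 2D "image" and resize it to the ViT's
# expected resolution; the paper's exact input shaping may differ.
x = F.interpolate(eeg.unsqueeze(1), size=(224, 224), mode="bilinear")

# Hybrid ViT (ResNet stem + Transformer) with ImageNet weights. in_chans=1
# adapts the stem's first convolution to single-channel input, and
# num_classes=2 swaps the 1000-way classifier for a 2-dim regression head
# (e.g. gaze x/y). The model name is an illustrative timm variant.
model = timm.create_model("vit_base_r50_s16_224", pretrained=True,
                          in_chans=1, num_classes=2)

criterion = nn.MSELoss()
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)

# One fine-tuning step on dummy regression targets.
target = torch.randn(8, 2)
optimizer.zero_grad()
loss = criterion(model(x), target)
loss.backward()
optimizer.step()
```

The point mirrors the abstract: the ImageNet weights serve as the initialization, and only the input stem and the output head are replaced before fine-tuning end to end on EEG data.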
Related papers
- Data-efficient Large Vision Models through Sequential Autoregression [58.26179273091461]
We develop an efficient, autoregression-based vision model on a limited dataset.
We demonstrate how this model achieves proficiency in a spectrum of visual tasks spanning both high-level and low-level semantic understanding.
Our empirical evaluations underscore the model's agility in adapting to various tasks, heralding a significant reduction in the parameter footprint.
arXiv Detail & Related papers (2024-02-07T13:41:53Z) - EEGFormer: Towards Transferable and Interpretable Large-Scale EEG Foundation Model [39.363511340878624]
We present a novel EEG foundation model, namely EEGFormer, pretrained on large-scale compound EEG data.
To validate the effectiveness of our model, we extensively evaluate it on various downstream tasks and assess the performance under different transfer settings.
arXiv Detail & Related papers (2024-01-11T17:36:24Z) - Neuro-GPT: Towards A Foundation Model for EEG [0.04188114563181615]
We propose Neuro-GPT, a foundation model consisting of an EEG encoder and a GPT model.
The foundation model is pre-trained on a large-scale dataset using a self-supervised task that learns to reconstruct masked EEG segments.
Experiments demonstrate that applying a foundation model can significantly improve classification performance compared to a model trained from scratch.
arXiv Detail & Related papers (2023-11-07T07:07:18Z) - Large Transformers are Better EEG Learners [8.930281191465088]
AdaCT provides plug-and-play adapters that convert time series data into 2D pseudo-images or text forms.
AdaCT-I transforms multi-channel or lengthy single-channel time series data into pseudo-images for fine-tuning pre-trained vision transformers.
AdaCT-T converts short single-channel data into text for fine-tuning pre-trained language transformers (a toy sketch of the text-conversion idea appears after this list).
arXiv Detail & Related papers (2023-08-20T12:54:17Z) - Performance Evaluation of Swin Vision Transformer Model using Gradient Accumulation Optimization Technique [0.0]
This paper evaluates the performance of the Swin ViT model using the gradient accumulation optimization (GAO) technique.
Applying the GAO technique leads to a significant decrease in the accuracy of the Swin ViT model.
arXiv Detail & Related papers (2023-07-31T23:30:16Z) - End-to-End Meta-Bayesian Optimisation with Transformer Neural Processes [52.818579746354665]
This paper proposes the first end-to-end differentiable meta-BO framework that generalises neural processes to learn acquisition functions via transformer architectures.
We enable this end-to-end framework with reinforcement learning (RL) to tackle the lack of labelled acquisition data.
arXiv Detail & Related papers (2023-05-25T10:58:46Z) - How to train your ViT? Data, Augmentation, and Regularization in Vision Transformers [74.06040005144382]
Vision Transformers (ViT) have been shown to attain highly competitive performance for a wide range of vision applications.
We conduct a systematic empirical study in order to better understand the interplay between the amount of training data, AugReg, model size and compute budget.
We train ViT models of various sizes on the public ImageNet-21k dataset which either match or outperform their counterparts trained on the larger, but not publicly available JFT-300M dataset.
arXiv Detail & Related papers (2021-06-18T17:58:20Z) - Analogous to Evolutionary Algorithm: Designing a Unified Sequence Model [58.17021225930069]
We explain the rationality of the Vision Transformer by analogy with the proven and practical Evolutionary Algorithm (EA).
We propose a more efficient EAT model, and design task-related heads to deal with different tasks more flexibly.
Our approach achieves state-of-the-art results on the ImageNet classification task compared with recent vision transformer works.
arXiv Detail & Related papers (2021-05-31T16:20:03Z) - Visformer: The Vision-friendly Transformer [105.52122194322592]
We propose a new architecture named Visformer, which is abbreviated from 'Vision-friendly Transformer'.
With the same computational complexity, Visformer outperforms both the Transformer-based and convolution-based models in terms of ImageNet classification accuracy.
arXiv Detail & Related papers (2021-04-26T13:13:03Z) - Pre-Trained Image Processing Transformer [95.93031793337613]
We develop a new pre-trained model, namely the image processing transformer (IPT).
We utilize the well-known ImageNet benchmark to generate a large number of corrupted image pairs.
The IPT model is trained on these images with multi-heads and multi-tails.
arXiv Detail & Related papers (2020-12-01T09:42:46Z)
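For the "Large Transformers are Better EEG Learners" entry above, the toy sketch below illustrates the general "time series as text" idea behind an AdaCT-T-style adapter: a short single-channel snippet is serialized into a string and passed to a pretrained language model for fine-tuning. The numeric-to-text serialization and the model choice are assumptions for illustration, not the paper's actual scheme.

```python
# Minimal sketch of converting a short single-channel series into text and
# feeding it to a pretrained language transformer; the serialization below is
# an assumption, not AdaCT-T's actual tokenization.
from transformers import AutoTokenizer, AutoModelForSequenceClassification

series = [0.12, -0.34, 0.56, 0.01]                 # short EEG snippet (illustrative)
text = " ".join(f"{v:.2f}" for v in series)        # "0.12 -0.34 0.56 0.01"

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2)             # new classification head to fine-tune

inputs = tokenizer(text, return_tensors="pt")
logits = model(**inputs).logits                    # fine-tune from here as usual
```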