Pre-Training Graph Contrastive Masked Autoencoders are Strong Distillers for EEG
- URL: http://arxiv.org/abs/2411.19230v1
- Date: Thu, 28 Nov 2024 15:53:32 GMT
- Title: Pre-Training Graph Contrastive Masked Autoencoders are Strong Distillers for EEG
- Authors: Xinxu Wei, Kanhao Zhao, Yong Jiao, Nancy B. Carlisle, Hua Xie, Yu Zhang
- Abstract summary: We propose a Graph Contrastive Masked Autoencoder Distiller to bridge the gap between unlabeled/labeled and high/low-density EEG data. For knowledge distillation from high-density to low-density EEG data, we propose a Graph Topology Distillation loss function. We demonstrate the effectiveness of our method on four classification tasks across two clinical EEG datasets.
- Score: 4.006670302810497
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Effectively utilizing extensive unlabeled high-density EEG data to improve performance in scenarios with limited labeled low-density EEG data presents a significant challenge. In this paper, we address this by framing it as a graph transfer learning and knowledge distillation problem. We propose a Unified Pre-trained Graph Contrastive Masked Autoencoder Distiller, named EEG-DisGCMAE, to bridge the gap between unlabeled/labeled and high/low-density EEG data. To fully leverage the abundant unlabeled EEG data, we introduce a novel unified graph self-supervised pre-training paradigm, which seamlessly integrates Graph Contrastive Pre-training and Graph Masked Autoencoder Pre-training. This approach synergistically combines contrastive and generative pre-training techniques by reconstructing contrastive samples and contrasting the reconstructions. For knowledge distillation from high-density to low-density EEG data, we propose a Graph Topology Distillation loss function, allowing a lightweight student model trained on low-density data to learn from a teacher model trained on high-density data, effectively handling missing electrodes through contrastive distillation. To integrate transfer learning and distillation, we jointly pre-train the teacher and student models by contrasting their queries and keys during pre-training, enabling robust distillers for downstream tasks. We demonstrate the effectiveness of our method on four classification tasks across two clinical EEG datasets with abundant unlabeled data and limited labeled data. The experimental results show that our approach significantly outperforms contemporary methods in both efficiency and accuracy.
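As a rough illustration of the "contrasting queries and keys" idea mentioned in the abstract, the following is a minimal sketch of a generic InfoNCE-style contrastive loss in NumPy. It is not the authors' implementation: the function name `info_nce`, the temperature value, the embedding sizes, and the random stand-in embeddings for teacher and student are all assumptions made for illustration.

```python
import numpy as np

def info_nce(queries, keys, temperature=0.1):
    """Generic InfoNCE loss: row i of `queries` should match row i of `keys`."""
    # L2-normalise so the dot product is a cosine similarity.
    q = queries / np.linalg.norm(queries, axis=1, keepdims=True)
    k = keys / np.linalg.norm(keys, axis=1, keepdims=True)
    logits = q @ k.T / temperature               # (batch, batch) similarities
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    # Positive pairs sit on the diagonal; all other entries act as negatives.
    return -np.mean(np.diag(log_prob))

rng = np.random.default_rng(0)
# Hypothetical stand-ins: "teacher" keys (high-density) and "student" queries (low-density).
teacher_keys = rng.normal(size=(8, 16))
aligned_student = teacher_keys + 0.01 * rng.normal(size=(8, 16))
random_student = rng.normal(size=(8, 16))

loss_aligned = info_nce(aligned_student, teacher_keys)
loss_random = info_nce(random_student, teacher_keys)
print(f"aligned: {loss_aligned:.4f}  random: {loss_random:.4f}")
```

In this toy setting, student embeddings that track the teacher's yield a lower loss than unrelated ones, which is the behaviour a contrastive distillation objective of this general kind relies on.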
Related papers
- ECG Latent Feature Extraction with Autoencoders for Downstream Prediction Tasks [2.2616169634370076]
The electrocardiogram (ECG) is an inexpensive and widely available tool for cardiac assessment. Despite its standardized format and small file size, the high complexity and inter-individual variability of ECG signals make it challenging to use in deep learning models. This study addresses these challenges by exploring feature generation methods from representative beat ECGs. We introduce three novel Variational Autoencoder (VAE) variants-Stochastic Autoencoder (SAE), Annealed beta-VAE (A beta-VAE), and Cyclical beta VAE (C beta-VAE)-and compare their effectiveness in maintaining
arXiv Detail & Related papers (2025-07-31T19:37:05Z)
- Adversarial Curriculum Graph-Free Knowledge Distillation for Graph Neural Networks [61.608453110751206]
We propose a fast and high-quality data-free knowledge distillation approach for graph neural networks.
The proposed graph-free KD method (ACGKD) significantly reduces the spatial complexity of pseudo-graphs.
ACGKD eliminates the dimensional ambiguity between the student and teacher models by increasing the student's dimensions.
arXiv Detail & Related papers (2025-04-01T08:44:27Z)
- Denoising Score Distillation: From Noisy Diffusion Pretraining to One-Step High-Quality Generation [82.39763984380625]
We introduce denoising score distillation (DSD), a surprisingly effective and novel approach for training high-quality generative models from low-quality data.
DSD pretrains a diffusion model exclusively on noisy, corrupted samples and then distills it into a one-step generator capable of producing refined, clean outputs.
arXiv Detail & Related papers (2025-03-10T17:44:46Z)
- Decoupled Graph Energy-based Model for Node Out-of-Distribution Detection on Heterophilic Graphs [61.226857589092]
OOD detection on nodes in graph learning remains underexplored. GNNSafe adapted energy-based detection to the graph domain with state-of-the-art performance. We introduce DeGEM, which decomposes the learning process into two parts: a graph encoder that leverages topology information for node representations and an energy head that operates in latent space.
arXiv Detail & Related papers (2025-02-25T07:20:00Z)
- Fine-tuning is Not Fine: Mitigating Backdoor Attacks in GNNs with Limited Clean Data [51.745219224707384]
Graph Neural Networks (GNNs) have achieved remarkable performance through their message-passing mechanism. Recent studies have highlighted the vulnerability of GNNs to backdoor attacks. In this paper, we propose a practical backdoor mitigation framework, denoted as GRAPHNAD.
arXiv Detail & Related papers (2025-01-10T10:16:35Z)
- Self-Supervised Pre-Training with Joint-Embedding Predictive Architecture Boosts ECG Classification Performance [0.0]
We create a large unsupervised pre-training dataset by combining ten public ECG databases.
We pre-train Vision Transformers using JEPA on this dataset and fine-tune them on various PTB-XL benchmarks.
arXiv Detail & Related papers (2024-10-02T08:25:57Z)
- Designing Pre-training Datasets from Unlabeled Data for EEG Classification with Transformers [0.0]
We present a way to design several labeled datasets from unlabeled electroencephalogram (EEG) data.
These can then be used to pre-train transformers to learn representations of EEG signals.
We tested this method on an epileptic seizure forecasting task on the Temple University Seizure Detection Corpus.
arXiv Detail & Related papers (2024-09-23T13:26:13Z)
- Synthetic Image Learning: Preserving Performance and Preventing Membership Inference Attacks [5.0243930429558885]
This paper introduces Knowledge Recycling (KR), a pipeline designed to optimise the generation and use of synthetic data for training downstream classifiers.
At the heart of this pipeline is Generative Knowledge Distillation (GKD), the proposed technique that significantly improves the quality and usefulness of the information.
The results show a significant reduction in the performance gap between models trained on real and synthetic data, with models based on synthetic data outperforming those trained on real data in some cases.
arXiv Detail & Related papers (2024-07-22T10:31:07Z)
- How Homogenizing the Channel-wise Magnitude Can Enhance EEG Classification Model? [4.0871083166108395]
We propose a simple yet effective approach for EEG data pre-processing.
Our method first transforms the EEG data into an encoded image by an Inverted Channel-wise Magnitude Homogenization.
By doing so, we can improve the EEG learning process efficiently without using a huge Deep Learning network.
arXiv Detail & Related papers (2024-07-19T09:11:56Z)
- CE-SSL: Computation-Efficient Semi-Supervised Learning for ECG-based Cardiovascular Diseases Detection [16.34314710823127]
We propose a computation-efficient semi-supervised learning paradigm (CE-SSL) for robust and computation-efficient CVDs detection using ECG.
It enables a robust adaptation of pre-trained models on downstream datasets with limited supervision and high computational efficiency.
CE-SSL not only outperforms the state-of-the-art methods in multi-label CVDs detection but also requires a smaller GPU memory footprint, less training time, and less parameter storage space.
arXiv Detail & Related papers (2024-06-20T14:45:13Z)
- Importance-Aware Adaptive Dataset Distillation [53.79746115426363]
Development of deep learning models is enabled by the availability of large-scale datasets.
Dataset distillation aims to synthesize a compact dataset that retains the essential information from the large original dataset.
We propose an importance-aware adaptive dataset distillation (IADD) method that can improve distillation performance.
arXiv Detail & Related papers (2024-01-29T03:29:39Z)
- MELEP: A Novel Predictive Measure of Transferability in Multi-Label ECG Diagnosis [1.3654846342364306]
We introduce MELEP, a measure designed to estimate the effectiveness of knowledge transfer from a pre-trained model to a downstream ECG diagnosis task.
Our experiments show that MELEP can predict the performance of pre-trained convolutional and recurrent deep neural networks, on small and imbalanced ECG data.
arXiv Detail & Related papers (2023-10-27T14:57:10Z)
- DGSD: Dynamical Graph Self-Distillation for EEG-Based Auditory Spatial Attention Detection [49.196182908826565]
Auditory Attention Detection (AAD) aims to detect the target speaker from brain signals in a multi-speaker environment.
Current approaches primarily rely on traditional convolutional neural networks designed for processing Euclidean data like images.
This paper proposes a dynamical graph self-distillation (DGSD) approach for AAD, which does not require speech stimuli as input.
arXiv Detail & Related papers (2023-09-07T13:43:46Z)
- BOOT: Data-free Distillation of Denoising Diffusion Models with Bootstrapping [64.54271680071373]
Diffusion models have demonstrated excellent potential for generating diverse images.
Knowledge distillation has been recently proposed as a remedy that can reduce the number of inference steps to one or a few.
We present a novel technique called BOOT, that overcomes limitations with an efficient data-free distillation algorithm.
arXiv Detail & Related papers (2023-06-08T20:30:55Z)
- Distill Gold from Massive Ores: Bi-level Data Pruning towards Efficient Dataset Distillation [96.92250565207017]
We study the data efficiency and selection for the dataset distillation task.
By re-formulating the dynamics of distillation, we provide insight into the inherent redundancy in the real dataset.
We find the most contributing samples based on their causal effects on the distillation.
arXiv Detail & Related papers (2023-05-28T06:53:41Z)
- Boosting Facial Expression Recognition by A Semi-Supervised Progressive Teacher [54.50747989860957]
We propose a semi-supervised learning algorithm named Progressive Teacher (PT) to utilize reliable FER datasets as well as large-scale unlabeled expression images for effective training.
Experiments on widely-used databases RAF-DB and FERPlus validate the effectiveness of our method, which achieves state-of-the-art performance with accuracy of 89.57% on RAF-DB.
arXiv Detail & Related papers (2022-05-28T07:47:53Z)
- Learning to Generate Synthetic Training Data using Gradient Matching and Implicit Differentiation [77.34726150561087]
This article explores various data distillation techniques that can reduce the amount of data required to successfully train deep networks.
Inspired by recent ideas, we suggest new data distillation techniques based on generative teaching networks, gradient matching, and the Implicit Function Theorem.
arXiv Detail & Related papers (2022-03-16T11:45:32Z)
- Weakly-supervised Graph Meta-learning for Few-shot Node Classification [53.36828125138149]
We propose a new graph meta-learning framework, Graph Hallucination Networks (Meta-GHN).
Based on a new robustness-enhanced episodic training, Meta-GHN is meta-learned to hallucinate clean node representations from weakly-labeled data.
Extensive experiments demonstrate the superiority of Meta-GHN over existing graph meta-learning studies.
arXiv Detail & Related papers (2021-06-12T22:22:10Z)
- Data Augmentation for Enhancing EEG-based Emotion Recognition with Deep Generative Models [13.56090099952884]
We propose three methods for augmenting EEG training data to enhance the performance of emotion recognition models.
For the full usage strategy, all of the generated data are added to the training dataset without judging their quality.
The experimental results demonstrate that the augmented training datasets produced by our methods enhance the performance of EEG-based emotion recognition models.
arXiv Detail & Related papers (2020-06-04T21:23:09Z)
- Omni-supervised Facial Expression Recognition via Distilled Data [120.11782405714234]
We propose omni-supervised learning to exploit reliable samples in a large amount of unlabeled data for network training.
We experimentally verify that the new dataset can significantly improve the ability of the learned FER model.
To tackle this, we propose to apply a dataset distillation strategy to compress the created dataset into several informative class-wise images.
arXiv Detail & Related papers (2020-05-18T09:36:51Z)
- ECG-DelNet: Delineation of Ambulatory Electrocardiograms with Mixed Quality Labeling Using Neural Networks [69.25956542388653]
Deep learning (DL) algorithms are gaining weight in academic and industrial settings.
We demonstrate DL can be successfully applied to low interpretative tasks by embedding ECG detection and delineation onto a segmentation framework.
The model was trained using PhysioNet's QT database, comprised of 105 ambulatory ECG recordings.
arXiv Detail & Related papers (2020-05-11T16:29:12Z)
This list is automatically generated from the titles and abstracts of the papers in this site.