Pre-Training Graph Contrastive Masked Autoencoders are Strong Distillers for EEG
- URL: http://arxiv.org/abs/2411.19230v1
- Date: Thu, 28 Nov 2024 15:53:32 GMT
- Title: Pre-Training Graph Contrastive Masked Autoencoders are Strong Distillers for EEG
- Authors: Xinxu Wei, Kanhao Zhao, Yong Jiao, Nancy B. Carlisle, Hua Xie, Yu Zhang,
- Abstract summary: We propose a Graph Contrastive Masked Autoencoder Distiller to bridge the gap between unlabeled/labeled and high/low-density EEG data.
For knowledge distillation from high-density to low-density EEG data, we propose a Graph Topology Distillation loss function.
We demonstrate the effectiveness of our method on four classification tasks across two clinical EEG datasets.
- Score: 4.006670302810497
- License:
- Abstract: Effectively utilizing extensive unlabeled high-density EEG data to improve performance in scenarios with limited labeled low-density EEG data presents a significant challenge. In this paper, we address this by framing it as a graph transfer learning and knowledge distillation problem. We propose a Unified Pre-trained Graph Contrastive Masked Autoencoder Distiller, named EEG-DisGCMAE, to bridge the gap between unlabeled/labeled and high/low-density EEG data. To fully leverage the abundant unlabeled EEG data, we introduce a novel unified graph self-supervised pre-training paradigm, which seamlessly integrates Graph Contrastive Pre-training and Graph Masked Autoencoder Pre-training. This approach synergistically combines contrastive and generative pre-training techniques by reconstructing contrastive samples and contrasting the reconstructions. For knowledge distillation from high-density to low-density EEG data, we propose a Graph Topology Distillation loss function, allowing a lightweight student model trained on low-density data to learn from a teacher model trained on high-density data, effectively handling missing electrodes through contrastive distillation. To integrate transfer learning and distillation, we jointly pre-train the teacher and student models by contrasting their queries and keys during pre-training, enabling robust distillers for downstream tasks. We demonstrate the effectiveness of our method on four classification tasks across two clinical EEG datasets with abundant unlabeled data and limited labeled data. The experimental results show that our approach significantly outperforms contemporary methods in both efficiency and accuracy.
Related papers
- Self-Supervised Pre-Training with Joint-Embedding Predictive Architecture Boosts ECG Classification Performance [0.0]
We create a large unsupervised pre-training dataset by combining ten public ECG databases.
We pre-train Vision Transformers using JEPA on this dataset and fine-tune them on various PTB-XL benchmarks.
arXiv Detail & Related papers (2024-10-02T08:25:57Z) - Synthetic Image Learning: Preserving Performance and Preventing Membership Inference Attacks [5.0243930429558885]
This paper introduces Knowledge Recycling (KR), a pipeline designed to optimise the generation and use of synthetic data for training downstream classifiers.
At the heart of this pipeline is Generative Knowledge Distillation (GKD), the proposed technique that significantly improves the quality and usefulness of the information.
The results show a significant reduction in the performance gap between models trained on real and synthetic data, with models based on synthetic data outperforming those trained on real data in some cases.
arXiv Detail & Related papers (2024-07-22T10:31:07Z) - How Homogenizing the Channel-wise Magnitude Can Enhance EEG Classification Model? [4.0871083166108395]
We propose a simple yet effective approach for EEG data pre-processing.
Our method first transforms the EEG data into an encoded image by an Inverted Channel-wise Magnitude Homogenization.
By doing so, we can improve the EEG learning process efficiently without using a huge Deep Learning network.
arXiv Detail & Related papers (2024-07-19T09:11:56Z) - Guided Score identity Distillation for Data-Free One-Step Text-to-Image Generation [62.30570286073223]
Diffusion-based text-to-image generation models have demonstrated the ability to produce images aligned with textual descriptions.
We introduce a data-free guided distillation method that enables the efficient distillation of pretrained Diffusion models without access to the real training data.
By exclusively training with synthetic images generated by its one-step generator, our data-free distillation method rapidly improves FID and CLIP scores, achieving state-of-the-art FID performance while maintaining a competitive CLIP score.
arXiv Detail & Related papers (2024-06-03T17:44:11Z) - Importance-Aware Adaptive Dataset Distillation [53.79746115426363]
Development of deep learning models is enabled by the availability of large-scale datasets.
dataset distillation aims to synthesize a compact dataset that retains the essential information from the large original dataset.
We propose an importance-aware adaptive dataset distillation (IADD) method that can improve distillation performance.
arXiv Detail & Related papers (2024-01-29T03:29:39Z) - MELEP: A Novel Predictive Measure of Transferability in Multi-Label ECG Diagnosis [1.3654846342364306]
We introduce MELEP, a measure designed to estimate the effectiveness of knowledge transfer from a pre-trained model to a downstream ECG diagnosis task.
Our experiments show that MELEP can predict the performance of pre-trained convolutional and recurrent deep neural networks, on small and imbalanced ECG data.
arXiv Detail & Related papers (2023-10-27T14:57:10Z) - Distill Gold from Massive Ores: Bi-level Data Pruning towards Efficient Dataset Distillation [96.92250565207017]
We study the data efficiency and selection for the dataset distillation task.
By re-formulating the dynamics of distillation, we provide insight into the inherent redundancy in the real dataset.
We find the most contributing samples based on their causal effects on the distillation.
arXiv Detail & Related papers (2023-05-28T06:53:41Z) - Boosting Facial Expression Recognition by A Semi-Supervised Progressive
Teacher [54.50747989860957]
We propose a semi-supervised learning algorithm named Progressive Teacher (PT) to utilize reliable FER datasets as well as large-scale unlabeled expression images for effective training.
Experiments on widely-used databases RAF-DB and FERPlus validate the effectiveness of our method, which achieves state-of-the-art performance with accuracy of 89.57% on RAF-DB.
arXiv Detail & Related papers (2022-05-28T07:47:53Z) - Learning to Generate Synthetic Training Data using Gradient Matching and
Implicit Differentiation [77.34726150561087]
This article explores various data distillation techniques that can reduce the amount of data required to successfully train deep networks.
Inspired by recent ideas, we suggest new data distillation techniques based on generative teaching networks, gradient matching, and the Implicit Function Theorem.
arXiv Detail & Related papers (2022-03-16T11:45:32Z) - Data Augmentation for Enhancing EEG-based Emotion Recognition with Deep
Generative Models [13.56090099952884]
We propose three methods for augmenting EEG training data to enhance the performance of emotion recognition models.
For the full usage strategy, all of the generated data are augmented to the training dataset without judging the quality of the generated data.
The experimental results demonstrate that the augmented training datasets produced by our methods enhance the performance of EEG-based emotion recognition models.
arXiv Detail & Related papers (2020-06-04T21:23:09Z) - Omni-supervised Facial Expression Recognition via Distilled Data [120.11782405714234]
We propose omni-supervised learning to exploit reliable samples in a large amount of unlabeled data for network training.
We experimentally verify that the new dataset can significantly improve the ability of the learned FER model.
To tackle this, we propose to apply a dataset distillation strategy to compress the created dataset into several informative class-wise images.
arXiv Detail & Related papers (2020-05-18T09:36:51Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.