EEG Synthetic Data Generation Using Probabilistic Diffusion Models
- URL: http://arxiv.org/abs/2303.06068v1
- Date: Mon, 6 Mar 2023 12:03:22 GMT
- Title: EEG Synthetic Data Generation Using Probabilistic Diffusion Models
- Authors: Giulio Tosato, Cesare M. Dalbagno, Francesco Fumagalli
- Abstract summary: This study proposes an advanced methodology for data augmentation: generating synthetic EEG data using denoising diffusion probabilistic models.
The synthetic data are generated from electrode-frequency distribution maps (EFDMs) of emotionally labeled EEG recordings.
The proposed methodology has potential implications for the broader field of neuroscience research by enabling the creation of large, publicly available synthetic EEG datasets.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Electroencephalography (EEG) plays a significant role in the Brain Computer
Interface (BCI) domain, due to its non-invasive nature, low cost, and ease of
use, making it a highly desirable option for widespread adoption by the general
public. This technology is commonly used in conjunction with deep learning
techniques, the success of which is largely dependent on the quality and
quantity of data used for training. To address the challenge of obtaining
sufficient EEG data from individual participants while minimizing user effort
and maintaining accuracy, this study proposes an advanced methodology for data
augmentation: generating synthetic EEG data using denoising diffusion
probabilistic models. The synthetic data are generated from electrode-frequency
distribution maps (EFDMs) of emotionally labeled EEG recordings. To assess the
validity of the synthetic data generated, both a qualitative and a quantitative
comparison with real EEG data were successfully conducted. This study opens up
the possibility for an open-source, accessible, and versatile toolbox
that can process and generate data in both time and frequency dimensions,
regardless of the number of channels involved. Finally, the proposed
methodology has potential implications for the broader field of neuroscience
research by enabling the creation of large, publicly available synthetic EEG
datasets without privacy concerns.
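As a rough illustration of what "denoising diffusion probabilistic models" refers to here, the sketch below shows only the standard DDPM forward (noising) step, x_t = sqrt(alpha_bar_t) * x_0 + sqrt(1 - alpha_bar_t) * eps, applied to a small placeholder map. It is not the authors' implementation. The linear noise schedule, the number of steps, the 4 x 8 map size, and all function names are assumptions chosen for illustration, and the random map merely stands in for an EFDM.

```python
import math
import random


def linear_betas(num_steps, beta_start=1e-4, beta_end=0.02):
    """Linearly spaced noise variances beta_1..beta_T (an assumed schedule)."""
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]


def cumulative_alphas(betas):
    """alpha_bar_t = product over s <= t of (1 - beta_s)."""
    out, prod = [], 1.0
    for beta in betas:
        prod *= 1.0 - beta
        out.append(prod)
    return out


def q_sample(x0, t, alpha_bars, rng):
    """Draw x_t from q(x_t | x_0) = N(sqrt(alpha_bar_t) * x_0, (1 - alpha_bar_t) * I)."""
    a = alpha_bars[t]
    noise = [[rng.gauss(0.0, 1.0) for _ in row] for row in x0]
    xt = [
        [math.sqrt(a) * v + math.sqrt(1.0 - a) * e for v, e in zip(row, noise_row)]
        for row, noise_row in zip(x0, noise)
    ]
    return xt, noise


rng = random.Random(0)
# Placeholder input: 4 "electrodes" x 8 "frequency bins", values in [-1, 1].
# A real EFDM would be computed from recorded EEG; this is random data.
x0 = [[rng.uniform(-1.0, 1.0) for _ in range(8)] for _ in range(4)]

betas = linear_betas(1000)
alpha_bars = cumulative_alphas(betas)

# Noise the map to an intermediate step; a denoising network would be
# trained to predict `noise` from `xt` and the step index.
xt, noise = q_sample(x0, 500, alpha_bars, rng)
```

Sampling new maps would additionally require a trained denoising network and the reverse process, both of which are outside the scope of this sketch.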
Related papers
- Dataset Refinement for Improving the Generalization Ability of the EEG Decoding Model [2.9972387721489655]
We propose a dataset refinement algorithm to eliminate noisy data from EEG datasets.
The proposed algorithm consistently led to better generalization performance compared to using the original dataset.
We conclude that removing noisy data from the training dataset alone can effectively improve the generalization performance of deep learning models in the EEG domain.
arXiv Detail & Related papers (2024-10-31T05:08:24Z) - Enhancing EEG Signal Generation through a Hybrid Approach Integrating Reinforcement Learning and Diffusion Models [6.102274021710727]
This study introduces an innovative approach to the synthesis of Electroencephalogram (EEG) signals by integrating diffusion models with reinforcement learning.
Our methodology enhances the generation of EEG signals with detailed temporal and spectral features, enriching the authenticity and diversity of synthetic datasets.
arXiv Detail & Related papers (2024-09-14T07:22:31Z) - Synthesizing Multimodal Electronic Health Records via Predictive Diffusion Models [69.06149482021071]
We propose a novel EHR data generation model called EHRPD.
It is a diffusion-based model designed to predict the next visit based on the current one while also incorporating time interval estimation.
We conduct experiments on two public datasets and evaluate EHRPD from fidelity, privacy, and utility perspectives.
arXiv Detail & Related papers (2024-06-20T02:20:23Z) - Guided Discrete Diffusion for Electronic Health Record Generation [47.129056768385084]
EHRs are a pivotal data source that enables numerous applications in computational medicine, e.g., disease progression prediction, clinical trial design, and health economics and outcomes research.
Despite wide usability, their sensitive nature raises privacy and confidentiality concerns, which limit potential use cases.
To tackle these challenges, we explore the use of generative models to synthesize artificial, yet realistic EHRs.
arXiv Detail & Related papers (2024-04-18T16:50:46Z) - Reimagining Synthetic Tabular Data Generation through Data-Centric AI: A
Comprehensive Benchmark [56.8042116967334]
Synthetic data serves as an alternative in training machine learning models.
Ensuring that synthetic data mirrors the complex nuances of real-world data is a challenging task.
This paper explores the potential of integrating data-centric AI techniques to guide the synthetic data generation process.
arXiv Detail & Related papers (2023-10-25T20:32:02Z) - Synthetic data, real errors: how (not) to publish and use synthetic data [86.65594304109567]
We show how the generative process affects the downstream ML task.
We introduce Deep Generative Ensemble (DGE) to approximate the posterior distribution over the generative process model parameters.
arXiv Detail & Related papers (2023-05-16T07:30:29Z) - Synthetic data generation for a longitudinal cohort study -- Evaluation,
method extension and reproduction of published data analysis results [0.32593385688760446]
In the health sector, access to individual-level data is often challenging due to privacy concerns.
A promising alternative is the generation of fully synthetic data.
In this study, we use a state-of-the-art synthetic data generation method.
arXiv Detail & Related papers (2023-05-12T13:13:55Z) - Beyond Privacy: Navigating the Opportunities and Challenges of Synthetic
Data [91.52783572568214]
Synthetic data may become a dominant force in the machine learning world, promising a future where datasets can be tailored to individual needs.
We discuss which fundamental challenges the community needs to overcome for wider relevance and application of synthetic data.
arXiv Detail & Related papers (2023-04-07T16:38:40Z) - GANSER: A Self-supervised Data Augmentation Framework for EEG-based
Emotion Recognition [15.812231441367022]
We propose a novel data augmentation framework, namely Generative Adversarial Network-based Self-supervised Data Augmentation (GANSER).
As the first to combine adversarial training with self-supervised learning for EEG-based emotion recognition, the proposed framework can generate high-quality simulated EEG samples.
A transformation function is employed to mask parts of EEG signals and force the generator to synthesize potential EEG signals based on the remaining parts.
arXiv Detail & Related papers (2021-09-07T14:42:55Z) - Uncovering the structure of clinical EEG signals with self-supervised
learning [64.4754948595556]
Supervised learning paradigms are often limited by the amount of labeled data that is available.
This phenomenon is particularly problematic in clinically relevant data, such as electroencephalography (EEG).
By extracting information from unlabeled data, it might be possible to reach competitive performance with deep neural networks.
arXiv Detail & Related papers (2020-07-31T14:34:47Z) - Data Augmentation for Enhancing EEG-based Emotion Recognition with Deep
Generative Models [13.56090099952884]
We propose three methods for augmenting EEG training data to enhance the performance of emotion recognition models.
For the full usage strategy, all of the generated data are added to the training dataset without judging their quality.
The experimental results demonstrate that the augmented training datasets produced by our methods enhance the performance of EEG-based emotion recognition models.
arXiv Detail & Related papers (2020-06-04T21:23:09Z)