Variational Autoencoder Generative Adversarial Network for Synthetic
Data Generation in Smart Home
- URL: http://arxiv.org/abs/2201.07387v1
- Date: Wed, 19 Jan 2022 02:30:25 GMT
- Title: Variational Autoencoder Generative Adversarial Network for Synthetic
Data Generation in Smart Home
- Authors: Mina Razghandi, Hao Zhou, Melike Erol-Kantarci, and Damla Turgut
- Abstract summary: We propose a Variational AutoEncoder Generative Adversarial Network (VAE-GAN) as a smart grid data generative model.
VAE-GAN is capable of learning various types of data distributions and generating plausible samples from the same distribution.
Experiments indicate that the proposed synthetic data generative model outperforms the vanilla GAN network.
- Score: 15.995891934245334
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Data is the fuel of data science and machine learning techniques for smart
grid applications, similar to many other fields. However, the availability of
data can be an issue due to privacy concerns, data size, data quality, and so
on. To this end, in this paper, we propose a Variational AutoEncoder Generative
Adversarial Network (VAE-GAN) as a smart grid data generative model which is
capable of learning various types of data distributions and generating
plausible samples from the same distribution without performing any prior
analysis on the data before the training phase. We compared the Kullback-Leibler
(KL) divergence, maximum mean discrepancy (MMD), and Wasserstein distance
between the synthetic data (electrical load and PV production) distribution
generated by the proposed model, vanilla GAN network, and the real data
distribution, to evaluate the performance of our model. Furthermore, we used
five key statistical parameters to describe the smart grid data distribution
and compared them between synthetic data generated by both models and real
data. Experiments indicate that the proposed synthetic data generative model
outperforms the vanilla GAN network. The distribution of VAE-GAN synthetic data
is the most comparable to that of real data.
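The three distances named in the abstract can be estimated directly from samples. The following is a minimal sketch, not the authors' evaluation code: it assumes one-dimensional samples (for example, load values), a histogram estimate for KL divergence, a biased squared-MMD estimate with an RBF kernel, and equal-sized samples for the Wasserstein distance. The function names, the bin count, the smoothing constant `eps`, and the kernel width `sigma` are all illustrative assumptions that the abstract does not specify.

```python
import numpy as np


def kl_divergence(real, synth, bins=50, eps=1e-10):
    """Histogram estimate of KL(real || synth) on shared bin edges.

    `bins` and `eps` are illustrative choices, not values from the paper.
    """
    lo = min(real.min(), synth.min())
    hi = max(real.max(), synth.max())
    p, _ = np.histogram(real, bins=bins, range=(lo, hi))
    q, _ = np.histogram(synth, bins=bins, range=(lo, hi))
    # Smooth empty bins so the logarithm stays finite, then normalize.
    p = (p + eps) / (p + eps).sum()
    q = (q + eps) / (q + eps).sum()
    return float(np.sum(p * np.log(p / q)))


def mmd_rbf(real, synth, sigma=1.0):
    """Biased estimate of squared MMD with an RBF kernel of width `sigma`."""
    x = real.reshape(-1, 1)
    y = synth.reshape(-1, 1)

    def kernel(a, b):
        # Pairwise RBF kernel matrix between column vectors a and b.
        return np.exp(-((a - b.T) ** 2) / (2.0 * sigma ** 2))

    return float(kernel(x, x).mean() + kernel(y, y).mean()
                 - 2.0 * kernel(x, y).mean())


def wasserstein_1d(real, synth):
    """Wasserstein-1 distance between two equal-sized 1-D samples.

    For equal sample sizes this is the mean absolute difference
    between the sorted values.
    """
    if real.shape != synth.shape:
        raise ValueError("this sketch assumes equal-sized samples")
    return float(np.mean(np.abs(np.sort(real) - np.sort(synth))))


if __name__ == "__main__":
    # Toy stand-ins for real and synthetic data; not smart grid data.
    rng = np.random.default_rng(0)
    real = rng.normal(0.0, 1.0, size=500)
    synth = rng.normal(0.2, 1.1, size=500)
    print("KL :", kl_divergence(real, synth))
    print("MMD:", mmd_rbf(real, synth))
    print("W1 :", wasserstein_1d(real, synth))
```

For all three quantities, smaller values indicate that the synthetic distribution is closer to the real one, which is the sense in which the abstract compares VAE-GAN with the vanilla GAN.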
Related papers
- FLIGAN: Enhancing Federated Learning with Incomplete Data using GAN [1.5749416770494706]
Federated Learning (FL) provides a privacy-preserving mechanism for distributed training of machine learning models on networked devices.
We propose FLIGAN, a novel approach to address the issue of data incompleteness in FL.
Our methodology adheres to FL's privacy requirements by generating synthetic data in a federated manner without sharing the actual data in the process.
arXiv Detail & Related papers (2024-03-25T16:49:38Z) - Synthetic location trajectory generation using categorical diffusion
models [50.809683239937584]
Diffusion models (DPMs) have rapidly evolved to be one of the predominant generative models for the simulation of synthetic data.
We propose using DPMs for the generation of synthetic individual location trajectories (ILTs) which are sequences of variables representing physical locations visited by individuals.
arXiv Detail & Related papers (2024-02-19T15:57:39Z) - Towards Theoretical Understandings of Self-Consuming Generative Models [56.84592466204185]
This paper tackles the emerging challenge of training generative models within a self-consuming loop.
We construct a theoretical framework to rigorously evaluate how this training procedure impacts the data distributions learned by future models.
We present results for kernel density estimation, delivering nuanced insights such as the impact of mixed data training on error propagation.
arXiv Detail & Related papers (2024-02-19T02:08:09Z) - Generative Modeling for Tabular Data via Penalized Optimal Transport
Network [2.0319002824093015]
Wasserstein generative adversarial network (WGAN) is a notable improvement in generative modeling.
We propose POTNet, a generative deep neural network based on a novel, robust, and interpretable marginally-penalized Wasserstein (MPW) loss.
arXiv Detail & Related papers (2024-02-16T05:27:05Z) - Private Synthetic Data Meets Ensemble Learning [15.425653946755025]
When machine learning models are trained on synthetic data and then deployed on real data, there is often a performance drop.
We introduce a new ensemble strategy for training downstream models, with the goal of enhancing their performance when used on real data.
arXiv Detail & Related papers (2023-10-15T04:24:42Z) - On the Stability of Iterative Retraining of Generative Models on their own Data [56.153542044045224]
We study the impact of training generative models on mixed datasets.
We first prove the stability of iterative training under the condition that the initial generative models approximate the data distribution well enough.
We empirically validate our theory on both synthetic and natural images by iteratively training normalizing flows and state-of-the-art diffusion models.
arXiv Detail & Related papers (2023-09-30T16:41:04Z) - Synthetic data, real errors: how (not) to publish and use synthetic data [86.65594304109567]
We show how the generative process affects the downstream ML task.
We introduce Deep Generative Ensemble (DGE) to approximate the posterior distribution over the generative process model parameters.
arXiv Detail & Related papers (2023-05-16T07:30:29Z) - Targeted Analysis of High-Risk States Using an Oriented Variational
Autoencoder [3.494548275937873]
Variational autoencoder (VAE) neural networks can be trained to generate power system states.
The coordinates of the latent space codes of VAEs have been shown to correlate with conceptual features of the data.
In this paper, an oriented variational autoencoder (OVAE) is proposed to constrain the link between latent space code and generated data.
arXiv Detail & Related papers (2023-03-20T19:34:21Z) - Distributed Traffic Synthesis and Classification in Edge Networks: A
Federated Self-supervised Learning Approach [83.2160310392168]
This paper proposes FS-GAN to support automatic traffic analysis and synthesis over a large number of heterogeneous datasets.
FS-GAN is composed of multiple distributed Generative Adversarial Networks (GANs).
FS-GAN can classify data of unknown types of service and create synthetic samples that capture the traffic distribution of the unknown types.
arXiv Detail & Related papers (2023-02-01T03:23:11Z) - A Bayesian Generative Adversarial Network (GAN) to Generate Synthetic
Time-Series Data, Application in Combined Sewer Flow Prediction [3.3139597764446607]
In machine learning, generative models are a class of methods capable of learning data distribution to generate artificial data.
In this study, we developed a GAN model to generate synthetic time series to balance our limited recorded time series data.
The aim is to predict the flow using precipitation data and examine the impact of data augmentation using synthetic data in model performance.
arXiv Detail & Related papers (2023-01-31T16:12:26Z) - A Generative Approach for Production-Aware Industrial Network Traffic
Modeling [70.46446906513677]
We investigate the network traffic data generated from a laser cutting machine deployed in a Trumpf factory in Germany.
We analyze the traffic statistics, capture the dependencies between the internal states of the machine, and model the network traffic as a production state dependent process.
We compare the performance of various generative models including variational autoencoder (VAE), conditional variational autoencoder (CVAE), and generative adversarial network (GAN).
arXiv Detail & Related papers (2022-11-11T09:46:58Z)
This list is automatically generated from the titles and abstracts of the papers in this site.