A Bayesian Generative Adversarial Network (GAN) to Generate Synthetic
Time-Series Data, Application in Combined Sewer Flow Prediction
- URL: http://arxiv.org/abs/2301.13733v1
- Date: Tue, 31 Jan 2023 16:12:26 GMT
- Title: A Bayesian Generative Adversarial Network (GAN) to Generate Synthetic
Time-Series Data, Application in Combined Sewer Flow Prediction
- Authors: Amin E. Bakhshipour, Alireza Koochali, Ulrich Dittmer, Ali Haghighi,
Sheraz Ahmad, Andreas Dengel
- Abstract summary: In machine learning, generative models are a class of methods capable of learning data distribution to generate artificial data.
In this study, we developed a GAN model to generate synthetic time series to balance our limited recorded time series data.
The aim is to predict the flow using precipitation data and examine the impact of data augmentation using synthetic data in model performance.
- Score: 3.3139597764446607
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Despite various breakthroughs in machine learning and data analysis
techniques for improving smart operation and management of urban water
infrastructures, some key limitations obstruct this progress. Among these
shortcomings, the absence of freely available data due to data privacy or high
costs of data gathering and the nonexistence of adequate rare or extreme events
in the available data plays a crucial role. Here, Generative Adversarial
Networks (GANs) can help overcome these challenges. In machine learning,
generative models are a class of methods capable of learning data distribution
to generate artificial data. In this study, we developed a GAN model to
generate synthetic time series to balance our limited recorded time series data
and improve the accuracy of a data-driven model for combined sewer flow
prediction. We considered the sewer system of a small town in Germany as the
test case. Precipitation and inflow to the storage tanks are used for the
Data-Driven model development. The aim is to predict the flow using
precipitation data and examine the impact of data augmentation using synthetic
data in model performance. Results show that GAN can successfully generate
synthetic time series from real data distribution, which helps more accurate
peak flow prediction. However, the model without data augmentation works better
for dry weather prediction. Therefore, an ensemble model is suggested to
combine the advantages of both models.
Related papers
- Tackling Data Heterogeneity in Federated Time Series Forecasting [61.021413959988216]
Time series forecasting plays a critical role in various real-world applications, including energy consumption prediction, disease transmission monitoring, and weather forecasting.
Most existing methods rely on a centralized training paradigm, where large amounts of data are collected from distributed devices to a central cloud server.
We propose a novel framework, Fed-TREND, to address data heterogeneity by generating informative synthetic data as auxiliary knowledge carriers.
arXiv Detail & Related papers (2024-11-24T04:56:45Z) - Enhancing Indoor Temperature Forecasting through Synthetic Data in Low-Data Environments [42.8983261737774]
We investigate the efficacy of data augmentation techniques leveraging SoTA AI-based methods for synthetic data generation.
Inspired by practical and experimental motivations, we explore fusion strategies of real and synthetic data to improve forecasting models.
arXiv Detail & Related papers (2024-06-07T12:36:31Z) - Towards Theoretical Understandings of Self-Consuming Generative Models [56.84592466204185]
This paper tackles the emerging challenge of training generative models within a self-consuming loop.
We construct a theoretical framework to rigorously evaluate how this training procedure impacts the data distributions learned by future models.
We present results for kernel density estimation, delivering nuanced insights such as the impact of mixed data training on error propagation.
arXiv Detail & Related papers (2024-02-19T02:08:09Z) - Private Synthetic Data Meets Ensemble Learning [15.425653946755025]
When machine learning models are trained on synthetic data and then deployed on real data, there is often a performance drop.
We introduce a new ensemble strategy for training downstream models, with the goal of enhancing their performance when used on real data.
arXiv Detail & Related papers (2023-10-15T04:24:42Z) - On the Stability of Iterative Retraining of Generative Models on their own Data [56.153542044045224]
We study the impact of training generative models on mixed datasets.
We first prove the stability of iterative training under the condition that the initial generative models approximate the data distribution well enough.
We empirically validate our theory on both synthetic and natural images by iteratively training normalizing flows and state-of-the-art diffusion models.
arXiv Detail & Related papers (2023-09-30T16:41:04Z) - Synthetic data, real errors: how (not) to publish and use synthetic data [86.65594304109567]
We show how the generative process affects the downstream ML task.
We introduce Deep Generative Ensemble (DGE) to approximate the posterior distribution over the generative process model parameters.
arXiv Detail & Related papers (2023-05-16T07:30:29Z) - An advanced spatio-temporal convolutional recurrent neural network for
storm surge predictions [73.4962254843935]
We study the capability of artificial neural network models to emulate storm surge based on the storm track/size/intensity history.
This study presents a neural network model that can predict storm surge, informed by a database of synthetic storm simulations.
arXiv Detail & Related papers (2022-04-18T23:42:18Z) - Variational Autoencoder Generative Adversarial Network for Synthetic
Data Generation in Smart Home [15.995891934245334]
We propose a Variational AutoEncoder Geneversarative Adrial Network (VAE-GAN) as a smart grid data generative model.
VAE-GAN is capable of learning various types of data distributions and generating plausible samples from the same distribution.
Experiments indicate that the proposed synthetic data generative model outperforms the vanilla GAN network.
arXiv Detail & Related papers (2022-01-19T02:30:25Z) - Convolutional generative adversarial imputation networks for
spatio-temporal missing data in storm surge simulations [86.5302150777089]
Generative Adversarial Imputation Nets (GANs) and GAN-based techniques have attracted attention as unsupervised machine learning methods.
We name our proposed method as Con Conval Generative Adversarial Imputation Nets (Conv-GAIN)
arXiv Detail & Related papers (2021-11-03T03:50:48Z) - STAN: Synthetic Network Traffic Generation with Generative Neural Models [10.54843182184416]
This paper presents STAN (Synthetic network Traffic generation with Autoregressive Neural models), a tool to generate realistic synthetic network traffic datasets.
Our novel neural architecture captures both temporal dependencies and dependence between attributes at any given time.
We evaluate the performance of STAN in terms of the quality of data generated, by training it on both a simulated dataset and a real network traffic data set.
arXiv Detail & Related papers (2020-09-27T04:20:02Z) - High Temporal Resolution Rainfall Runoff Modelling Using
Long-Short-Term-Memory (LSTM) Networks [0.03694429692322631]
The model was tested for a watershed in Houston, TX, known for severe flood events.
The LSTM network's capability in learning long-term dependencies between the input and output of the network allowed modeling RR with high resolution in time.
arXiv Detail & Related papers (2020-02-07T00:38:03Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.