TSGM: A Flexible Framework for Generative Modeling of Synthetic Time Series
- URL: http://arxiv.org/abs/2305.11567v2
- Date: Tue, 9 Jul 2024 08:19:23 GMT
- Title: TSGM: A Flexible Framework for Generative Modeling of Synthetic Time Series
- Authors: Alexander Nikitin, Letizia Iannucci, Samuel Kaski,
- Abstract summary: Time series data are often scarce or highly sensitive, which precludes the sharing of data between researchers and industrial organizations.
We introduce Time Series Generative Modeling (TSGM), an open-source framework for the generative modeling of synthetic time series.
- Score: 61.436361263605114
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Temporally indexed data are essential in a wide range of fields and of interest to machine learning researchers. Time series data, however, are often scarce or highly sensitive, which precludes the sharing of data between researchers and industrial organizations and the application of existing and new data-intensive ML methods. A possible solution to this bottleneck is to generate synthetic data. In this work, we introduce Time Series Generative Modeling (TSGM), an open-source framework for the generative modeling of synthetic time series. TSGM includes a broad repertoire of machine learning methods: generative models, probabilistic, and simulator-based approaches. The framework enables users to evaluate the quality of the produced data from different angles: similarity, downstream effectiveness, predictive consistency, diversity, and privacy. The framework is extensible, which allows researchers to rapidly implement their own methods and compare them in a shareable environment. TSGM was tested on open datasets and in production and proved to be beneficial in both cases. Additionally to the library, the project allows users to employ command line interfaces for synthetic data generation which lowers the entry threshold for those without a programming background.
Related papers
- GenRec: A Flexible Data Generator for Recommendations [1.384948712833979]
GenRec is a novel framework for generating synthetic user-item interactions that exhibit realistic and well-known properties.
The framework is based on a generative process based on latent factor modeling.
arXiv Detail & Related papers (2024-07-23T15:53:17Z) - MALLM-GAN: Multi-Agent Large Language Model as Generative Adversarial Network for Synthesizing Tabular Data [10.217822818544475]
We propose a framework to generate synthetic (tabular) data powered by large language models (LLMs)
Our approach significantly enhance the quality of synthetic data generation in common scenarios with small sample sizes.
Our results demonstrate that our model outperforms several state-of-art models regarding generating higher quality synthetic data for downstream tasks while keeping privacy of the real data.
arXiv Detail & Related papers (2024-06-15T06:26:17Z) - Reimagining Synthetic Tabular Data Generation through Data-Centric AI: A
Comprehensive Benchmark [56.8042116967334]
Synthetic data serves as an alternative in training machine learning models.
ensuring that synthetic data mirrors the complex nuances of real-world data is a challenging task.
This paper explores the potential of integrating data-centric AI techniques to guide the synthetic data generation process.
arXiv Detail & Related papers (2023-10-25T20:32:02Z) - TTS-CGAN: A Transformer Time-Series Conditional GAN for Biosignal Data
Augmentation [5.607676459156789]
We present TTS-CGAN, a conditional GAN model that can be trained on existing multi-class datasets and generate class-specific synthetic time-series sequences.
Synthetic sequences generated by our model are indistinguishable from real ones, and can be used to complement or replace real signals of the same type.
arXiv Detail & Related papers (2022-06-28T01:01:34Z) - Towards Generating Real-World Time Series Data [52.51620668470388]
We propose a novel generative framework for time series data generation - RTSGAN.
RTSGAN learns an encoder-decoder module which provides a mapping between a time series instance and a fixed-dimension latent vector.
To generate time series with missing values, we further equip RTSGAN with an observation embedding layer and a decide-and-generate decoder.
arXiv Detail & Related papers (2021-11-16T11:31:37Z) - TimeVAE: A Variational Auto-Encoder for Multivariate Time Series
Generation [6.824692201913679]
We propose a novel architecture for synthetically generating time-series data with the use of Variversaational Auto-Encoders (VAEs)
The proposed architecture has several distinct properties: interpretability, ability to encode domain knowledge, and reduced training times.
arXiv Detail & Related papers (2021-11-15T21:42:14Z) - PIETS: Parallelised Irregularity Encoders for Forecasting with
Heterogeneous Time-Series [5.911865723926626]
Heterogeneity and irregularity of multi-source data sets present a significant challenge to time-series analysis.
In this work, we design a novel architecture, PIETS, to model heterogeneous time-series.
We show that PIETS is able to effectively model heterogeneous temporal data and outperforms other state-of-the-art approaches in the prediction task.
arXiv Detail & Related papers (2021-09-30T20:01:19Z) - Merlion: A Machine Learning Library for Time Series [73.46386700728577]
Merlion is an open-source machine learning library for time series.
It features a unified interface for models and datasets for anomaly detection and forecasting.
Merlion also provides a unique evaluation framework that simulates the live deployment and re-training of a model in production.
arXiv Detail & Related papers (2021-09-20T02:03:43Z) - Few-Shot Named Entity Recognition: A Comprehensive Study [92.40991050806544]
We investigate three schemes to improve the model generalization ability for few-shot settings.
We perform empirical comparisons on 10 public NER datasets with various proportions of labeled data.
We create new state-of-the-art results on both few-shot and training-free settings.
arXiv Detail & Related papers (2020-12-29T23:43:16Z) - Learning summary features of time series for likelihood free inference [93.08098361687722]
We present a data-driven strategy for automatically learning summary features from time series data.
Our results indicate that learning summary features from data can compete and even outperform LFI methods based on hand-crafted values.
arXiv Detail & Related papers (2020-12-04T19:21:37Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.