TTS-CGAN: A Transformer Time-Series Conditional GAN for Biosignal Data Augmentation
- URL: http://arxiv.org/abs/2206.13676v1
- Date: Tue, 28 Jun 2022 01:01:34 GMT
- Title: TTS-CGAN: A Transformer Time-Series Conditional GAN for Biosignal Data Augmentation
- Authors: Xiaomin Li, Anne Hee Hiong Ngu, Vangelis Metsis
- Abstract summary: We present TTS-CGAN, a conditional GAN model that can be trained on existing multi-class datasets and generate class-specific synthetic time-series sequences.
Synthetic sequences generated by our model are indistinguishable from real ones, and can be used to complement or replace real signals of the same type.
- Score: 5.607676459156789
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Signal measurement appearing in the form of time series is one of the most
common types of data used in medical machine learning applications. Such
datasets are often small in size, expensive to collect and annotate, and might
involve privacy issues, which hinders our ability to train large,
state-of-the-art deep learning models for biomedical applications. For
time-series data, the suite of data augmentation strategies we can use to
expand the size of the dataset is limited by the need to maintain the basic
properties of the signal. Generative Adversarial Networks (GANs) can be
utilized as another data augmentation tool. In this paper, we present TTS-CGAN,
a transformer-based conditional GAN model that can be trained on existing
multi-class datasets and generate class-specific synthetic time-series
sequences of arbitrary length. We elaborate on the model architecture and
design strategies. Synthetic sequences generated by our model are
indistinguishable from real ones, and can be used to complement or replace real
signals of the same type, thus achieving the goal of data augmentation. To
evaluate the quality of the generated data, we modify the wavelet coherence
metric to be able to compare the similarity between two sets of signals, and
also conduct a case study where a mix of synthetic and real data are used to
train a deep learning model for sequence classification. Together with other
visualization techniques and qualitative evaluation approaches, we demonstrate
that TTS-CGAN generated synthetic data are similar to real data, and that our
model performs better than the other state-of-the-art GAN models built for
time-series data generation.
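The case study mentioned in the abstract trains a classifier on a mix of synthetic and real data. The following is a minimal sketch of how such a mixed training set might be assembled class by class; it is not the authors' pipeline. The function name `mix_real_and_synthetic`, the `synth_per_real` knob, and the toy "walk"/"run" labels are all hypothetical and chosen only for illustration.

```python
import random
from collections import Counter


def mix_real_and_synthetic(real, synthetic, synth_per_real=1.0, seed=0):
    """Return a shuffled list of (sequence, label, is_synthetic) triples.

    real, synthetic: lists of (sequence, label) pairs.
    synth_per_real: hypothetical knob, the number of synthetic samples to
        add per real sample of the same class (capped by what is available).
    """
    rng = random.Random(seed)
    real_counts = Counter(label for _, label in real)

    # Keep every real sample and mark it as not synthetic.
    mixed = [(seq, label, False) for seq, label in real]

    # Add synthetic samples class by class so labels stay aligned.
    for label, n_real in real_counts.items():
        pool = [(seq, lab) for seq, lab in synthetic if lab == label]
        n_take = min(len(pool), round(n_real * synth_per_real))
        for seq, lab in rng.sample(pool, n_take):
            mixed.append((seq, lab, True))

    rng.shuffle(mixed)
    return mixed


# Toy data: short made-up sequences standing in for biosignal windows.
real = [
    ([0.1, 0.2, 0.3], "walk"),
    ([0.0, 0.1, 0.0], "walk"),
    ([0.9, 0.8, 0.7], "run"),
]
synthetic = [
    ([0.1, 0.25, 0.3], "walk"),
    ([0.2, 0.2, 0.3], "walk"),
    ([0.1, 0.1, 0.1], "walk"),
    ([0.8, 0.8, 0.6], "run"),
    ([0.7, 0.9, 0.7], "run"),
]

train = mix_real_and_synthetic(real, synthetic, synth_per_real=1.0)
print(len(train), sum(1 for _, _, is_synth in train if is_synth))
```

Keeping the `is_synthetic` flag on each sample makes it easy to evaluate on real data only, which is the usual way to check whether the synthetic samples actually helped.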
Related papers
- Tackling Data Heterogeneity in Federated Time Series Forecasting [61.021413959988216]
Time series forecasting plays a critical role in various real-world applications, including energy consumption prediction, disease transmission monitoring, and weather forecasting.
Most existing methods rely on a centralized training paradigm, where large amounts of data are collected from distributed devices to a central cloud server.
We propose a novel framework, Fed-TREND, to address data heterogeneity by generating informative synthetic data as auxiliary knowledge carriers.
arXiv Detail & Related papers (2024-11-24T04:56:45Z)
- Generating Realistic Tabular Data with Large Language Models [49.03536886067729]
Large language models (LLMs) have been used for diverse tasks, but do not capture the correct correlation between the features and the target variable.
We propose an LLM-based method with three important improvements to correctly capture the ground-truth feature-class correlation in the real data.
Our experiments show that our method significantly outperforms 10 SOTA baselines on 20 datasets in downstream tasks.
arXiv Detail & Related papers (2024-10-29T04:14:32Z)
- Synthetic location trajectory generation using categorical diffusion models [50.809683239937584]
Diffusion probabilistic models (DPMs) have rapidly evolved to be one of the predominant generative models for the simulation of synthetic data.
We propose using DPMs for the generation of synthetic individual location trajectories (ILTs), which are sequences of variables representing physical locations visited by individuals.
arXiv Detail & Related papers (2024-02-19T15:57:39Z)
- MADS: Modulated Auto-Decoding SIREN for time series imputation [9.673093148930874]
We propose MADS, a novel auto-decoding framework for time series imputation, built upon implicit neural representations.
We evaluate our model on two real-world datasets, and show that it outperforms state-of-the-art methods for time series imputation.
arXiv Detail & Related papers (2023-07-03T09:08:47Z)
- TSGM: A Flexible Framework for Generative Modeling of Synthetic Time Series [61.436361263605114]
Time series data are often scarce or highly sensitive, which precludes the sharing of data between researchers and industrial organizations.
We introduce Time Series Generative Modeling (TSGM), an open-source framework for the generative modeling of synthetic time series.
arXiv Detail & Related papers (2023-05-19T10:11:21Z)
- Time-series Transformer Generative Adversarial Networks [5.254093731341154]
We consider limitations posed specifically on time-series data and present a model that can generate synthetic time-series.
A model that generates synthetic time-series data has two objectives: 1) to capture the stepwise conditional distribution of real sequences, and 2) to faithfully model the joint distribution of entire real sequences.
We present TsT-GAN, a framework that capitalises on the Transformer architecture to satisfy the desiderata and compare its performance against five state-of-the-art models on five datasets.
arXiv Detail & Related papers (2022-05-23T10:04:21Z)
- DATGAN: Integrating expert knowledge into deep learning for synthetic tabular data [0.0]
Synthetic data can be used in various applications, such as correcting biased datasets or replacing scarce original data for simulation purposes.
Deep learning models are data-driven and it is difficult to control the generation process.
This article presents the Directed Acyclic Tabular GAN (DATGAN) to address these limitations.
arXiv Detail & Related papers (2022-03-07T16:09:03Z)
- TTS-GAN: A Transformer-based Time-Series Generative Adversarial Network [4.989480853499916]
Time-series data is one of the most common types of data used in medical machine learning applications.
We introduce TTS-GAN, a transformer-based GAN which can successfully generate realistic synthetic time-series data sequences.
We use visualizations and dimensionality reduction techniques to demonstrate the similarity of real and generated time-series data.
arXiv Detail & Related papers (2022-02-06T03:05:47Z)
- Towards Similarity-Aware Time-Series Classification [51.2400839966489]
We study time-series classification (TSC), a fundamental task of time-series data mining.
We propose Similarity-Aware Time-Series Classification (SimTSC), a framework that models similarity information with graph neural networks (GNNs).
arXiv Detail & Related papers (2022-01-05T02:14:57Z)
- Towards Synthetic Multivariate Time Series Generation for Flare Forecasting [5.098461305284216]
One of the limiting factors in training data-driven, rare-event prediction algorithms is the scarcity of the events of interest.
In this study, we explore the usefulness of the conditional generative adversarial network (CGAN) as a means to perform data-informed oversampling.
arXiv Detail & Related papers (2021-05-16T22:23:23Z)
- Learning summary features of time series for likelihood free inference [93.08098361687722]
We present a data-driven strategy for automatically learning summary features from time series data.
Our results indicate that learning summary features from data can compete with and even outperform LFI methods based on hand-crafted values.
arXiv Detail & Related papers (2020-12-04T19:21:37Z)