D3A-TS: Denoising-Driven Data Augmentation in Time Series
- URL: http://arxiv.org/abs/2312.05550v1
- Date: Sat, 9 Dec 2023 11:37:07 GMT
- Title: D3A-TS: Denoising-Driven Data Augmentation in Time Series
- Authors: David Solis-Martin, Juan Galan-Paez, Joaquin Borrego-Diaz
- Abstract summary: This work studies and analyzes different data augmentation techniques for time series classification and regression problems.
The proposed approach applies diffusion probabilistic models, which have recently achieved strong results in Image Processing.
The results highlight the usefulness of this methodology for creating synthetic data to train classification and regression models.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: It has been demonstrated that the amount of data is crucial in
data-driven machine learning methods. Data is always valuable, but in some
tasks it is almost like gold. This is the case in engineering areas where data
is scarce or very expensive to obtain, such as predictive maintenance, where
faults are rare. In this context, a mechanism for generating synthetic data
can be very useful. While synthetic data generation has been extensively
explored, with promising results, in fields such as Computer Vision and
Natural Language Processing, it has received less attention in other domains
such as time series. This work studies and analyzes different data
augmentation techniques for time series classification and regression
problems. The proposed approach applies diffusion probabilistic models, which
have recently achieved strong results in Image Processing, to data
augmentation in time series. Additionally, the use of meta-attributes to
condition the augmentation process is investigated. The results highlight the
usefulness of this methodology for creating synthetic data to train
classification and regression models. To assess the results, six datasets
from diverse domains were employed, showcasing versatility in input size and
output type. Finally, an extensive ablation study is conducted to further
support the obtained outcomes.
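To make the approach concrete, the following is a minimal sketch of DDPM-style denoising augmentation for 1-D time series, with a vector of meta-attributes concatenated into the denoiser as conditioning. The abstract does not include code, so everything here is an illustrative assumption: the network architecture, the linear noise schedule, and all names (SeriesDenoiser, meta_dim, etc.) are hypothetical.

```python
# Minimal sketch (assumed, not the authors' code) of diffusion-based
# augmentation for 1-D time series, conditioned on meta-attributes.
import torch
import torch.nn as nn

T = 200                                   # number of diffusion steps (assumed)
betas = torch.linspace(1e-4, 0.02, T)     # linear noise schedule
alphas_bar = torch.cumprod(1.0 - betas, dim=0)

class SeriesDenoiser(nn.Module):
    """Predicts the noise added to a series, given the step and meta-attributes."""
    def __init__(self, series_len, meta_dim, hidden=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(series_len + meta_dim + 1, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, series_len),
        )

    def forward(self, x_t, t, meta):
        t_emb = t.float().unsqueeze(1) / T          # crude scalar step embedding
        return self.net(torch.cat([x_t, meta, t_emb], dim=1))

def train_step(model, opt, x0, meta):
    """One denoising step: corrupt x0 with noise, train the model to predict it."""
    t = torch.randint(0, T, (x0.size(0),))
    eps = torch.randn_like(x0)
    a = alphas_bar[t].unsqueeze(1)
    x_t = a.sqrt() * x0 + (1 - a).sqrt() * eps      # forward diffusion
    loss = ((model(x_t, t, meta) - eps) ** 2).mean()
    opt.zero_grad(); loss.backward(); opt.step()
    return loss.item()

@torch.no_grad()
def sample(model, n, series_len, meta):
    """Reverse diffusion: start from pure noise and denoise step by step."""
    x = torch.randn(n, series_len)
    for ti in reversed(range(T)):
        t = torch.full((n,), ti, dtype=torch.long)
        eps_hat = model(x, t, meta)
        beta, a_bar = betas[ti], alphas_bar[ti]
        x = (x - beta / (1 - a_bar).sqrt() * eps_hat) / (1 - beta).sqrt()
        if ti > 0:
            x = x + beta.sqrt() * torch.randn_like(x)
    return x

# Hypothetical usage: x0 is a (batch, 96) tensor of real series, meta a
# (batch, 4) tensor of conditioning attributes.
# model = SeriesDenoiser(series_len=96, meta_dim=4)
# opt = torch.optim.Adam(model.parameters(), lr=1e-3)
# synthetic = sample(model, n=32, series_len=96, meta=meta[:32])
```

In this sketch the meta-attributes could be class labels or simple statistical descriptors of each series, and the sampled output would be mixed into the training set of the downstream classifier or regressor.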
Related papers
- MDM: Advancing Multi-Domain Distribution Matching for Automatic Modulation Recognition Dataset Synthesis [35.07663680944459]
Deep learning technology has been successfully introduced into Automatic Modulation Recognition (AMR) tasks.
The success of deep learning is largely attributed to training on large-scale datasets.
To address the need for large amounts of data, some researchers have proposed data distillation methods.
arXiv Detail & Related papers (2024-08-05T14:16:54Z) - Generative Expansion of Small Datasets: An Expansive Graph Approach [13.053285552524052]
We introduce an Expansive Synthesis model that generates large-scale, information-rich datasets from minimal samples.
An autoencoder with self-attention layers and optimal transport refines distributional consistency.
Results show comparable performance, demonstrating the model's potential to augment training data effectively.
arXiv Detail & Related papers (2024-06-25T02:59:02Z) - Stochastic Amortization: A Unified Approach to Accelerate Feature and Data Attribution [62.71425232332837]
We show that training amortized models with noisy labels is inexpensive and surprisingly effective.
This approach significantly accelerates several feature attribution and data valuation methods, often yielding an order of magnitude speedup over existing approaches.
arXiv Detail & Related papers (2024-01-29T03:42:37Z) - Synthetic data, real errors: how (not) to publish and use synthetic data [86.65594304109567]
We show how the generative process affects the downstream ML task.
We introduce Deep Generative Ensemble (DGE) to approximate the posterior distribution over the generative process model parameters.
arXiv Detail & Related papers (2023-05-16T07:30:29Z) - On Inductive Biases for Machine Learning in Data Constrained Settings [0.0]
This thesis explores a different answer to the problem of learning expressive models in data-constrained settings.
Instead of relying on big datasets to learn neural networks, we replace some modules with known functions reflecting the structure of the data.
Our approach falls under the umbrella of "inductive biases", which can be defined as hypotheses about the data at hand that restrict the space of models to explore.
arXiv Detail & Related papers (2023-02-21T14:22:01Z) - Time-Varying Propensity Score to Bridge the Gap between the Past and Present [104.46387765330142]
We introduce a time-varying propensity score that can detect gradual shifts in the distribution of data.
We demonstrate different ways of implementing it and evaluate it on a variety of problems.
arXiv Detail & Related papers (2022-10-04T07:21:49Z) - Automatic Data Augmentation via Invariance-Constrained Learning [94.27081585149836]
Underlying data structures are often exploited to improve the solution of learning tasks.
Data augmentation induces these symmetries during training by applying multiple transformations to the input data.
This work tackles these issues by automatically adapting the data augmentation while solving the learning task.
arXiv Detail & Related papers (2022-09-29T18:11:01Z) - Data Augmentation techniques in time series domain: A survey and taxonomy [0.20971479389679332]
Deep neural networks that work with time series depend heavily on the size and consistency of the datasets used in training.
This work systematically reviews the current state-of-the-art in the area to provide an overview of all available algorithms.
The ultimate aim of this study is to provide a summary of the evolution and performance of areas that produce better results to guide future researchers in this field.
arXiv Detail & Related papers (2022-06-25T17:09:00Z) - Invariance Learning in Deep Neural Networks with Differentiable Laplace Approximations [76.82124752950148]
We develop a convenient gradient-based method for selecting the data augmentation.
We use a differentiable Kronecker-factored Laplace approximation to the marginal likelihood as our objective.
arXiv Detail & Related papers (2022-02-22T02:51:11Z) - Towards Synthetic Multivariate Time Series Generation for Flare Forecasting [5.098461305284216]
One of the limiting factors in training data-driven, rare-event prediction algorithms is the scarcity of the events of interest.
In this study, we explore the usefulness of the conditional generative adversarial network (CGAN) as a means to perform data-informed oversampling.
arXiv Detail & Related papers (2021-05-16T22:23:23Z) - Deep Cellular Recurrent Network for Efficient Analysis of Time-Series Data with Spatial Information [52.635997570873194]
This work proposes a novel deep cellular recurrent neural network (DCRNN) architecture to process complex multi-dimensional time series data with spatial information.
The proposed architecture achieves state-of-the-art performance while using substantially fewer trainable parameters than comparable methods in the literature.
arXiv Detail & Related papers (2021-01-12T20:08:18Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.