Long-Term Missing Value Imputation for Time Series Data Using Deep
Neural Networks
- URL: http://arxiv.org/abs/2202.12441v1
- Date: Fri, 25 Feb 2022 00:29:30 GMT
- Title: Long-Term Missing Value Imputation for Time Series Data Using Deep
Neural Networks
- Authors: Jangho Park, Juliane Muller, Bhavna Arora, Boris Faybishenko, Gilberto
Pastorello, Charuleka Varadharajan, Reetik Sahu, Deborah Agarwal
- Abstract summary: We present an approach that uses a deep learning model, in particular, a MultiLayer Perceptron (MLP) for estimating the missing values of a variable.
We focus on filling a long continuous gap rather than filling individual randomly missing observations.
Our approach enables the use of datasets that have a large gap in one variable, which is common in many long-term environmental monitoring observations.
- Score: 1.2019888796331233
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We present an approach that uses a deep learning model, in particular, a
MultiLayer Perceptron (MLP), for estimating the missing values of a variable in
multivariate time series data. We focus on filling a long continuous gap (e.g.,
multiple months of missing daily observations) rather than on individual
randomly missing observations. Our proposed gap-filling algorithm uses an
automated method to determine the MLP architecture that yields the best
prediction performance for the given time series. We
tested our approach by filling gaps of various lengths (three months to three
years) in three environmental datasets with different time series
characteristics, namely daily groundwater levels, daily soil moisture, and
hourly Net Ecosystem Exchange. We compared the accuracy of the gap-filled
values obtained with our approach to the widely used R-based time series
gap-filling methods imputeTS and mtsdi. The results indicate that using an MLP for
filling a large gap leads to better results, especially when the data behave
nonlinearly. Thus, our approach enables the use of datasets that have a large
gap in one variable, which is common in many long-term environmental monitoring
observations.
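The authors' implementation is not reproduced on this page; the following is a minimal sketch of the general idea in Python, using scikit-learn's MLPRegressor with a small grid search standing in for the paper's automated architecture selection. The synthetic data, gap position, and candidate layer sizes are illustrative assumptions.

```python
# Sketch of MLP-based gap filling for one variable of a multivariate
# series, assuming the other variables are fully observed. The grid
# search is a stand-in for the paper's automated architecture selection,
# not the authors' actual procedure.
import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.neural_network import MLPRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
n = 1000
t = np.arange(n)
# Two synthetic predictor variables and one nonlinear target (illustrative).
x = np.column_stack([np.sin(2 * np.pi * t / 365), np.cos(2 * np.pi * t / 30)])
y = 3.0 * x[:, 0] ** 2 + x[:, 1] + 0.1 * rng.standard_normal(n)

# A long continuous gap in the target variable (e.g., months of daily data).
gap = slice(600, 800)
observed = np.ones(n, dtype=bool)
observed[gap] = False

search = GridSearchCV(
    make_pipeline(StandardScaler(), MLPRegressor(max_iter=2000, random_state=0)),
    param_grid={"mlpregressor__hidden_layer_sizes": [(32,), (64,), (64, 32)]},
    cv=3,
)
search.fit(x[observed], y[observed])

# Fill the gap with MLP predictions from the co-observed variables.
y_filled = y.copy()
y_filled[gap] = search.predict(x[gap])
```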
Related papers
- Multi-Scale Dilated Convolution Network for Long-Term Time Series Forecasting [17.132063819650355]
We propose a Multi-Scale Dilated Convolution Network (MSDCN) to capture the period and trend characteristics of long time series.
We design different convolution blocks with exponentially growing dilations and varying kernel sizes to sample time series data at different scales (a sketch of this idea follows this entry).
To validate the effectiveness of the proposed approach, we conduct experiments on eight challenging long-term time series forecasting benchmark datasets.
arXiv Detail & Related papers (2024-05-09T02:11:01Z)
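A hedged sketch of the exponentially growing dilation idea from the MSDCN entry above, written in PyTorch. The branch count, kernel sizes, channel widths, and concatenation across branches are illustrative assumptions, not the authors' exact design.

```python
# Parallel Conv1d branches with exponentially growing dilations (1, 2, 4)
# and varying kernel sizes, concatenated along the channel axis.
import torch
import torch.nn as nn

class MultiScaleDilatedBlock(nn.Module):
    def __init__(self, in_ch=1, out_ch=8, kernel_sizes=(3, 5, 7)):
        super().__init__()
        self.branches = nn.ModuleList(
            nn.Conv1d(in_ch, out_ch, k, dilation=2 ** i,
                      padding=(k - 1) * 2 ** i // 2)  # keeps the length fixed
            for i, k in enumerate(kernel_sizes)
        )

    def forward(self, x):  # x: (batch, channels, time)
        return torch.cat([b(x) for b in self.branches], dim=1)

block = MultiScaleDilatedBlock()
out = block(torch.randn(4, 1, 96))  # -> (4, 24, 96)
```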
- Graph Spatiotemporal Process for Multivariate Time Series Anomaly Detection with Missing Values [67.76168547245237]
We introduce a novel framework called GST-Pro, which utilizes a graph spatiotemporal process and an anomaly scorer to detect anomalies.
Our experimental results show that the GST-Pro method can effectively detect anomalies in time series data and outperforms state-of-the-art methods.
arXiv Detail & Related papers (2024-01-11T10:10:16Z)
- Robust Detection of Lead-Lag Relationships in Lagged Multi-Factor Models [61.10851158749843]
Key insights can be obtained by discovering lead-lag relationships inherent in the data.
We develop a clustering-driven methodology for robust detection of lead-lag relationships in lagged multi-factor models (a simplified sketch of the lead-lag primitive follows this entry).
arXiv Detail & Related papers (2023-05-11T10:30:35Z)
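The clustering-driven method above is more involved than a pairwise statistic; the sketch below only illustrates the underlying lead-lag primitive, estimating the shift that maximizes cross-correlation between two series. The function name and toy data are assumptions.

```python
# Simplified lead-lag primitive: find the shift of series b that best
# correlates with series a. The paper's clustering-driven robustification
# is not reproduced here.
import numpy as np

def best_lag(a, b, max_lag=20):
    """Return the lag k maximizing corr(a[t], b[t-k]); k > 0 means b leads a."""
    a = (a - a.mean()) / a.std()
    b = (b - b.mean()) / b.std()

    def corr_at(k):
        if k >= 0:
            x, y = a[k:], b[:len(b) - k]
        else:
            x, y = a[:k], b[-k:]
        return float(np.mean(x * y))

    return max(range(-max_lag, max_lag + 1), key=corr_at)

rng = np.random.default_rng(1)
lead = rng.standard_normal(500)
follower = np.roll(lead, 5) + 0.1 * rng.standard_normal(500)  # lags by 5 steps
print(best_lag(follower, lead))  # expected: 5
```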
- Generative Time Series Forecasting with Diffusion, Denoise, and Disentanglement [51.55157852647306]
Time series forecasting has been a widely explored task of great importance in many applications.
Real-world time series are often recorded over short periods, which leaves a large gap between the capacity of deep models and the limited, noisy data available.
We address the time series forecasting problem with generative modeling, proposing a bidirectional variational auto-encoder equipped with diffusion, denoise, and disentanglement.
arXiv Detail & Related papers (2023-01-08T12:20:46Z)
- Grouped self-attention mechanism for a memory-efficient Transformer [64.0125322353281]
Real-world tasks such as forecasting weather, electricity consumption, and stock prices involve predicting data that vary over time.
Time-series data are generally recorded as long sequences over extended observation periods, owing to their periodic characteristics and long-range dependencies over time.
We propose two novel modules, Grouped Self-Attention (GSA) and Compressed Cross-Attention (CCA); the grouping idea is sketched after this entry.
Our proposed model exhibits reduced computational complexity and performance comparable to or better than that of existing methods.
arXiv Detail & Related papers (2022-10-02T06:58:49Z)
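A hedged sketch of the grouping idea behind GSA: restricting self-attention to fixed-size groups along the time axis cuts the quadratic cost in sequence length. Learned projections and the CCA module are omitted, and all sizes are illustrative assumptions; the paper's exact modules may differ.

```python
# Attention restricted to fixed-size groups along time, reducing cost
# from O(L^2) to O(L * g). Illustrates the grouping idea only.
import math
import torch

def grouped_self_attention(x, group_size):
    """x: (batch, length, dim) with length divisible by group_size."""
    b, l, d = x.shape
    g = x.reshape(b, l // group_size, group_size, d)
    scores = g @ g.transpose(-1, -2) / math.sqrt(d)  # (b, n_groups, g, g)
    return (scores.softmax(dim=-1) @ g).reshape(b, l, d)

out = grouped_self_attention(torch.randn(2, 96, 32), group_size=8)
print(out.shape)  # torch.Size([2, 96, 32])
```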
- Deep Generative model with Hierarchical Latent Factors for Time Series Anomaly Detection [40.21502451136054]
This work presents DGHL, a new family of generative models for time series anomaly detection.
A top-down Convolution Network maps a novel hierarchical latent space to time series windows, exploiting temporal dynamics to encode information efficiently.
Our method outperformed current state-of-the-art models on four popular benchmark datasets.
arXiv Detail & Related papers (2022-02-15T17:19:44Z)
- Time Series Anomaly Detection by Cumulative Radon Features [32.36217153362305]
In this work, we argue that shallow features suffice when combined with distribution distance measures.
Our approach models each time series as a high dimensional empirical distribution of features, where each time-point constitutes a single sample.
We show that by parameterizing each time series using cumulative Radon features, we are able to efficiently and effectively model the distribution of normal time series (a sketch of the embedding follows this entry).
arXiv Detail & Related papers (2022-02-08T18:58:53Z)
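A hedged sketch of the cumulative-Radon idea above: each series is treated as an empirical distribution of per-time-point feature vectors and embedded by the sorted values of random 1D projections, so Euclidean distance between embeddings approximates a sliced Wasserstein distance. The feature choice and projection count are illustrative assumptions.

```python
# Embed an empirical distribution by per-direction sorted projections
# (quantiles); comparing embeddings approximates sliced Wasserstein.
import numpy as np

def cumulative_radon_embedding(samples, n_projections=64, seed=0):
    """samples: (n_points, dim) empirical distribution -> flat embedding."""
    rng = np.random.default_rng(seed)  # same seed -> comparable embeddings
    dirs = rng.standard_normal((samples.shape[1], n_projections))
    dirs /= np.linalg.norm(dirs, axis=0)  # unit projection directions
    proj = samples @ dirs                 # (n_points, n_projections)
    return np.sort(proj, axis=0).ravel()  # per-direction quantiles

rng = np.random.default_rng(2)
normal = cumulative_radon_embedding(rng.standard_normal((200, 4)))
shifted = cumulative_radon_embedding(rng.standard_normal((200, 4)) + 3.0)
print(np.linalg.norm(normal - shifted))  # large distance flags the anomaly
```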
- Cluster-and-Conquer: A Framework For Time-Series Forecasting [94.63501563413725]
We propose a three-stage framework for forecasting high-dimensional time-series data.
Our framework is highly general, allowing for any time-series forecasting and clustering method to be used in each step.
When instantiated with simple linear autoregressive models, we are able to achieve state-of-the-art results on several benchmark datasets (a sketch of the pipeline follows this entry).
arXiv Detail & Related papers (2021-10-26T20:41:19Z)
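A hedged sketch of a cluster-then-forecast pipeline in the spirit of the three-stage framework above, using KMeans and a simple AR(1) forecaster. The framework itself allows any clustering and forecasting methods, and its third stage is richer than the naive assignment shown here.

```python
# (1) cluster series, (2) forecast each cluster's mean with AR(1),
# (3) assign the cluster forecast back to member series.
import numpy as np
from sklearn.cluster import KMeans

def ar1_forecast(series):
    """One-step AR(1) forecast via least squares (no intercept)."""
    x, y = series[:-1], series[1:]
    phi = np.dot(x, y) / np.dot(x, x)
    return phi * series[-1]

rng = np.random.default_rng(3)
data = np.cumsum(rng.standard_normal((50, 120)), axis=1)  # 50 series, 120 steps

labels = KMeans(n_clusters=5, n_init=10, random_state=0).fit_predict(data)
forecasts = np.empty(len(data))
for c in range(5):
    members = labels == c
    if members.any():
        forecasts[members] = ar1_forecast(data[members].mean(axis=0))
print(forecasts[:5])
```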
- Deep Cellular Recurrent Network for Efficient Analysis of Time-Series Data with Spatial Information [52.635997570873194]
This work proposes a novel deep cellular recurrent neural network (DCRNN) architecture to process complex multi-dimensional time series data with spatial information.
The proposed architecture achieves state-of-the-art performance while using substantially fewer trainable parameters than comparable methods in the literature.
arXiv Detail & Related papers (2021-01-12T20:08:18Z)
- Rainfall-Runoff Prediction at Multiple Timescales with a Single Long Short-Term Memory Network [41.33870234564485]
Long Short-Term Memory Networks (LSTMs) have been applied to daily discharge prediction with remarkable success.
Many practical scenarios, however, require predictions at more granular timescales.
In this study, we propose two Multi-Timescale LSTM (MTS-LSTM) architectures that jointly predict multiple timescales within one model (a simplified sketch follows this entry).
We test these models on 516 basins across the continental United States and benchmark against the US National Water Model.
arXiv Detail & Related papers (2020-10-15T17:52:16Z)
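A simplified sketch of the multi-timescale idea from the MTS-LSTM entry above: a coarse (daily) LSTM hands its final states to a finer (hourly) branch so one model predicts both timescales. The published architectures transfer states through learned layers and handle inputs differently; all sizes and names here are illustrative assumptions.

```python
# One model, two timescales: the hourly branch is warm-started from the
# daily branch's final LSTM states.
import torch
import torch.nn as nn

class TwoTimescaleLSTM(nn.Module):
    def __init__(self, n_feat=5, hidden=32):
        super().__init__()
        self.daily = nn.LSTM(n_feat, hidden, batch_first=True)
        self.hourly = nn.LSTM(n_feat, hidden, batch_first=True)
        self.head_d = nn.Linear(hidden, 1)
        self.head_h = nn.Linear(hidden, 1)

    def forward(self, x_daily, x_hourly):
        out_d, (h, c) = self.daily(x_daily)
        out_h, _ = self.hourly(x_hourly, (h, c))  # warm-start from daily states
        return self.head_d(out_d), self.head_h(out_h)

model = TwoTimescaleLSTM()
q_daily, q_hourly = model(torch.randn(8, 30, 5), torch.randn(8, 24, 5))
print(q_daily.shape, q_hourly.shape)  # (8, 30, 1) (8, 24, 1)
```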
- Multivariate Time-series Anomaly Detection via Graph Attention Network [27.12694738711663]
Anomaly detection on multivariate time-series is of great importance in both data mining research and industrial applications.
One major limitation of existing methods is that they do not explicitly capture the relationships between different time series.
We propose a novel self-supervised framework for multivariate time-series anomaly detection to address this issue.
arXiv Detail & Related papers (2020-09-04T07:46:19Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.