Probabilistic Imputation for Time-series Classification with Missing
Data
- URL: http://arxiv.org/abs/2308.06738v1
- Date: Sun, 13 Aug 2023 10:04:13 GMT
- Title: Probabilistic Imputation for Time-series Classification with Missing
Data
- Authors: SeungHyun Kim, Hyunsu Kim, EungGu Yun, Hwangrae Lee, Jaehun Lee, Juho
Lee
- Abstract summary: We propose a novel framework for classification with time series data with missing values.
Our deep generative model part is trained to impute the missing values in multiple plausible ways.
The classifier part takes the time series data along with the imputed missing values and classifies signals.
- Score: 17.956329906475084
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Multivariate time series data for real-world applications typically contain a
significant amount of missing values. The dominant approach for classification
with such missing values is to impute them heuristically with specific values
(zero, mean, values of adjacent time-steps) or learnable parameters. However,
these simple strategies do not take the data generative process into account,
and more importantly, do not effectively capture the uncertainty in prediction
due to the multiple possibilities for the missing values. In this paper, we
propose a novel probabilistic framework for classification with multivariate
time series data with missing values. Our model consists of two parts: a deep
generative model for missing value imputation and a classifier. Extending the
existing deep generative models to better capture structures of time-series
data, our deep generative model part is trained to impute the missing values in
multiple plausible ways, effectively modeling the uncertainty of the
imputation. The classifier part takes the time series data along with the
imputed missing values and classifies signals, and is trained to capture the
predictive uncertainty due to the multiple possibilities of imputations.
Importantly, we show that naïvely combining the generative model and the
classifier could result in trivial solutions where the generative model does
not produce meaningful imputations. To resolve this, we present a novel
regularization technique that can promote the model to produce useful
imputation values that help classification. Through extensive experiments on
real-world time series data with missing values, we demonstrate the
effectiveness of our method.
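The pipeline described in the abstract (draw several plausible imputations, classify each completed series, then combine the predictions) can be illustrated with a minimal sketch. This is not the authors' model. As assumptions for illustration only, a per-feature Gaussian fitted to the observed entries stands in for the deep generative model, and a fixed linear softmax classifier on time-averaged features stands in for the classifier. All names (sample_imputations, toy_classifier, predict_with_uncertainty) are hypothetical, and the sketch assumes every feature has at least one observed value.

```python
import numpy as np


def sample_imputations(x, mask, n_samples, rng):
    """Return n_samples completed copies of x, shape (n_samples, T, D).

    x:    (T, D) series; entries where mask is False are ignored (may be NaN).
    mask: (T, D) boolean array, True where a value was observed.
    Stand-in generative model (an assumption, not the paper's): an
    independent Gaussian per feature, fitted to that feature's observed values.
    """
    observed = np.where(mask, x, np.nan)
    mean = np.nanmean(observed, axis=0)          # (D,)
    std = np.nanstd(observed, axis=0) + 1e-6     # avoid a zero scale
    draws = rng.normal(mean, std, size=(n_samples,) + x.shape)
    # Keep observed values as they are; fill only the missing positions.
    return np.where(mask, x, draws)


def toy_classifier(x_filled, w, b):
    """Softmax over a linear map of time-averaged features: (S, T, D) -> (S, C)."""
    feats = x_filled.mean(axis=-2)               # (S, D)
    logits = feats @ w + b                       # (S, C)
    logits = logits - logits.max(axis=-1, keepdims=True)
    p = np.exp(logits)
    return p / p.sum(axis=-1, keepdims=True)


def predict_with_uncertainty(x, mask, w, b, n_samples=20, seed=0):
    """Average class probabilities over imputations and report their entropy."""
    rng = np.random.default_rng(seed)
    filled = sample_imputations(x, mask, n_samples, rng)
    probs = toy_classifier(filled, w, b)         # one prediction per imputation
    mean_probs = probs.mean(axis=0)              # Monte Carlo predictive average
    clipped = np.clip(mean_probs, 1e-12, 1.0)
    entropy = -np.sum(mean_probs * np.log(clipped))
    return mean_probs, entropy
```

Averaging over imputations, rather than classifying one filled-in series, is what lets the prediction reflect uncertainty about the missing values: when different imputations lead to different class predictions, the averaged distribution flattens and its entropy rises.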
Related papers
- An End-to-End Model for Time Series Classification In the Presence of Missing Values [25.129396459385873]
Time series classification with missing data is a prevalent issue in time series analysis.
This study proposes an end-to-end neural network that unifies data imputation and representation learning within a single framework.
arXiv Detail & Related papers (2024-08-11T19:39:12Z)
- Deep Ensembles Meets Quantile Regression: Uncertainty-aware Imputation for Time Series [45.76310830281876]
We propose Quantile Sub-Ensembles, a novel method to estimate uncertainty with ensemble of quantile-regression-based task networks.
Our method not only produces accurate imputations that are robust to high missing rates, but is also computationally efficient due to the fast training of its non-generative model.
arXiv Detail & Related papers (2023-12-03T05:52:30Z)
- ChiroDiff: Modelling chirographic data with Diffusion Models [132.5223191478268]
We introduce a powerful model-class namely "Denoising Diffusion Probabilistic Models" or DDPMs for chirographic data.
Our model named "ChiroDiff", being non-autoregressive, learns to capture holistic concepts and therefore remains resilient to higher temporal sampling rate.
arXiv Detail & Related papers (2023-04-07T15:17:48Z)
- STING: Self-attention based Time-series Imputation Networks using GAN [4.052758394413726]
STING (Self-attention based Time-series Imputation Networks using GAN) is proposed.
We take advantage of generative adversarial networks and bidirectional recurrent neural networks to learn latent representations of the time series.
Experimental results on three real-world datasets demonstrate that STING outperforms the existing state-of-the-art methods in terms of imputation accuracy.
arXiv Detail & Related papers (2022-09-22T06:06:56Z)
- Minimax rate of consistency for linear models with missing values [0.0]
Missing values arise in most real-world data sets due to the aggregation of multiple sources and intrinsically missing information (sensor failure, unanswered questions in surveys...).
In this paper, we focus on the extensively-studied linear models, but in presence of missing values, which turns out to be quite a challenging task.
This eventually requires solving a number of learning tasks that is exponential in the number of input features, which makes predictions impossible for current real-world datasets.
arXiv Detail & Related papers (2022-02-03T08:45:34Z)
- X-model: Improving Data Efficiency in Deep Learning with A Minimax Model [78.55482897452417]
We aim at improving data efficiency for both classification and regression setups in deep learning.
To take the power of both worlds, we propose a novel X-model.
X-model plays a minimax game between the feature extractor and task-specific heads.
arXiv Detail & Related papers (2021-10-09T13:56:48Z)
- CSDI: Conditional Score-based Diffusion Models for Probabilistic Time Series Imputation [107.63407690972139]
Conditional Score-based Diffusion models for Imputation (CSDI) is a novel time series imputation method that utilizes score-based diffusion models conditioned on observed data.
CSDI improves by 40-70% over existing probabilistic imputation methods on popular performance metrics.
In addition, CSDI reduces the error by 5-20% compared to the state-of-the-art deterministic imputation methods.
arXiv Detail & Related papers (2021-07-07T22:20:24Z)
- Evaluating State-of-the-Art Classification Models Against Bayes Optimality [106.50867011164584]
We show that we can compute the exact Bayes error of generative models learned using normalizing flows.
We use our approach to conduct a thorough investigation of state-of-the-art classification models.
arXiv Detail & Related papers (2021-06-07T06:21:20Z)
- Anomaly Detection of Time Series with Smoothness-Inducing Sequential Variational Auto-Encoder [59.69303945834122]
We present a Smoothness-Inducing Sequential Variational Auto-Encoder (SISVAE) model for robust estimation and anomaly detection of time series.
Our model parameterizes mean and variance for each time-stamp with flexible neural networks.
We show the effectiveness of our model on both synthetic datasets and public real-world benchmarks.
arXiv Detail & Related papers (2021-02-02T06:15:15Z)
- Goal-directed Generation of Discrete Structures with Conditional Generative Models [85.51463588099556]
We introduce a novel approach to directly optimize a reinforcement learning objective, maximizing an expected reward.
We test our methodology on two tasks: generating molecules with user-defined properties and identifying short python expressions which evaluate to a given target value.
arXiv Detail & Related papers (2020-10-05T20:03:13Z)
- Time-series Imputation and Prediction with Bi-Directional Generative Adversarial Networks [0.3162999570707049]
We present a model for the combined task of imputing and predicting values for irregularly observed and varying length time-series data with missing entries.
Our model learns how to impute missing elements in-between (imputation) or outside of the input time steps (prediction), hence working as an effective any-time prediction tool for time-series data.
arXiv Detail & Related papers (2020-09-18T15:47:51Z)
This list is automatically generated from the titles and abstracts of the papers on this site.