RDIS: Random Drop Imputation with Self-Training for Incomplete Time Series Data
- URL: http://arxiv.org/abs/2010.10075v1
- Date: Tue, 20 Oct 2020 07:04:25 GMT
- Title: RDIS: Random Drop Imputation with Self-Training for Incomplete Time Series Data
- Authors: Tae-Min Choi, Ji-Su Kang, Jong-Hwan Kim
- Abstract summary: This paper proposes Random Drop Imputation with Self-training (RDIS), a novel training method for imputation networks for incomplete time-series data.
In RDIS, extra missing values are introduced by applying a random drop to the given incomplete data.
Also, self-training is introduced to exploit the original missing values without ground truth.
- Score: 5.762908115928466
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Time-series data with missing values are commonly encountered in
many fields such as finance, meteorology, and robotics. Imputation is an
intrinsic method for handling such missing values. In previous research, most
imputation networks were trained implicitly on incomplete time-series data
because missing values have no ground truth. This paper proposes Random Drop
Imputation with Self-training (RDIS), a novel training method for imputation
networks on incomplete time-series data. In RDIS, extra missing values are
introduced by applying a random drop to the given incomplete data, so that the
imputation network can learn explicitly by imputing the randomly dropped
values. In addition, self-training is introduced to exploit the original
missing values, which have no ground truth. To verify the effectiveness of RDIS
on imputation tasks, we graft RDIS onto a bidirectional GRU and achieve
state-of-the-art results on two real-world datasets, an air quality dataset and
a gas sensor dataset, with margins of 7.9% and 5.8%, respectively.
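The core of the random-drop step described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name, array shapes, and `drop_rate` value are assumptions chosen for the example. The key idea is that artificially dropped entries were originally observed, so their ground-truth values are available as explicit training targets.

```python
import numpy as np

rng = np.random.default_rng(0)

def random_drop(x, observed_mask, drop_rate=0.25):
    """Apply a random drop to the observed entries of an incomplete series.

    x: (T, D) array with NaN at originally missing positions.
    observed_mask: boolean (T, D), True where x is observed.
    Returns the dropped series, the drop mask (True where artificially
    removed), and the held-out ground-truth values at those positions.
    """
    # Only originally observed entries can be dropped.
    drop_mask = observed_mask & (rng.random(x.shape) < drop_rate)
    x_dropped = x.copy()
    x_dropped[drop_mask] = np.nan   # extra missing values
    targets = x[drop_mask]          # ground truth for an explicit loss
    return x_dropped, drop_mask, targets

# Toy incomplete series: 8 timesteps, 2 features, two original NaNs.
x = rng.normal(size=(8, 2))
x[1, 0] = np.nan
x[5, 1] = np.nan
observed = ~np.isnan(x)

x_drop, drop_mask, targets = random_drop(x, observed)

# Dropped positions were observed before, so an imputation network can be
# trained with an explicit reconstruction loss on `targets`; the original
# missing entries (still NaN) are left for the self-training stage.
assert np.all(observed[drop_mask])
assert np.all(np.isnan(x_drop[drop_mask]))
```

In the full method, predictions at the originally missing positions would additionally be recycled as pseudo-labels during self-training; that loop is omitted here.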
Related papers
- ImputeINR: Time Series Imputation via Implicit Neural Representations for Disease Diagnosis with Missing Data [12.517524696964319]
ImputeINR is a novel approach to time series imputation.
It generates fine-grained imputations even on extremely sparse observed values.
arXiv Detail & Related papers (2025-05-16T04:50:15Z)
- DispFormer: Pretrained Transformer for Flexible Dispersion Curve Inversion from Global Synthesis to Regional Applications [59.488352977043974]
This study proposes DispFormer, a transformer-based neural network for inverting the $v_s$ profile from Rayleigh-wave phase and group dispersion curves.
Results indicate that zero-shot DispFormer, even without any labeled data, produces inversion profiles that match well with the ground truth.
arXiv Detail & Related papers (2025-01-08T09:08:24Z)
- Time Series Imputation with Multivariate Radial Basis Function Neural Network [1.6804613362826175]
We propose a time series imputation model based on the Radial Basis Function Neural Network (RBFNN).
Our imputation model learns local information from timestamps to create a continuous function.
We propose an extension called the Missing Value Imputation Recurrent Neural Network with Continuous Function (MIRNN-CF) using the continuous function generated by MIM-RBFNN.
arXiv Detail & Related papers (2024-07-24T07:02:16Z)
- DiffPuter: Empowering Diffusion Models for Missing Data Imputation [56.48119008663155]
This paper introduces DiffPuter, a tailored diffusion model combined with the Expectation-Maximization (EM) algorithm for missing data imputation.
Our theoretical analysis shows that DiffPuter's training step corresponds to the maximum likelihood estimation of data density.
Our experiments show that DiffPuter achieves an average improvement of 6.94% in MAE and 4.78% in RMSE compared to the most competitive existing method.
arXiv Detail & Related papers (2024-05-31T08:35:56Z)
- Filling out the missing gaps: Time Series Imputation with Semi-Supervised Learning [7.8379910349669]
We propose a semi-supervised imputation method, ST-Impute, that uses unlabeled data together with the downstream task's labeled data.
ST-Impute is based on sparse self-attention and trains on tasks that mimic the imputation process.
arXiv Detail & Related papers (2023-04-09T16:38:47Z)
- STING: Self-attention based Time-series Imputation Networks using GAN [4.052758394413726]
STING (Self-attention based Time-series Imputation Networks using GAN) is proposed.
We take advantage of generative adversarial networks and bidirectional recurrent neural networks to learn latent representations of the time series.
Experimental results on three real-world datasets demonstrate that STING outperforms the existing state-of-the-art methods in terms of imputation accuracy.
arXiv Detail & Related papers (2022-09-22T06:06:56Z)
- Reduced Robust Random Cut Forest for Out-Of-Distribution detection in machine learning models [0.799536002595393]
Most machine learning-based regressors extract information from data collected via past observations of limited length to make predictions in the future.
When the input to these trained models has statistical properties significantly different from the training data, accurate prediction is not guaranteed.
We introduce a novel approach for this detection process using a Reduced Robust Random Cut Forest data structure.
arXiv Detail & Related papers (2022-06-18T17:01:40Z)
- Networked Time Series Prediction with Incomplete Data [59.45358694862176]
We propose NETS-ImpGAN, a novel deep learning framework that can be trained on incomplete data with missing values in both history and future.
We conduct extensive experiments on three real-world datasets under different missing patterns and missing rates.
arXiv Detail & Related papers (2021-10-05T18:20:42Z)
- MLReal: Bridging the gap between training on synthetic data and real data applications in machine learning [1.9852463786440129]
We describe a novel approach to enhance supervised training on synthetic data with real data features.
In the training stage, the input data are from the synthetic domain and the auto-correlated data are from the real domain.
In the inference/application stage, the input data are from the real subset domain and the mean of the autocorrelated sections is from the synthetic data subset domain.
arXiv Detail & Related papers (2021-09-11T14:43:34Z)
- Imputation-Free Learning from Incomplete Observations [73.15386629370111]
We introduce the importance-guided stochastic gradient descent (IGSGD) method to train inference from inputs containing missing values without imputation.
We employ reinforcement learning (RL) to adjust the gradients used to train the models via back-propagation.
Our imputation-free predictions outperform the traditional two-step imputation-based predictions using state-of-the-art imputation methods.
arXiv Detail & Related papers (2021-07-05T12:44:39Z)
- Time-Series Imputation with Wasserstein Interpolation for Optimal Look-Ahead-Bias and Variance Tradeoff [66.59869239999459]
In finance, imputation of missing returns may be applied prior to training a portfolio optimization model.
There is an inherent trade-off between the look-ahead-bias of using the full data set for imputation and the larger variance in the imputation from using only the training data.
We propose a Bayesian posterior consensus distribution which optimally controls the variance and look-ahead-bias trade-off in the imputation.
arXiv Detail & Related papers (2021-02-25T09:05:35Z)
- HYDRA: Hypergradient Data Relevance Analysis for Interpreting Deep Neural Networks [51.143054943431665]
We propose Hypergradient Data Relevance Analysis, or HYDRA, which interprets predictions made by deep neural networks (DNNs) as effects of their training data.
HYDRA assesses the contribution of training data toward test data points throughout the training trajectory.
In addition, we quantitatively demonstrate that HYDRA outperforms influence functions in accurately estimating data contribution and detecting noisy data labels.
arXiv Detail & Related papers (2021-02-04T10:00:13Z)
- DeepRite: Deep Recurrent Inverse TreatmEnt Weighting for Adjusting Time-varying Confounding in Modern Longitudinal Observational Data [68.29870617697532]
We propose Deep Recurrent Inverse TreatmEnt weighting (DeepRite) for time-varying confounding in longitudinal data.
DeepRite is shown to recover the ground truth from synthetic data, and estimate unbiased treatment effects from real data.
arXiv Detail & Related papers (2020-10-28T15:05:08Z)
- Uncertainty Estimation Using a Single Deep Deterministic Neural Network [66.26231423824089]
We propose a method for training a deterministic deep model that can find and reject out of distribution data points at test time with a single forward pass.
We scale training with a novel loss function and centroid-updating scheme, matching the accuracy of softmax models.
arXiv Detail & Related papers (2020-03-04T12:27:36Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information it provides and is not responsible for any consequences of its use.