Related papers: Imputing Missing Observations with Time Sliced Synthetic Minority Oversampling Technique

Imputing Missing Observations with Time Sliced Synthetic Minority Oversampling Technique

URL: http://arxiv.org/abs/2201.05634v1
Date: Fri, 14 Jan 2022 19:23:24 GMT
Title: Imputing Missing Observations with Time Sliced Synthetic Minority Oversampling Technique
Authors: Andrew Baumgartner, Sevda Molani, Qi Wei and Jennifer Hadlock
Abstract summary: We present a simple yet novel time series imputation technique with the goal of constructing an irregular time series that is uniform across every sample in a data set. We fix a grid defined by the midpoints of non-overlapping bins (dubbed "slices") of observation times and ensure that each sample has values for all of the features at that given time. This allows one to both impute fully missing observations to allow uniform time series classification across the entire data and, in special cases, to impute individually missing features.
Score: 0.3973560285628012
License: http://creativecommons.org/licenses/by-nc-sa/4.0/
Abstract: We present a simple yet novel time series imputation technique with the goal of constructing an irregular time series that is uniform across every sample in a data set. Specifically, we fix a grid defined by the midpoints of non-overlapping bins (dubbed "slices") of observation times and ensure that each sample has values for all of the features at that given time. This allows one to both impute fully missing observations to allow uniform time series classification across the entire data and, in special cases, to impute individually missing features. To do so, we slightly generalize the well-known class imbalance algorithm SMOTE \cite{smote} to allow component wise nearest neighbor interpolation that preserves correlations when there are no missing features. We visualize the method in the simplified setting of 2-dimensional uncoupled harmonic oscillators. Next, we use tSMOTE to train an Encoder/Decoder long-short term memory (LSTM) model with Logistic Regression for predicting and classifying distinct trajectories of different 2D oscillators. After illustrating the the utility of tSMOTE in this context, we use the same architecture to train a clinical model for COVID-19 disease severity on an imputed data set. Our experiments show an improvement over standard mean and median imputation techniques by allowing a wider class of patient trajectories to be recognized by the model, as well as improvement over aggregated classification models.

Related papers

LSCD: Lomb-Scargle Conditioned Diffusion for Time series Imputation [55.800319453296886]
Time series with missing or irregularly sampled data are a persistent challenge in machine learning.<n>We introduce a different Lombiable--Scargle layer that enables a reliable computation of the power spectrum of irregularly sampled data.
arXiv Detail & Related papers (2025-06-20T14:48:42Z)
MFRS: A Multi-Frequency Reference Series Approach to Scalable and Accurate Time-Series Forecasting [51.94256702463408]
Time series predictability is derived from periodic characteristics at different frequencies. We propose a novel time series forecasting method based on multi-frequency reference series correlation analysis. Experiments on major open and synthetic datasets show state-of-the-art performance.
arXiv Detail & Related papers (2025-03-11T11:40:14Z)
TSCMamba: Mamba Meets Multi-View Learning for Time Series Classification [13.110156202816112]
We propose a novel multi-view approach to capture patterns with properties like shift equivariance. Our method integrates diverse features, including spectral, temporal, local, and global features, to obtain rich, complementary contexts for TSC. Our approach achieves average accuracy improvements of 4.01-6.45% and 7.93% respectively, over leading TSC models.
arXiv Detail & Related papers (2024-06-06T18:05:10Z)
Graph Spatiotemporal Process for Multivariate Time Series Anomaly Detection with Missing Values [67.76168547245237]
We introduce a novel framework called GST-Pro, which utilizes a graphtemporal process and anomaly scorer to detect anomalies. Our experimental results show that the GST-Pro method can effectively detect anomalies in time series data and outperforms state-of-the-art methods.
arXiv Detail & Related papers (2024-01-11T10:10:16Z)
Domain Adaptive Synapse Detection with Weak Point Annotations [63.97144211520869]
We present AdaSyn, a framework for domain adaptive synapse detection with weak point annotations. In the WASPSYN challenge at I SBI 2023, our method ranks the 1st place.
arXiv Detail & Related papers (2023-08-31T05:05:53Z)
TSI-GAN: Unsupervised Time Series Anomaly Detection using Convolutional Cycle-Consistent Generative Adversarial Networks [2.4469484645516837]
Anomaly detection is widely used in network intrusion detection, autonomous driving, medical diagnosis, credit card frauds, etc. This paper proposes TSI-GAN, an unsupervised anomaly detection model for time-series that can learn complex temporal patterns automatically. We evaluate TSI-GAN using 250 well-curated and harder-than-usual datasets and compare with 8 state-of-the-art baseline methods.
arXiv Detail & Related papers (2023-03-22T23:24:47Z)
Mutual Exclusivity Training and Primitive Augmentation to Induce Compositionality [84.94877848357896]
Recent datasets expose the lack of the systematic generalization ability in standard sequence-to-sequence models. We analyze this behavior of seq2seq models and identify two contributing factors: a lack of mutual exclusivity bias and the tendency to memorize whole examples. We show substantial empirical improvements using standard sequence-to-sequence models on two widely-used compositionality datasets.
arXiv Detail & Related papers (2022-11-28T17:36:41Z)
Tripletformer for Probabilistic Interpolation of Irregularly sampled Time Series [6.579888565581481]
We present a novel encoder-decoder architecture called "Tripletformer" for probabilistic of irregularly sampled time series with missing values. This attention-based model operates on sets of observations, where each element is composed of a triple time, channel, and value. Results indicate an improvement in negative loglikelihood error by up to 32% on real-world datasets and 85% on synthetic datasets.
arXiv Detail & Related papers (2022-10-05T08:31:05Z)
Stacked Residuals of Dynamic Layers for Time Series Anomaly Detection [0.0]
We present an end-to-end differentiable neural network architecture to perform anomaly detection in multivariate time series. The architecture is a cascade of dynamical systems designed to separate linearly predictable components of the signal. The anomaly detector exploits the temporal structure of the prediction residuals to detect both isolated point anomalies and set-point changes.
arXiv Detail & Related papers (2022-02-25T01:50:22Z)
Anomaly Detection of Time Series with Smoothness-Inducing Sequential Variational Auto-Encoder [59.69303945834122]
We present a Smoothness-Inducing Sequential Variational Auto-Encoder (SISVAE) model for robust estimation and anomaly detection of time series. Our model parameterizes mean and variance for each time-stamp with flexible neural networks. We show the effectiveness of our model on both synthetic datasets and public real-world benchmarks.
arXiv Detail & Related papers (2021-02-02T06:15:15Z)
Robust Finite Mixture Regression for Heterogeneous Targets [70.19798470463378]
We propose an FMR model that finds sample clusters and jointly models multiple incomplete mixed-type targets simultaneously. We provide non-asymptotic oracle performance bounds for our model under a high-dimensional learning framework. The results show that our model can achieve state-of-the-art performance.
arXiv Detail & Related papers (2020-10-12T03:27:07Z)
Learning from Irregularly-Sampled Time Series: A Missing Data Perspective [18.493394650508044]
Irregularly-sampled time series occur in many domains including healthcare. We model irregularly-sampled time series data as a sequence of index-value pairs sampled from a continuous but unobserved function. We propose learning methods for this framework based on variational autoencoders and generative adversarial networks.
arXiv Detail & Related papers (2020-08-17T20:01:55Z)
Unsupervised Online Anomaly Detection On Irregularly Sampled Or Missing Valued Time-Series Data Using LSTM Networks [0.0]
We study anomaly detection and introduce an algorithm that processes variable length, irregularly sampled sequences or sequences with missing values. Our algorithm is fully unsupervised, however, can be readily extended to supervised or semisupervised cases.
arXiv Detail & Related papers (2020-05-25T09:41:04Z)
Convolutional Tensor-Train LSTM for Spatio-temporal Learning [116.24172387469994]
We propose a higher-order LSTM model that can efficiently learn long-term correlations in the video sequence. This is accomplished through a novel tensor train module that performs prediction by combining convolutional features across time. Our results achieve state-of-the-art performance-art in a wide range of applications and datasets.
arXiv Detail & Related papers (2020-02-21T05:00:01Z)

This list is automatically generated from the titles and abstracts of the papers in this site.