Information Theory Inspired Pattern Analysis for Time-series Data
- URL: http://arxiv.org/abs/2302.11654v2
- Date: Fri, 28 Apr 2023 13:57:36 GMT
- Title: Information Theory Inspired Pattern Analysis for Time-series Data
- Authors: Yushan Huang, Yuchen Zhao, Alexander Capstick, Francesca Palermo,
Hamed Haddadi, Payam Barnaghi
- Abstract summary: We propose a highly generalizable method that uses information theory-based features to identify and learn from patterns in time-series data.
For applications with state transitions, features are developed based on the Shannon entropy, entropy rate, and von Neumann entropy of Markov chains.
The results show the proposed information theory-based features improve the recall rate, F1 score, and accuracy by up to 23.01% on average compared with the baseline models.
- Score: 60.86880787242563
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Current methods for pattern analysis in time series mainly rely on
statistical features or probabilistic learning and inference methods to
identify patterns and trends in the data. Such methods do not generalize well
when applied to multivariate, multi-source, state-varying, and noisy
time-series data. To address these issues, we propose a highly generalizable
method that uses information theory-based features to identify and learn from
patterns in multivariate time-series data. To demonstrate the proposed
approach, we analyze pattern changes in human activity data. For applications
with stochastic state transitions, features are developed based on the Shannon
entropy, entropy rate, entropy production, and von Neumann entropy of Markov
chains. For applications where
state modeling is not applicable, we utilize five entropy variants, including
approximate entropy, increment entropy, dispersion entropy, phase entropy, and
slope entropy. The results show the proposed information theory-based features
improve the recall rate, F1 score, and accuracy by up to 23.01% on average
compared with the baseline models, while using a simpler model structure with
an average reduction of 18.75 times in the number of model parameters.
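The paper's own code is not reproduced in this listing, but the Markov-chain features named in the abstract can be illustrated with a short sketch. The Python snippet below is our own illustration under simplifying assumptions (hypothetical helper names such as `markov_entropy_features`; the von Neumann entropy is omitted because its exact construction is not specified in the abstract): it estimates a transition matrix from a discrete state sequence and derives the Shannon entropy of the stationary distribution, the entropy rate, and the entropy production.

```python
# Minimal sketch (not the authors' code): Markov-chain entropy features
# estimated from a discrete state sequence labeled 0..n_states-1.
import numpy as np

def transition_matrix(states, n_states):
    """Estimate a row-stochastic transition matrix from a state sequence."""
    counts = np.zeros((n_states, n_states))
    for a, b in zip(states[:-1], states[1:]):
        counts[a, b] += 1
    rows = counts.sum(axis=1, keepdims=True)
    rows[rows == 0] = 1.0            # avoid division by zero for unseen states
    return counts / rows

def stationary_distribution(P):
    """Left eigenvector of P for the largest eigenvalue, normalised to sum to 1."""
    vals, vecs = np.linalg.eig(P.T)
    pi = np.abs(np.real(vecs[:, np.argmax(np.real(vals))]))
    return pi / pi.sum()

def markov_entropy_features(states, n_states, eps=1e-12):
    P = transition_matrix(states, n_states)
    pi = stationary_distribution(P)
    shannon = -np.sum(pi * np.log(pi + eps))                           # H(pi)
    rate = -np.sum(pi[:, None] * P * np.log(P + eps))                  # entropy rate
    flux = pi[:, None] * P                                             # pi_i * P_ij
    production = np.sum(flux * np.log((flux + eps) / (flux.T + eps)))  # entropy production
    return {"shannon": shannon, "rate": rate, "production": production}

# Example: features for a toy 3-state activity sequence
seq = np.random.default_rng(0).integers(0, 3, size=500)
print(markov_entropy_features(seq, n_states=3))
```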
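For the setting where state modeling is not applicable, one of the five entropy variants listed in the abstract, approximate entropy, can be sketched as follows. This is a generic textbook implementation with common defaults (m = 2, r = 0.2 × std of the signal), not necessarily the authors' parameter choices.

```python
# Minimal sketch (not the authors' implementation) of approximate entropy
# (Pincus-style) for a univariate signal.
import numpy as np

def approximate_entropy(x, m=2, r=None):
    """ApEn(m, r) of a 1-D signal using the Chebyshev distance."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    if r is None:
        r = 0.2 * np.std(x)          # common rule of thumb

    def phi(m):
        # Embed the signal into overlapping windows of length m
        windows = np.array([x[i:i + m] for i in range(n - m + 1)])
        # Chebyshev distance between every pair of windows
        dists = np.max(np.abs(windows[:, None, :] - windows[None, :, :]), axis=-1)
        # Fraction of windows within tolerance r (self-matches included)
        counts = np.mean(dists <= r, axis=1)
        return np.mean(np.log(counts))

    return phi(m) - phi(m + 1)

# Example: ApEn of a noisy sine wave
t = np.linspace(0, 10, 400)
signal = np.sin(2 * np.pi * t) + 0.1 * np.random.default_rng(1).normal(size=t.size)
print(approximate_entropy(signal))
```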
Related papers
- Convergence of Score-Based Discrete Diffusion Models: A Discrete-Time Analysis [56.442307356162864]
We study the theoretical aspects of score-based discrete diffusion models under the Continuous Time Markov Chain (CTMC) framework.
We introduce a discrete-time sampling algorithm in the general state space $[S]^d$ that utilizes score estimators at predefined time points.
Our convergence analysis employs a Girsanov-based method and establishes key properties of the discrete score function.
arXiv Detail & Related papers (2024-10-03T09:07:13Z) - High Dimensional Time Series Regression Models: Applications to Statistical Learning Methods [0.0]
These lecture notes provide an overview of existing methodologies and recent developments for estimation and inference with high dimensional time series regression models.
First, we present the main limit theory results for high dimensional dependent data, which are relevant to covariance matrix structures as well as to dependent time series sequences.
arXiv Detail & Related papers (2023-08-27T15:53:31Z) - Score-based Continuous-time Discrete Diffusion Models [102.65769839899315]
We extend diffusion models to discrete variables by introducing a Markov jump process where the reverse process denoises via a continuous-time Markov chain.
We show that an unbiased estimator can be obtained by simply matching the conditional marginal distributions.
We demonstrate the effectiveness of the proposed method on a set of synthetic and real-world music and image benchmarks.
arXiv Detail & Related papers (2022-11-30T05:33:29Z) - Learning from aggregated data with a maximum entropy model [73.63512438583375]
We show how a new model, similar to a logistic regression, may be learned from aggregated data alone by approximating the unobserved feature distribution with a maximum entropy hypothesis.
We present empirical evidence on several public datasets that the model learned this way can achieve performance comparable to that of a logistic model trained on the full unaggregated data.
arXiv Detail & Related papers (2022-10-05T09:17:27Z) - Wasserstein multivariate auto-regressive models for modeling distributional time series [0.0]
We propose a new auto-regressive model for the statistical analysis of multivariate distributional time series.
Results on the existence, uniqueness and stationarity of the solution of such a model are provided.
To shed some light on the benefits of our approach for real data analysis, we also apply this methodology to a data set of age distributions observed in different countries.
arXiv Detail & Related papers (2022-07-12T10:18:36Z) - Markov Chain Monte Carlo for Continuous-Time Switching Dynamical Systems [26.744964200606784]
We propose a novel inference algorithm utilizing a Markov Chain Monte Carlo approach.
The presented Gibbs sampler makes it possible to efficiently obtain samples from the exact continuous-time posterior processes.
arXiv Detail & Related papers (2022-05-18T09:03:00Z) - Learning Summary Statistics for Bayesian Inference with Autoencoders [58.720142291102135]
We use the inner dimension of deep neural network-based autoencoders as summary statistics.
To create an incentive for the encoder to encode all the parameter-related information but not the noise, we give the decoder access to explicit or implicit information that has been used to generate the training data.
arXiv Detail & Related papers (2022-01-28T12:00:31Z) - Markov Modeling of Time-Series Data using Symbolic Analysis [8.522582405896653]
We will review the different techniques for discretization and memory estimation for discrete processes.
We will present some results from the literature on partitioning from dynamical systems theory and order estimation using concepts of information theory and statistical learning.
arXiv Detail & Related papers (2021-03-20T20:31:21Z) - Time Adaptive Gaussian Model [0.913755431537592]
Our model is a generalization of state-of-the-art methods for the inference of temporal graphical models.
It performs pattern recognition by clustering data points in time, and it finds probabilistic (and possibly causal) relationships among the observed variables.
arXiv Detail & Related papers (2021-02-02T00:28:14Z) - Variational Hyper RNN for Sequence Modeling [69.0659591456772]
We propose a novel probabilistic sequence model that excels at capturing high variability in time series data.
Our method uses temporal latent variables to capture information about the underlying data pattern.
The efficacy of the proposed method is demonstrated on a range of synthetic and real-world sequential data.
arXiv Detail & Related papers (2020-02-24T19:30:32Z)