Distribution Agnostic Symbolic Representations for Time Series
Dimensionality Reduction and Online Anomaly Detection
- URL: http://arxiv.org/abs/2105.09592v1
- Date: Thu, 20 May 2021 08:35:50 GMT
- Title: Distribution Agnostic Symbolic Representations for Time Series
Dimensionality Reduction and Online Anomaly Detection
- Authors: Konstantinos Bountrogiannis, George Tzagkarakis, Panagiotis Tsakalides
- Abstract summary: This paper proposes two novel data-driven SAX-based symbolic representations, distinguished by their discretization steps.
The proposed representations possess all the attractive properties of the conventional SAX method.
- Score: 8.00114449574708
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Due to the importance of the lower bounding distances and the attractiveness
of symbolic representations, the family of symbolic aggregate approximations
(SAX) has been used extensively for encoding time series data. However, typical
SAX-based methods rely on two restrictive assumptions: the Gaussian
distribution and equiprobable symbols. This paper proposes two novel
data-driven SAX-based symbolic representations, distinguished by their
discretization steps. The first representation, oriented for general data
compaction and indexing scenarios, is based on the combination of kernel
density estimation and Lloyd-Max quantization to minimize the information loss
and mean squared error in the discretization step. The second method, oriented
for high-level mining tasks, employs the Mean-Shift clustering method and is
shown to enhance anomaly detection in the lower-dimensional space. In addition,
we theoretically verify a previously observed phenomenon in which the intrinsic
process yields a lower-than-expected variance of the intermediate
piecewise aggregate approximation. This phenomenon causes an additional
information loss but can be avoided with a simple modification. The proposed
representations possess all the attractive properties of the conventional SAX
method. Furthermore, experimental evaluation on real-world datasets
demonstrates their superiority compared to the traditional SAX and an
alternative data-driven SAX variant.
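The discretization step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the paper's first representation combines kernel density estimation with Lloyd-Max quantization, whereas this sketch, as a simplifying assumption, runs the Lloyd iteration directly on the PAA samples. All function names (`znorm`, `paa`, `gaussian_breakpoints`, `lloyd_max`, `symbolize`) and the synthetic exponential series are hypothetical and chosen for illustration only.

```python
from statistics import NormalDist

import numpy as np


def znorm(x):
    """Z-normalize a series to zero mean and unit variance."""
    x = np.asarray(x, dtype=float)
    return (x - x.mean()) / x.std()


def paa(x, n_segments):
    """Piecewise aggregate approximation: mean of equal-length segments.

    Assumes len(x) is divisible by n_segments.
    """
    x = np.asarray(x, dtype=float)
    return x.reshape(n_segments, -1).mean(axis=1)


def gaussian_breakpoints(alphabet_size):
    """Conventional SAX breakpoints: equiprobable cuts of N(0, 1)."""
    nd = NormalDist()
    return np.array(
        [nd.inv_cdf(k / alphabet_size) for k in range(1, alphabet_size)]
    )


def lloyd_max(samples, n_levels, n_iter=100):
    """Sample-based Lloyd iteration for a scalar quantizer.

    Alternates between midpoint thresholds and cell centroids,
    which does not increase the mean squared error on the samples.
    """
    samples = np.asarray(samples, dtype=float)
    # Initialize the reconstruction levels at equally spaced quantiles.
    levels = np.quantile(samples, (np.arange(n_levels) + 0.5) / n_levels)
    for _ in range(n_iter):
        thresholds = (levels[:-1] + levels[1:]) / 2
        cells = np.searchsorted(thresholds, samples)
        new_levels = levels.copy()
        for k in range(n_levels):
            members = samples[cells == k]
            if members.size:  # keep the old level if a cell is empty
                new_levels[k] = members.mean()
        if np.allclose(new_levels, levels):
            break
        levels = new_levels
    thresholds = (levels[:-1] + levels[1:]) / 2
    return thresholds, levels


def symbolize(values, breakpoints):
    """Map each value to the index of the interval it falls in."""
    return np.searchsorted(breakpoints, values)


# Skewed (non-Gaussian) synthetic data, for illustration only.
rng = np.random.default_rng(0)
series = znorm(rng.exponential(size=1024))
segments = paa(series, 64)

# Conventional SAX word versus a data-driven word over the same PAA.
sax_word = symbolize(segments, gaussian_breakpoints(4))
thresholds, levels = lloyd_max(segments, 4)
data_word = symbolize(segments, thresholds)

# The PAA of a z-normalized series has variance below 1.
paa_variance = segments.var()
```

In this sketch `sax_word` uses fixed Gaussian breakpoints, while `data_word` adapts its thresholds to the observed PAA values. `paa_variance` falls below the unit variance of the normalized series, which is in line with the variance reduction the abstract mentions; the modification the paper proposes to avoid it is not reproduced here.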
Related papers
- Distributed Markov Chain Monte Carlo Sampling based on the Alternating
Direction Method of Multipliers [143.6249073384419]
In this paper, we propose a distributed sampling scheme based on the alternating direction method of multipliers.
We provide both theoretical guarantees of our algorithm's convergence and experimental evidence of its superiority to the state-of-the-art.
In simulation, we deploy our algorithm on linear and logistic regression tasks and illustrate its fast convergence compared to existing gradient-based methods.
arXiv Detail & Related papers (2024-01-29T02:08:40Z) - Anomaly Detection Under Uncertainty Using Distributionally Robust
Optimization Approach [0.9217021281095907]
Anomaly detection is defined as the problem of finding data points that do not follow the patterns of the majority.
The one-class Support Vector Machines (SVM) method aims to find a decision boundary to distinguish between normal data points and anomalies.
A distributionally robust chance-constrained model is proposed in which the probability of misclassification is low.
arXiv Detail & Related papers (2023-12-03T06:13:22Z) - Fast Estimation of Bayesian State Space Models Using Amortized
Simulation-Based Inference [0.0]
This paper presents a fast algorithm for estimating hidden states of Bayesian state space models.
After pretraining, finding the posterior distribution for any dataset takes from hundredths to tenths of a second.
arXiv Detail & Related papers (2022-10-13T16:37:05Z) - Federated Representation Learning via Maximal Coding Rate Reduction [109.26332878050374]
We propose a methodology to learn low-dimensional representations from a dataset that is distributed among several clients.
Our proposed method, which we refer to as FLOW, utilizes MCR2 as the objective of choice, hence resulting in representations that are both between-class discriminative and within-class compressible.
arXiv Detail & Related papers (2022-10-01T15:43:51Z) - Interpolation-based Correlation Reduction Network for Semi-Supervised
Graph Learning [49.94816548023729]
We propose a novel graph contrastive learning method, termed Interpolation-based Correlation Reduction Network (ICRN).
In our method, we improve the discriminative capability of the latent feature by enlarging the margin of decision boundaries.
By combining the two settings, we extract rich supervision information from both the abundant unlabeled nodes and the rare yet valuable labeled nodes for discriminative representation learning.
arXiv Detail & Related papers (2022-06-06T14:26:34Z) - Hyperspectral Image Denoising Using Non-convex Local Low-rank and Sparse
Separation with Spatial-Spectral Total Variation Regularization [49.55649406434796]
We propose a novel non-convex approach to robust principal component analysis for HSI denoising.
We develop accurate approximations to both rank and sparse components.
Experiments on both simulated and real HSIs demonstrate the effectiveness of the proposed method.
arXiv Detail & Related papers (2022-01-08T11:48:46Z) - Riemannian classification of EEG signals with missing values [67.90148548467762]
This paper proposes two strategies to handle missing data for the classification of electroencephalograms.
The first approach estimates the covariance from imputed data with the $k$-nearest neighbors algorithm; the second relies on the observed data by leveraging the observed-data likelihood within an expectation-maximization algorithm.
As the results show, the proposed strategies perform better than classification based on observed data and maintain high accuracy even when the missing data ratio increases.
arXiv Detail & Related papers (2021-10-19T14:24:50Z) - Manifold learning-based polynomial chaos expansions for high-dimensional
surrogate models [0.0]
We introduce a manifold learning-based method for uncertainty quantification (UQ) in describing systems.
The proposed method is able to achieve highly accurate approximations which ultimately lead to the significant acceleration of UQ tasks.
arXiv Detail & Related papers (2021-07-21T00:24:15Z) - Hierarchical regularization networks for sparsification based learning
on noisy datasets [0.0]
The hierarchy follows from approximation spaces identified at successively finer scales.
To promote model generalization at each scale, we also introduce a novel, projection-based penalty operator across multiple dimensions.
Results show the performance of the approach as a data reduction and modeling strategy on both synthetic and real datasets.
arXiv Detail & Related papers (2020-06-09T18:32:24Z) - Asymptotic Analysis of an Ensemble of Randomly Projected Linear
Discriminants [94.46276668068327]
In [1], an ensemble of randomly projected linear discriminants is used to classify datasets.
We develop a consistent estimator of the misclassification probability as an alternative to the computationally-costly cross-validation estimator.
We also demonstrate the use of our estimator for tuning the projection dimension on both real and synthetic data.
arXiv Detail & Related papers (2020-04-17T12:47:04Z)