Related papers: An efficient aggregation method for the symbolic representation of temporal data

URL: http://arxiv.org/abs/2201.05697v1
Date: Fri, 14 Jan 2022 22:51:24 GMT
Title: An efficient aggregation method for the symbolic representation of temporal data
Authors: Xinye Chen and Stefan Güttel
Abstract summary: We present a new variant of the adaptive Brownian bridge-based aggregation (ABBA) method, called fABBA. This variant utilizes a new aggregation approach tailored to the piecewise representation of time series. In contrast to the original method, the new approach does not require the number of time series symbols to be specified in advance.
Score: 0.0
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Symbolic representations are a useful tool for the dimension reduction of temporal data, allowing for the efficient storage of and information retrieval from time series. They can also enhance the training of machine learning algorithms on time series data through noise reduction and reduced sensitivity to hyperparameters. The adaptive Brownian bridge-based aggregation (ABBA) method is one such effective and robust symbolic representation, demonstrated to accurately capture important trends and shapes in time series. However, in its current form the method struggles to process very large time series. Here we present a new variant of the ABBA method, called fABBA. This variant utilizes a new aggregation approach tailored to the piecewise representation of time series. By replacing the k-means clustering used in ABBA with a sorting-based aggregation technique, and thereby avoiding repeated sum-of-squares error computations, the computational complexity is significantly reduced. In contrast to the original method, the new approach does not require the number of time series symbols to be specified in advance. Through extensive tests we demonstrate that the new method significantly outperforms ABBA with a considerable reduction in runtime while also outperforming the popular SAX and 1d-SAX representations in terms of reconstruction accuracy. We further demonstrate that fABBA can compress other data types such as images.
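
The sorting-based aggregation idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `aggregate_sorted`, the tolerance parameter `alpha`, and the choice to sort by Euclidean norm are assumptions made for illustration. Each piece of the piecewise representation is treated as a 2D point. Points are sorted once, then grouped greedily around starting points, so the number of groups (symbols) falls out of the tolerance rather than being fixed in advance.

```python
import numpy as np

def aggregate_sorted(points, alpha):
    """Greedy sorting-based grouping (illustrative sketch, not the fABBA code).

    points : array-like of shape (n, d), e.g. (length, increment) pairs.
    alpha  : hypothetical tolerance; points within distance alpha of a
             group's starting point join that group.
    Returns an integer label per point, in the original order.
    """
    points = np.asarray(points, dtype=float)
    norms = np.linalg.norm(points, axis=1)
    order = np.argsort(norms, kind="stable")   # a single sort, no iterative refits
    labels = np.full(len(points), -1, dtype=int)
    next_label = 0
    for pos, i in enumerate(order):
        if labels[i] != -1:
            continue                            # already assigned to a group
        labels[i] = next_label                  # i becomes a new starting point
        for j in order[pos + 1:]:
            # Norms are sorted, so once the norm gap exceeds alpha no later
            # point can lie within alpha (reverse triangle inequality).
            if norms[j] - norms[i] > alpha:
                break
            if labels[j] == -1 and np.linalg.norm(points[j] - points[i]) <= alpha:
                labels[j] = next_label
        next_label += 1
    return labels

if __name__ == "__main__":
    # Made-up pieces for demonstration only.
    pieces = [(1.0, 0.1), (1.1, 0.12), (5.0, -2.0), (5.05, -2.1), (1.05, 0.09)]
    print(aggregate_sorted(pieces, alpha=0.5))
```

Unlike k-means, this sketch never recomputes a sum-of-squares error over all points; each starting point only scans forward through the sorted list until the early-exit condition triggers.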

Related papers

A system identification approach to clustering vector autoregressive time series [50.66782357329375]
Clustering time series based on their underlying dynamics continues to attract researchers because of its value in assisting complex system modelling. Most current time series clustering methods handle only scalar time series, treat them as white noise, or rely on domain knowledge for high-quality feature construction. Instead of relying on feature/metric construction, the system identification approach allows treating vector time series clustering by explicitly considering their underlying autoregressive dynamics.
arXiv Detail & Related papers (2025-05-20T14:31:44Z)
LLM-ABBA: Understanding time series via symbolic approximation [0.28675177318965045]
We introduce a method, called LLM-ABBA, that integrates ABBA into large language models for various downstream time series tasks. By symbolizing time series, LLM-ABBA compares favorably to the recent state-of-the-art (SOTA) in UCR and three medical time series classification tasks.
arXiv Detail & Related papers (2024-11-27T16:48:24Z)
Quantized symbolic time series approximation [0.28675177318965045]
We present a new quantization-based ABBA symbolic approximation technique, QABBA. QABBA exhibits improved storage efficiency while retaining the original speed and accuracy of symbolic reconstruction. An application of QABBA with large language models (LLMs) for time series regression is also presented.
arXiv Detail & Related papers (2024-11-20T10:32:22Z)
Fast constrained sampling in pre-trained diffusion models [77.21486516041391]
We propose an algorithm that enables fast and high-quality generation under arbitrary constraints. During inference, we can interchange between gradient updates computed on the noisy image and updates computed on the final, clean image. Our approach produces results that rival or surpass the state-of-the-art training-free inference approaches.
arXiv Detail & Related papers (2024-10-24T14:52:38Z)
An Efficient Algorithm for Clustered Multi-Task Compressive Sensing [60.70532293880842]
Clustered multi-task compressive sensing is a hierarchical model that solves multiple compressive sensing tasks. The existing inference algorithm for this model is computationally expensive and does not scale well in high dimensions. We propose a new algorithm that substantially accelerates model inference by avoiding the need to explicitly compute these covariance matrices.
arXiv Detail & Related papers (2023-09-30T15:57:14Z)
Decreasing the Computing Time of Bayesian Optimization using Generalizable Memory Pruning [56.334116591082896]
We show a wrapper of memory pruning and bounded optimization capable of being used with any surrogate model and acquisition function. Running BO on high-dimensional or massive data sets becomes intractable due to this time complexity. All model implementations are run on the MIT Supercloud state-of-the-art computing hardware.
arXiv Detail & Related papers (2023-09-08T14:05:56Z)
HyperTime: Implicit Neural Representation for Time Series [131.57172578210256]
Implicit neural representations (INRs) have recently emerged as a powerful tool that provides an accurate and resolution-independent encoding of data. In this paper, we analyze the representation of time series using INRs, comparing different activation functions in terms of reconstruction accuracy and training convergence speed. We propose a hypernetwork architecture that leverages INRs to learn a compressed latent representation of an entire time series dataset.
arXiv Detail & Related papers (2022-08-11T14:05:51Z)
COSTI: a New Classifier for Sequences of Temporal Intervals [0.0]
We develop a novel method for classification operating directly on sequences of temporal intervals. The proposed method remains at a high level of accuracy and obtains better performance while avoiding shortcomings connected to operating on transformed data.
arXiv Detail & Related papers (2022-04-28T12:55:06Z)
High-Dimensional Sparse Bayesian Learning without Covariance Matrices [66.60078365202867]
We introduce a new inference scheme that avoids explicit construction of the covariance matrix. Our approach couples a little-known diagonal estimation result from numerical linear algebra with the conjugate gradient algorithm. On several simulations, our method scales better than existing approaches in computation time and memory.
arXiv Detail & Related papers (2022-02-25T16:35:26Z)
Elastic Product Quantization for Time Series [19.839572576189187]
We propose the use of product quantization for efficient similarity-based comparison of time series under time warping. The proposed solution emerges as a highly efficient (both in terms of memory usage and time) replacement for elastic measures in time series applications.
arXiv Detail & Related papers (2022-01-04T09:23:06Z)
MrSQM: Fast Time Series Classification with Symbolic Representations [11.853438514668207]
MrSQM uses multiple symbolic representations and efficient sequence mining to extract important time series features. We study four feature selection approaches on symbolic sequences, ranging from fully supervised, to unsupervised and hybrids. Our experiments on 112 datasets of the UEA/UCR benchmark demonstrate that MrSQM can quickly extract useful features.
arXiv Detail & Related papers (2021-09-02T15:54:46Z)
StreaMRAK a Streaming Multi-Resolution Adaptive Kernel Algorithm [60.61943386819384]
Existing implementations of KRR require that all the data is stored in the main memory. We propose StreaMRAK - a streaming version of KRR. We present a showcase study on two synthetic problems and the prediction of the trajectory of a double pendulum.
arXiv Detail & Related papers (2021-08-23T21:03:09Z)
Covariance-Free Sparse Bayesian Learning [62.24008859844098]
We introduce a new SBL inference algorithm that avoids explicit inversions of the covariance matrix. Our method can be up to thousands of times faster than existing baselines. We showcase how our new algorithm enables SBL to tractably tackle high-dimensional signal recovery problems.
arXiv Detail & Related papers (2021-05-21T16:20:07Z)
ABBA: Adaptive Brownian bridge-based symbolic aggregation of time series [0.0]
A new symbolic representation of time series, called ABBA, is introduced. It is based on an adaptive polygonal chain approximation of the time series into a sequence of tuples. We show that the reconstruction error of this representation can be modelled as a random walk with pinned start and end points.
arXiv Detail & Related papers (2020-03-27T15:30:32Z)

This list is automatically generated from the titles and abstracts of the papers on this site.