Fast Classification of Large Time Series Datasets
- URL: http://arxiv.org/abs/2312.06029v1
- Date: Sun, 10 Dec 2023 22:56:09 GMT
- Title: Fast Classification of Large Time Series Datasets
- Authors: Muhammad Marwan Muhammad Fuad
- Abstract summary: Time series classification (TSC) is the most important task in time series mining.
With the ever-increasing size of time series datasets, several traditional TSC methods are no longer efficient enough.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Time series classification (TSC) is the most important task in time series
mining, with applications in medicine, meteorology, finance, cyber security,
and many other fields. With the ever-increasing size of time series datasets,
several traditional TSC methods are no longer efficient enough to perform this
task on such very large datasets. Yet most recent work on TSC focuses mainly on
accuracy, using methods such as deep learning that require extensive
computational resources and cannot be applied efficiently to very large
datasets. The method we introduce in this paper targets these very large time
series datasets, with efficiency as its main objective. We achieve this through
a simplified representation of the time series, enhanced by a distance measure
that considers only some of the values of the represented time series. The
result of this combination is a very efficient representation method for TSC.
We tested it experimentally against another time series method that is
particularly popular for its efficiency. The experiments show that our method
is not only 4 times faster on average, but also superior in classification
accuracy, giving better results on 24 of the 29 tested time series datasets.
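The abstract describes the method only at a high level: a simplified representation of each series combined with a distance that uses only some of the represented values. As a rough illustration only, the sketch below pairs a piecewise aggregate approximation (PAA) with a distance over the k largest per-segment gaps and 1-NN classification; the paper does not specify these choices, so every function here (`paa`, `partial_distance`, `classify_1nn`) and its parameters are assumptions, not the authors' algorithm.

```python
def paa(series, n_segments):
    # Piecewise Aggregate Approximation: replace each equal-width chunk
    # of the series by its mean -- a "simplified representation".
    n = len(series)
    means = []
    for i in range(n_segments):
        lo = i * n // n_segments
        hi = (i + 1) * n // n_segments
        chunk = series[lo:hi]
        means.append(sum(chunk) / len(chunk))
    return means

def partial_distance(a, b, k):
    # Distance computed over only the k largest per-segment gaps --
    # one possible reading of "considers only some of the values".
    gaps = sorted((x - y) ** 2 for x, y in zip(a, b))
    return sum(gaps[-k:]) ** 0.5

def classify_1nn(query, train_X, train_y, n_segments=8, k=4):
    # 1-nearest-neighbour classification on the reduced representation.
    q = paa(query, n_segments)
    best = min(range(len(train_X)),
               key=lambda i: partial_distance(q, paa(train_X[i], n_segments), k))
    return train_y[best]
```

The efficiency gain in such schemes comes from comparing short segment vectors (here 8 values) instead of full-length series, and from summing only k of the segment gaps per comparison.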
Related papers
- Chronos: Learning the Language of Time Series [79.38691251254173]
Chronos is a framework for pretrained probabilistic time series models.
We show that Chronos models can leverage time series data from diverse domains to improve zero-shot accuracy on unseen forecasting tasks.
arXiv Detail & Related papers (2024-03-12T16:53:54Z)
- A Weighted K-Center Algorithm for Data Subset Selection [70.49696246526199]
Subset selection is a fundamental problem that can play a key role in identifying smaller portions of the training data.
We develop a novel factor 3-approximation algorithm to compute subsets based on the weighted sum of both k-center and uncertainty sampling objective functions.
arXiv Detail & Related papers (2023-12-17T04:41:07Z)
- Pushing the Limits of Pre-training for Time Series Forecasting in the CloudOps Domain [54.67888148566323]
We introduce three large-scale time series forecasting datasets from the cloud operations domain.
We show it is a strong zero-shot baseline and benefits from further scaling, both in model and dataset size.
Accompanying these datasets and results is a suite of comprehensive benchmark results comparing classical and deep learning baselines to our pre-trained method.
arXiv Detail & Related papers (2023-10-08T08:09:51Z)
- Scalable Batch Acquisition for Deep Bayesian Active Learning [70.68403899432198]
In deep active learning, it is important to choose multiple examples to mark up at each step.
Existing solutions to this problem, such as BatchBALD, have significant limitations in selecting a large number of examples.
We present the Large BatchBALD algorithm, which aims to achieve comparable quality while being more computationally efficient.
arXiv Detail & Related papers (2023-01-13T11:45:17Z)
- Enhancing Transformer Efficiency for Multivariate Time Series Classification [12.128991867050487]
We propose a methodology to investigate the relationship between model efficiency and accuracy, as well as its complexity.
Comprehensive experiments on benchmark MTS datasets illustrate the effectiveness of our method.
arXiv Detail & Related papers (2022-03-28T03:25:19Z)
- Compactness Score: A Fast Filter Method for Unsupervised Feature Selection [66.84571085643928]
We propose a fast unsupervised feature selection method, named as, Compactness Score (CSUFS) to select desired features.
Our proposed algorithm appears to be more accurate and efficient than existing algorithms.
arXiv Detail & Related papers (2022-01-31T13:01:37Z)
- The FreshPRINCE: A Simple Transformation Based Pipeline Time Series Classifier [0.0]
We look at whether the complexity of the algorithms considered state of the art is really necessary.
Often the first approach suggested is a simple pipeline of summary statistics or other time series feature extraction approaches.
We test these approaches on the UCR time series dataset archive, looking to see if TSC literature has overlooked the effectiveness of these approaches.
arXiv Detail & Related papers (2022-01-28T11:23:58Z)
- Robust Augmentation for Multivariate Time Series Classification [20.38907456958682]
We show that the simple methods of cutout, cutmix, mixup, and window warp improve the robustness and overall performance.
We show that the InceptionTime network with augmentation improves accuracy by 1% to 45% in 18 different datasets.
arXiv Detail & Related papers (2022-01-27T18:57:49Z)
- Towards Similarity-Aware Time-Series Classification [51.2400839966489]
We study time-series classification (TSC), a fundamental task of time-series data mining.
We propose Similarity-Aware Time-Series Classification (SimTSC), a framework that models similarity information with graph neural networks (GNNs).
arXiv Detail & Related papers (2022-01-05T02:14:57Z)
- TSAX is Trending [0.0]
Symbolic Aggregate approXimation (SAX) is one of the most popular representation methods of time series data.
We present a new modification of SAX that only adds minimal complexity to SAX, but substantially improves its performance in time series classification.
arXiv Detail & Related papers (2021-12-24T02:34:50Z)
- Interpretable Time Series Classification using Linear Models and Multi-resolution Multi-domain Symbolic Representations [6.6147550436077776]
We propose new time series classification algorithms to address gaps in current approaches.
Our approach is based on symbolic representations of time series, efficient sequence mining algorithms and linear classification models.
Our models are as accurate as deep learning models but are more efficient regarding running time and memory, can work with variable-length time series and can be interpreted by highlighting the discriminative symbolic features on the original time series.
arXiv Detail & Related papers (2020-05-31T15:32:08Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.