Related papers: Evaluation of k-means time series clustering based on z-normalization and NP-Free

Evaluation of k-means time series clustering based on z-normalization and NP-Free

URL: http://arxiv.org/abs/2401.15773v1
Date: Sun, 28 Jan 2024 21:23:13 GMT
Title: Evaluation of k-means time series clustering based on z-normalization and NP-Free
Authors: Ming-Chang Lee, Jia-Chun Lin, and Volker Stolz
Abstract summary: This paper conducts a thorough performance evaluation of k-means time series clustering on real-world open-source time series datasets. The evaluation focuses on two distinct normalization techniques: z-normalization and NP-Free. The primary objective of this paper is to assess the impact of these two normalization techniques on k-means time series clustering in terms of its clustering quality.
Score: 0.5898893619901381
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Despite the widespread use of k-means time series clustering in various domains, there exists a gap in the literature regarding its comprehensive evaluation with different time series normalization approaches. This paper seeks to fill this gap by conducting a thorough performance evaluation of k-means time series clustering on real-world open-source time series datasets. The evaluation focuses on two distinct normalization techniques: z-normalization and NP-Free. The former is one of the most commonly used normalization approach for time series. The latter is a real-time time series representation approach, which can serve as a time series normalization approach. The primary objective of this paper is to assess the impact of these two normalization techniques on k-means time series clustering in terms of its clustering quality. The experiments employ the silhouette score, a well-established metric for evaluating the quality of clusters in a dataset. By systematically investigating the performance of k-means time series clustering with these two normalization techniques, this paper addresses the current gap in k-means time series clustering evaluation and contributes valuable insights to the development of time series clustering.

Related papers

K*-Means: A Parameter-free Clustering Algorithm [55.20132267309382]
k*-means is a novel clustering algorithm that eliminates the need to set k or any other parameters.<n>It uses the minimum description length principle to automatically determine the optimal number of clusters, k*, by splitting and merging clusters.<n>We prove that k*-means is guaranteed to converge and demonstrate experimentally that it significantly outperforms existing methods in scenarios where k is unknown.
arXiv Detail & Related papers (2025-05-17T08:41:07Z)
Ranked differences Pearson correlation dissimilarity with an application to electricity users time series clustering [0.0]
Time series clustering is used in applications such as healthcare, finance, economics, energy, and climate science.<n>We propose a new dissimilarity measure called ranked Pearson correlation dissimilarity (RDPC), which combines a weighted average of a specified fraction of the largest element-wise differences with the well-known Pearson correlation dissimilarity.
arXiv Detail & Related papers (2025-05-04T16:25:12Z)
Adaptive Classification of Interval-Valued Time Series [0.0]
We propose an adaptive approach for interval-valued time series classification. We transform interval-valued time series representations into images based on point-valued time series imaging methods.
arXiv Detail & Related papers (2025-04-04T09:52:40Z)
ABCDE: Application-Based Cluster Diff Evals [49.1574468325115]
It aims to be practical: it allows items to have associated importance values that are application-specific, it is frugal in its use of human judgements when determining which clustering is better, and it can report metrics for arbitrary slices of items. The approach to measuring the delta in the clustering quality is novel: instead of trying to construct an expensive ground truth up front and evaluating the each clustering with respect to that, ABCDE samples questions for judgement on the basis of the actual diffs between the clusterings.
arXiv Detail & Related papers (2024-07-31T08:29:35Z)
Clustering of timed sequences -- Application to the analysis of care pathways [0.0]
Revealing typical care pathways can be achieved through clustering. The difficulty in clustering care pathways, represented by sequences of timestamped events, lies in defining a semantically appropriate metric and clustering algorithms.
arXiv Detail & Related papers (2024-04-23T07:16:13Z)
Instance-Optimal Cluster Recovery in the Labeled Stochastic Block Model [79.46465138631592]
We devise an efficient algorithm that recovers clusters using the observed labels. We present Instance-Adaptive Clustering (IAC), the first algorithm whose performance matches these lower bounds both in expectation and with high probability.
arXiv Detail & Related papers (2023-06-18T08:46:06Z)
Rethinking k-means from manifold learning perspective [122.38667613245151]
We present a new clustering algorithm which directly detects clusters of data without mean estimation. Specifically, we construct distance matrix between data points by Butterworth filter. To well exploit the complementary information embedded in different views, we leverage the tensor Schatten p-norm regularization.
arXiv Detail & Related papers (2023-05-12T03:01:41Z)
Robust Detection of Lead-Lag Relationships in Lagged Multi-Factor Models [61.10851158749843]
Key insights can be obtained by discovering lead-lag relationships inherent in the data. We develop a clustering-driven methodology for robust detection of lead-lag relationships in lagged multi-factor models.
arXiv Detail & Related papers (2023-05-11T10:30:35Z)
Fuzzy clustering of ordinal time series based on two novel distances with economic applications [0.12891210250935145]
Two novel distances between ordinal time series are introduced and used to construct fuzzy clustering procedures. The resulting clustering algorithms are computationally efficient and able to group series generated from similar processes. Two specific applications involving economic time series illustrate the usefulness of the proposed approaches.
arXiv Detail & Related papers (2023-04-24T16:39:22Z)
Cluster-and-Conquer: A Framework For Time-Series Forecasting [94.63501563413725]
We propose a three-stage framework for forecasting high-dimensional time-series data. Our framework is highly general, allowing for any time-series forecasting and clustering method to be used in each step. When instantiated with simple linear autoregressive models, we are able to achieve state-of-the-art results on several benchmark datasets.
arXiv Detail & Related papers (2021-10-26T20:41:19Z)
Novel Features for Time Series Analysis: A Complex Networks Approach [62.997667081978825]
Time series data are ubiquitous in several domains as climate, economics and health care. Recent conceptual approach relies on time series mapping to complex networks. Network analysis can be used to characterize different types of time series.
arXiv Detail & Related papers (2021-10-11T13:46:28Z)
You Never Cluster Alone [150.94921340034688]
We extend the mainstream contrastive learning paradigm to a cluster-level scheme, where all the data subjected to the same cluster contribute to a unified representation. We define a set of categorical variables as clustering assignment confidence, which links the instance-level learning track with the cluster-level one. By reparametrizing the assignment variables, TCC is trained end-to-end, requiring no alternating steps.
arXiv Detail & Related papers (2021-06-03T14:59:59Z)
From Time Series to Euclidean Spaces: On Spatial Transformations for Temporal Clustering [5.220940151628734]
We show that neither traditional clustering methods, time series specific or even deep learning-based alternatives generalise well when both varying sampling rates and high dimensionality are present in the input data. We propose a novel approach to temporal clustering, in which we transform the input time series into a distance-based projected representation.
arXiv Detail & Related papers (2020-10-02T09:08:16Z)
Autoencoder-based time series clustering with energy applications [0.0]
Time series clustering is a challenging task due to the specific nature of the data. In this paper we investigate the combination of a convolutional autoencoder and a k-medoids algorithm to perfom time series clustering.
arXiv Detail & Related papers (2020-02-10T10:04:29Z)

This list is automatically generated from the titles and abstracts of the papers in this site.