Evaluation of k-means time series clustering based on z-normalization
and NP-Free
- URL: http://arxiv.org/abs/2401.15773v1
- Date: Sun, 28 Jan 2024 21:23:13 GMT
- Title: Evaluation of k-means time series clustering based on z-normalization
and NP-Free
- Authors: Ming-Chang Lee, Jia-Chun Lin, and Volker Stolz
- Abstract summary: This paper conducts a thorough performance evaluation of k-means time series clustering on real-world open-source time series datasets.
The evaluation focuses on two distinct normalization techniques: z-normalization and NP-Free.
The primary objective of this paper is to assess the impact of these two normalization techniques on k-means time series clustering in terms of its clustering quality.
- Score: 0.5898893619901381
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Despite the widespread use of k-means time series clustering in various
domains, there exists a gap in the literature regarding its comprehensive
evaluation with different time series normalization approaches. This paper
seeks to fill this gap by conducting a thorough performance evaluation of
k-means time series clustering on real-world open-source time series datasets.
The evaluation focuses on two distinct normalization techniques:
z-normalization and NP-Free. The former is one of the most commonly used
normalization approaches for time series. The latter is a real-time time series
representation approach, which can serve as a time series normalization
approach. The primary objective of this paper is to assess the impact of these
two normalization techniques on k-means time series clustering in terms of its
clustering quality. The experiments employ the silhouette score, a
well-established metric for evaluating the quality of clusters in a dataset. By
systematically investigating the performance of k-means time series clustering
with these two normalization techniques, this paper addresses the current gap
in k-means time series clustering evaluation and contributes valuable insights
to the development of time series clustering.
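The pipeline the abstract describes (normalize each series, cluster with k-means, score the result with the silhouette coefficient) can be sketched in a few lines. The following is a minimal illustration under stated assumptions, not the authors' implementation: it covers only z-normalization (NP-Free is not reproduced here), uses Euclidean distance on equal-length series, and replaces random initialization with a deterministic farthest-first choice so the result is repeatable. The toy data and all function names are hypothetical.

```python
import math


def z_normalize(series):
    """Rescale one series to zero mean and unit (population) standard deviation."""
    n = len(series)
    mean = sum(series) / n
    std = math.sqrt(sum((x - mean) ** 2 for x in series) / n)
    if std == 0.0:
        # A constant series has no shape to preserve; map it to all zeros.
        return [0.0] * n
    return [(x - mean) / std for x in series]


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def kmeans(series_list, k, iterations=50):
    """Lloyd's k-means on equal-length series; returns one label per series."""
    # Assumption: deterministic farthest-first initialization instead of random seeds.
    centroids = [list(series_list[0])]
    while len(centroids) < k:
        farthest = max(series_list,
                       key=lambda s: min(euclidean(s, c) for c in centroids))
        centroids.append(list(farthest))
    labels = [0] * len(series_list)
    for _ in range(iterations):
        # Assignment step: nearest centroid by Euclidean distance.
        labels = [min(range(k), key=lambda c: euclidean(s, centroids[c]))
                  for s in series_list]
        # Update step: each centroid becomes the pointwise mean of its members.
        for c in range(k):
            members = [s for s, lab in zip(series_list, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def silhouette_score(series_list, labels):
    """Mean silhouette coefficient over all series (between -1 and 1)."""
    clusters = {}
    for idx, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(idx)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")
    total = 0.0
    for i, lab in enumerate(labels):
        own = [j for j in clusters[lab] if j != i]
        if not own:
            continue  # a singleton cluster contributes 0 by convention
        # a: mean distance to the other members of the same cluster.
        a = sum(euclidean(series_list[i], series_list[j]) for j in own) / len(own)
        # b: smallest mean distance to the members of any other cluster.
        b = min(
            sum(euclidean(series_list[i], series_list[j]) for j in members) / len(members)
            for other, members in clusters.items() if other != lab
        )
        if max(a, b) > 0.0:
            total += (b - a) / max(a, b)
    return total / len(labels)


# Hypothetical toy data: three rising and three falling series at different scales.
raw = [
    [1, 2, 3, 4, 5],
    [10, 20, 30, 40, 50],
    [100, 210, 290, 405, 500],
    [5, 4, 3, 2, 1],
    [50, 40, 30, 20, 10],
    [500, 395, 310, 190, 100],
]
normalized = [z_normalize(s) for s in raw]
labels = kmeans(normalized, k=2)
score = silhouette_score(normalized, labels)
print(labels, round(score, 3))
```

Because z-normalization removes offset and scale, the rising and falling series in this toy set become comparable by shape alone. A silhouette score close to 1 indicates compact, well-separated clusters, while values near 0 or below suggest overlapping or misassigned clusters.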
Related papers
- ABCDE: Application-Based Cluster Diff Evals [49.1574468325115]
It aims to be practical: it allows items to have associated importance values that are application-specific, it is frugal in its use of human judgements when determining which clustering is better, and it can report metrics for arbitrary slices of items.
The approach to measuring the delta in the clustering quality is novel: instead of trying to construct an expensive ground truth up front and evaluating each clustering with respect to that, ABCDE samples questions for judgement on the basis of the actual diffs between the clusterings.
arXiv Detail & Related papers (2024-07-31T08:29:35Z)
- Clustering of timed sequences -- Application to the analysis of care pathways [0.0]
Revealing typical care pathways can be achieved through clustering.
The difficulty in clustering care pathways, represented by sequences of timestamped events, lies in defining a semantically appropriate metric and clustering algorithms.
arXiv Detail & Related papers (2024-04-23T07:16:13Z)
- Rethinking k-means from manifold learning perspective [122.38667613245151]
We present a new clustering algorithm which directly detects clusters of data without mean estimation.
Specifically, we construct a distance matrix between data points using a Butterworth filter.
To better exploit the complementary information embedded in different views, we leverage the tensor Schatten p-norm regularization.
arXiv Detail & Related papers (2023-05-12T03:01:41Z)
- Robust Detection of Lead-Lag Relationships in Lagged Multi-Factor Models [61.10851158749843]
Key insights can be obtained by discovering lead-lag relationships inherent in the data.
We develop a clustering-driven methodology for robust detection of lead-lag relationships in lagged multi-factor models.
arXiv Detail & Related papers (2023-05-11T10:30:35Z)
- Fuzzy clustering of ordinal time series based on two novel distances
with economic applications [0.12891210250935145]
Two novel distances between ordinal time series are introduced and used to construct fuzzy clustering procedures.
The resulting clustering algorithms are computationally efficient and able to group series generated from similar processes.
Two specific applications involving economic time series illustrate the usefulness of the proposed approaches.
arXiv Detail & Related papers (2023-04-24T16:39:22Z)
- Cluster-and-Conquer: A Framework For Time-Series Forecasting [94.63501563413725]
We propose a three-stage framework for forecasting high-dimensional time-series data.
Our framework is highly general, allowing for any time-series forecasting and clustering method to be used in each step.
When instantiated with simple linear autoregressive models, we are able to achieve state-of-the-art results on several benchmark datasets.
arXiv Detail & Related papers (2021-10-26T20:41:19Z)
- Novel Features for Time Series Analysis: A Complex Networks Approach [62.997667081978825]
Time series data are ubiquitous in several domains, such as climate, economics and health care.
A recent conceptual approach relies on mapping time series to complex networks.
Network analysis can be used to characterize different types of time series.
arXiv Detail & Related papers (2021-10-11T13:46:28Z)
- You Never Cluster Alone [150.94921340034688]
We extend the mainstream contrastive learning paradigm to a cluster-level scheme, where all the data assigned to the same cluster contribute to a unified representation.
We define a set of categorical variables as clustering assignment confidence, which links the instance-level learning track with the cluster-level one.
By reparametrizing the assignment variables, TCC is trained end-to-end, requiring no alternating steps.
arXiv Detail & Related papers (2021-06-03T14:59:59Z)
- From Time Series to Euclidean Spaces: On Spatial Transformations for
Temporal Clustering [5.220940151628734]
We show that neither traditional clustering methods, time-series-specific methods, nor even deep learning-based alternatives generalise well when both varying sampling rates and high dimensionality are present in the input data.
We propose a novel approach to temporal clustering, in which we transform the input time series into a distance-based projected representation.
arXiv Detail & Related papers (2020-10-02T09:08:16Z)
- Autoencoder-based time series clustering with energy applications [0.0]
Time series clustering is a challenging task due to the specific nature of the data.
In this paper we investigate the combination of a convolutional autoencoder and a k-medoids algorithm to perform time series clustering.
arXiv Detail & Related papers (2020-02-10T10:04:29Z)
This list is automatically generated from the titles and abstracts of the papers on this site.