Ranked differences Pearson correlation dissimilarity with an application to electricity users time series clustering
- URL: http://arxiv.org/abs/2505.02173v2
- Date: Wed, 07 May 2025 15:39:00 GMT
- Title: Ranked differences Pearson correlation dissimilarity with an application to electricity users time series clustering
- Authors: Chutiphan Charoensuk, Nathakhun Wiroonsri
- Abstract summary: Time series clustering is used in applications such as healthcare, finance, economics, energy, and climate science. We propose a new dissimilarity measure called ranked Pearson correlation dissimilarity (RDPC), which combines a weighted average of a specified fraction of the largest element-wise differences with the well-known Pearson correlation dissimilarity.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Time series clustering is an unsupervised learning method for classifying time series data into groups with similar behavior. It is used in applications such as healthcare, finance, economics, energy, and climate science. Several time series clustering methods have been introduced and used for over four decades. Most of them focus on measuring either Euclidean distances or association dissimilarities between time series. In this work, we propose a new dissimilarity measure called ranked Pearson correlation dissimilarity (RDPC), which combines a weighted average of a specified fraction of the largest element-wise differences with the well-known Pearson correlation dissimilarity. It is incorporated into hierarchical clustering. The performance is evaluated and compared with existing clustering algorithms. The results show that the RDPC algorithm outperforms others in complicated cases involving different seasonal patterns, trends, and peaks. Finally, we demonstrate our method by clustering a random sample of customers from a Thai electricity consumption time series dataset into seven groups with unique characteristics.
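The abstract describes the measure only in words, so the following is a minimal sketch of the general idea rather than the authors' formula. The function names, the linearly decreasing weights, the mixing parameter `alpha`, and the default `fraction` are all assumptions made for illustration. The sketch averages the largest element-wise absolute differences between two equal-length series and blends that value with the Pearson correlation dissimilarity, taken here as 1 minus the correlation.

```python
import math


def pearson_dissimilarity(x, y):
    """Return 1 - r, where r is the Pearson correlation of x and y."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if norm_x == 0 or norm_y == 0:
        raise ValueError("Pearson correlation is undefined for a constant series")
    return 1.0 - cov / (norm_x * norm_y)


def ranked_difference(x, y, fraction=0.2):
    """Weighted average of the largest element-wise absolute differences.

    Assumption: the top `fraction` of differences is kept (at least one),
    and weights decrease linearly with rank. The paper may weight differently.
    """
    diffs = sorted((abs(a - b) for a, b in zip(x, y)), reverse=True)
    k = max(1, math.ceil(fraction * len(diffs)))
    top = diffs[:k]
    weights = [k - i for i in range(k)]  # k for the largest difference, 1 for the k-th
    return sum(w * d for w, d in zip(weights, top)) / sum(weights)


def rdpc_like(x, y, fraction=0.2, alpha=0.5):
    """Hypothetical blend of the two terms; `alpha` is an assumed mixing weight."""
    if len(x) != len(y):
        raise ValueError("series must have the same length")
    if len(x) < 2:
        raise ValueError("series must contain at least two points")
    return (alpha * ranked_difference(x, y, fraction)
            + (1.0 - alpha) * pearson_dissimilarity(x, y))


# Two toy load profiles that differ only by a single peak.
flat_day = [1.0, 1.2, 1.1, 1.3, 1.2, 1.1]
peaky_day = [1.0, 1.2, 4.0, 1.3, 1.2, 1.1]
print(rdpc_like(flat_day, peaky_day, fraction=0.5))
```

Because the ranked-difference term depends on the scale of the data while the correlation term does not, the series would probably need to be normalized first for the blend to be meaningful. A pairwise matrix of such values could then be passed, in condensed form, to a hierarchical clustering routine such as `scipy.cluster.hierarchy.linkage`.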
Related papers
- A system identification approach to clustering vector autoregressive time series [50.66782357329375]
Clustering time series based on their underlying dynamics continues to attract researchers because of its value in assisting complex system modelling.
Most current time series clustering methods handle only scalar time series, treat them as white noise, or rely on domain knowledge for high-quality feature construction.
Instead of relying on feature/metric construction, the system identification approach allows treating vector time series clustering by explicitly considering their underlying autoregressive dynamics.
arXiv Detail & Related papers (2025-05-20T14:31:44Z)
- Evaluation of k-means time series clustering based on z-normalization and NP-Free [0.5898893619901381]
This paper conducts a thorough performance evaluation of k-means time series clustering on real-world open-source time series datasets.
The evaluation focuses on two distinct normalization techniques: z-normalization and NP-Free.
The primary objective of this paper is to assess the impact of these two normalization techniques on k-means time series clustering in terms of its clustering quality.
arXiv Detail & Related papers (2024-01-28T21:23:13Z)
- Fuzzy clustering of circular time series based on a new dependence measure with applications to wind data [2.845817138242963]
Time series clustering is an essential machine learning task with applications in many disciplines.
A distance between circular series is introduced and used to construct a clustering procedure.
A fuzzy approach is adopted, which enables the procedure to locate each series into several clusters with different membership degrees.
arXiv Detail & Related papers (2024-01-26T12:21:57Z)
- Deep Ensembles Meets Quantile Regression: Uncertainty-aware Imputation for Time Series [45.76310830281876]
We propose Quantile Sub-Ensembles, a novel method to estimate uncertainty with an ensemble of quantile-regression-based task networks.
Our method not only produces accurate imputations that are robust to high missing rates, but is also computationally efficient due to the fast training of its non-generative model.
arXiv Detail & Related papers (2023-12-03T05:52:30Z)
- Rethinking k-means from manifold learning perspective [122.38667613245151]
We present a new clustering algorithm which directly detects clusters of data without mean estimation.
Specifically, we construct a distance matrix between data points using a Butterworth filter.
To fully exploit the complementary information embedded in different views, we leverage the tensor Schatten p-norm regularization.
arXiv Detail & Related papers (2023-05-12T03:01:41Z)
- Robust Detection of Lead-Lag Relationships in Lagged Multi-Factor Models [61.10851158749843]
Key insights can be obtained by discovering lead-lag relationships inherent in the data.
We develop a clustering-driven methodology for robust detection of lead-lag relationships in lagged multi-factor models.
arXiv Detail & Related papers (2023-05-11T10:30:35Z)
- Fuzzy clustering of ordinal time series based on two novel distances with economic applications [0.12891210250935145]
Two novel distances between ordinal time series are introduced and used to construct fuzzy clustering procedures.
The resulting clustering algorithms are computationally efficient and able to group series generated from similar processes.
Two specific applications involving economic time series illustrate the usefulness of the proposed approaches.
arXiv Detail & Related papers (2023-04-24T16:39:22Z)
- Towards Similarity-Aware Time-Series Classification [51.2400839966489]
We study time-series classification (TSC), a fundamental task of time-series data mining.
We propose Similarity-Aware Time-Series Classification (SimTSC), a framework that models similarity information with graph neural networks (GNNs).
arXiv Detail & Related papers (2022-01-05T02:14:57Z)
- Cluster-and-Conquer: A Framework For Time-Series Forecasting [94.63501563413725]
We propose a three-stage framework for forecasting high-dimensional time-series data.
Our framework is highly general, allowing for any time-series forecasting and clustering method to be used in each step.
When instantiated with simple linear autoregressive models, we are able to achieve state-of-the-art results on several benchmark datasets.
arXiv Detail & Related papers (2021-10-26T20:41:19Z)
- Novel Features for Time Series Analysis: A Complex Networks Approach [62.997667081978825]
Time series data are ubiquitous in several domains such as climate, economics, and health care.
A recent conceptual approach relies on mapping time series to complex networks.
Network analysis can be used to characterize different types of time series.
arXiv Detail & Related papers (2021-10-11T13:46:28Z)
- SummerTime: Variable-length Time Series Summarization with Applications to Physical Activity Analysis [6.027126804548653]
SummerTime seeks to summarize time series signals globally.
It provides a fixed-length, robust summarization of the variable-length time series.
arXiv Detail & Related papers (2020-02-20T20:20:06Z)