Efficient Forecasting of Large Scale Hierarchical Time Series via
Multilevel Clustering
- URL: http://arxiv.org/abs/2205.14104v1
- Date: Fri, 27 May 2022 17:13:05 GMT
- Title: Efficient Forecasting of Large Scale Hierarchical Time Series via
Multilevel Clustering
- Authors: Xing Han, Tongzheng Ren, Jing Hu, Joydeep Ghosh, Nhat Ho
- Abstract summary: We propose a novel approach to the problem of clustering hierarchically aggregated time-series data.
We first group time series at each aggregated level, while simultaneously leveraging local and global information.
- Score: 26.236569277576425
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We propose a novel approach to the problem of clustering hierarchically
aggregated time-series data, which has remained an understudied problem though
it has several commercial applications. We first group time series at each
aggregated level, while simultaneously leveraging local and global information.
The proposed method can cluster hierarchical time series (HTS) with different
lengths and structures. For common two-level hierarchies, we employ a combined
objective for local and global clustering over spaces of discrete probability
measures, using Wasserstein distance coupled with Soft-DTW divergence. For
multi-level hierarchies, we present a bottom-up procedure that progressively
leverages lower-level information for higher-level clustering. Our final goal
is to improve both the accuracy and speed of forecasts for the large number of
HTS needed in real-world applications. To attain this goal, each time series
is first assigned the forecast for its cluster representative, which can be
considered as a "shrinkage prior" for the set of time series it represents.
Then this base forecast can be quickly fine-tuned to adjust to the specifics of
that time series. We empirically show that our method substantially improves
performance in terms of both speed and accuracy for large-scale forecasting
tasks involving many HTS.
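The Soft-DTW divergence named in the abstract can be illustrated with a short sketch. The code below is not the authors' implementation. It assumes univariate series, a squared Euclidean pointwise cost, and a smoothing parameter called `gamma` (all chosen here for illustration), and it uses a commonly used debiased form, D(x, y) = sdtw(x, y) - (sdtw(x, x) + sdtw(y, y)) / 2. The Wasserstein part of the clustering objective is not covered.

```python
import numpy as np


def soft_min(values, gamma):
    """Smoothed minimum: -gamma * log(sum(exp(-v / gamma))), computed stably."""
    v = np.asarray(values, dtype=float) / -gamma
    m = np.max(v)  # finite whenever at least one input value is finite
    return -gamma * (m + np.log(np.sum(np.exp(v - m))))


def soft_dtw(x, y, gamma=1.0):
    """Soft-DTW discrepancy between two 1-D series of possibly different length."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n, m = len(x), len(y)
    # Assumption: squared Euclidean cost between individual time points.
    cost = (x[:, None] - y[None, :]) ** 2
    # R[i, j] holds the soft alignment cost of x[:i] against y[:j].
    R = np.full((n + 1, m + 1), np.inf)
    R[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            R[i, j] = cost[i - 1, j - 1] + soft_min(
                [R[i - 1, j], R[i, j - 1], R[i - 1, j - 1]], gamma
            )
    return R[n, m]


def soft_dtw_divergence(x, y, gamma=1.0):
    """Debiased Soft-DTW: subtract the mean of the two self-comparison terms."""
    return soft_dtw(x, y, gamma) - 0.5 * (
        soft_dtw(x, x, gamma) + soft_dtw(y, y, gamma)
    )
```

Because the recursion aligns prefixes of arbitrary length, the two inputs need not share a length, which is consistent with the abstract's remark that the method handles HTS of different lengths.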
Related papers
- PeFAD: A Parameter-Efficient Federated Framework for Time Series Anomaly Detection [51.20479454379662]
We propose a parameter-efficient federated anomaly detection framework named PeFAD, motivated by increasing privacy concerns.
We conduct extensive evaluations on four real datasets, where PeFAD outperforms existing state-of-the-art baselines by up to 28.74%.
arXiv Detail & Related papers (2024-06-04T13:51:08Z)
- Large-scale Fully-Unsupervised Re-Identification [78.47108158030213]
We propose two strategies to learn from large-scale unlabeled data.
The first strategy performs local neighborhood sampling to reduce the dataset size in each iteration without violating neighborhood relationships.
The second strategy leverages a novel re-ranking technique, which has a lower upper bound on time complexity and reduces the memory complexity from O(n^2) to O(kn) with k << n.
arXiv Detail & Related papers (2023-07-26T16:19:19Z)
- Robust Detection of Lead-Lag Relationships in Lagged Multi-Factor Models [61.10851158749843]
Key insights can be obtained by discovering lead-lag relationships inherent in the data.
We develop a clustering-driven methodology for robust detection of lead-lag relationships in lagged multi-factor models.
arXiv Detail & Related papers (2023-05-11T10:30:35Z)
- Cluster-and-Conquer: A Framework For Time-Series Forecasting [94.63501563413725]
We propose a three-stage framework for forecasting high-dimensional time-series data.
Our framework is highly general, allowing for any time-series forecasting and clustering method to be used in each step.
When instantiated with simple linear autoregressive models, we are able to achieve state-of-the-art results on several benchmark datasets.
arXiv Detail & Related papers (2021-10-26T20:41:19Z)
- SOMTimeS: Self Organizing Maps for Time Series Clustering and its
Application to Serious Illness Conversations [3.2689702143620147]
We present a new DTW-based clustering method called SOMTimeS (a Self-Organizing Map for TIME Series).
It scales better and runs faster than other DTW-based clustering algorithms, with similar accuracy.
We applied SOMTimeS to natural language conversation data collected as part of a large healthcare cohort study.
arXiv Detail & Related papers (2021-08-26T00:18:25Z)
- Scalable Community Detection via Parallel Correlation Clustering [1.5644420658691407]
Graph clustering and community detection are central problems in modern data mining.
In this paper, we design scalable algorithms that achieve high quality when evaluated based on ground truth.
arXiv Detail & Related papers (2021-07-27T04:33:37Z)
- Hierarchically Regularized Deep Forecasting [18.539846932184012]
We propose a new approach for hierarchical forecasting based on decomposing the time series along a global set of basis time series.
Unlike past methods, our approach is scalable at inference-time while preserving coherence among the time series forecasts.
arXiv Detail & Related papers (2021-06-14T17:38:16Z)
- You Never Cluster Alone [150.94921340034688]
We extend the mainstream contrastive learning paradigm to a cluster-level scheme, where all the data assigned to the same cluster contribute to a unified representation.
We define a set of categorical variables as clustering assignment confidence, which links the instance-level learning track with the cluster-level one.
By reparametrizing the assignment variables, the proposed model, TCC, is trained end-to-end, requiring no alternating steps.
arXiv Detail & Related papers (2021-06-03T14:59:59Z)
- Hierarchical Clustering using Auto-encoded Compact Representation for
Time-series Analysis [8.660029077292346]
We propose a novel mechanism to identify clusters by combining a learned compact representation of time series, the Auto Encoded Compact Sequence (AECS), with a hierarchical clustering approach.
Our algorithm uses a Recurrent Neural Network (RNN) based undercomplete sequence-to-sequence (seq2seq) autoencoder and agglomerative hierarchical clustering.
arXiv Detail & Related papers (2021-01-11T08:03:57Z)
- Scalable Hierarchical Agglomerative Clustering [65.66407726145619]
Existing scalable hierarchical clustering methods sacrifice quality for speed.
We present a scalable, agglomerative method for hierarchical clustering that does not sacrifice quality and scales to billions of data points.
arXiv Detail & Related papers (2020-10-22T15:58:35Z)