Topology-based Clusterwise Regression for User Segmentation and Demand
Forecasting
- URL: http://arxiv.org/abs/2009.03661v1
- Date: Tue, 8 Sep 2020 12:10:10 GMT
- Title: Topology-based Clusterwise Regression for User Segmentation and Demand
Forecasting
- Authors: Rodrigo Rivera-Castro, Aleksandr Pletnev, Polina Pilyugina, Grecia
Diaz, Ivan Nazarov, Wanyi Zhu and Evgeny Burnaev
- Abstract summary: Using a public and a novel proprietary data set of commercial data, this research shows that the proposed system enables analysts to both cluster their user base and plan demand at a granular level.
This work seeks to introduce TDA-based clustering of time series and clusterwise regression with matrix factorization methods as viable tools for the practitioner.
- Score: 63.78344280962136
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Topological Data Analysis (TDA) is a recent approach to analyzing data sets
from the perspective of their topological structure. Its use for time series
data has been limited. In this work, a system developed for a leading provider
of cloud computing combining both user segmentation and demand forecasting is
presented. It consists of a TDA-based clustering method for time series
inspired by a popular managerial framework for customer segmentation and
extended to the case of clusterwise regression using matrix factorization
methods to forecast demand. Increasing customer loyalty and producing accurate
forecasts remain active topics of discussion both for researchers and managers.
Using a public and a novel proprietary data set of commercial data, this
research shows that the proposed system enables analysts to both cluster their
user base and plan demand at a granular level with significantly higher
accuracy than a state-of-the-art baseline. This work thus seeks to introduce
TDA-based clustering of time series and clusterwise regression with matrix
factorization methods as viable tools for the practitioner.
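The clusterwise regression idea mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' system: it uses neither TDA nor matrix factorization, and instead fits one plain linear trend per cluster and alternates between refitting and reassignment. All function names, the round-robin initialization, and the toy data are assumptions made for illustration only.

```python
# Minimal clusterwise regression sketch (illustrative assumption, not the paper's method).
# Each cluster has its own linear trend y = a + b*t; series are reassigned to the
# cluster whose trend fits them best, and trends are refit until labels stop changing.

def fit_line(series_list):
    """Least-squares fit of y = a + b*t, pooled over every series in one cluster."""
    pts = [(t, y) for s in series_list for t, y in enumerate(s)]
    n = len(pts)
    mean_t = sum(t for t, _ in pts) / n
    mean_y = sum(y for _, y in pts) / n
    var_t = sum((t - mean_t) ** 2 for t, _ in pts)
    cov = sum((t - mean_t) * (y - mean_y) for t, y in pts)
    b = cov / var_t if var_t else 0.0
    return mean_y - b * mean_t, b


def sse(series, model):
    """Sum of squared errors of one series against a cluster's line."""
    a, b = model
    return sum((y - (a + b * t)) ** 2 for t, y in enumerate(series))


def clusterwise_regression(series, k, n_iter=20):
    """Alternate between per-cluster fitting and best-fit reassignment."""
    labels = [i % k for i in range(len(series))]  # deterministic round-robin start
    models = [(0.0, 0.0)] * k
    for _ in range(n_iter):
        for c in range(k):
            members = [s for s, lab in zip(series, labels) if lab == c]
            if members:  # an empty cluster keeps its previous model
                models[c] = fit_line(members)
        new_labels = [min(range(k), key=lambda c: sse(s, models[c])) for s in series]
        if new_labels == labels:
            break
        labels = new_labels
    return labels, models


def forecast(model, t):
    """Extrapolate a cluster's trend to time step t."""
    a, b = model
    return a + b * t


# Toy demand histories: three growing users followed by two flat users.
demand = [
    [10, 13, 16, 19],
    [11, 14, 17, 20],
    [9, 12, 15, 18],
    [2, 2, 2, 2],
    [3, 3, 3, 3],
]
labels, models = clusterwise_regression(demand, k=2)
print(labels)
print([forecast(models[lab], 4) for lab in labels])
```

Each user is forecast with the trend of the segment it belongs to, which is the sense in which segmentation and demand planning are coupled here. Like k-means, this kind of alternating scheme can stall in a local optimum, so in practice several initializations would usually be tried.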
Related papers
- Agentic Retrieval-Augmented Generation for Time Series Analysis [0.0]
We propose a novel agentic Retrieval-Augmented Generation framework for time series analysis.
Our proposed modular multi-agent RAG approach offers flexibility and achieves state-of-the-art performance across major time series tasks.
arXiv Detail & Related papers (2024-08-18T11:47:55Z)
- F-FOMAML: GNN-Enhanced Meta-Learning for Peak Period Demand Forecasting with Proxy Data [65.6499834212641]
We formulate the demand prediction as a meta-learning problem and develop the Feature-based First-Order Model-Agnostic Meta-Learning (F-FOMAML) algorithm.
By considering domain similarities through task-specific metadata, our model improved generalization, where the excess risk decreases as the number of training tasks increases.
Compared to existing state-of-the-art models, our method demonstrates a notable improvement in demand prediction accuracy, reducing the Mean Absolute Error by 26.24% on an internal vending machine dataset and by 1.04% on the publicly accessible JD.com dataset.
arXiv Detail & Related papers (2024-06-23T21:28:50Z)
- Minimally Supervised Learning using Topological Projections in Self-Organizing Maps [55.31182147885694]
We introduce a semi-supervised learning approach based on topological projections in self-organizing maps (SOMs).
Our proposed method first trains SOMs on unlabeled data, and then a minimal number of available labeled data points are assigned to key best matching units (BMUs).
Our results indicate that the proposed minimally supervised model significantly outperforms traditional regression techniques.
arXiv Detail & Related papers (2024-01-12T22:51:48Z)
- A Machine Learning-Based Framework for Clustering Residential Electricity Load Profiles to Enhance Demand Response Programs [0.0]
We present a novel machine learning based framework in order to achieve optimal load profiling through a real case study.
arXiv Detail & Related papers (2023-10-31T11:23:26Z)
- Tackling Computational Heterogeneity in FL: A Few Theoretical Insights [68.8204255655161]
We introduce and analyse a novel aggregation framework that allows for formalizing and tackling computationally heterogeneous data.
The proposed aggregation algorithms are extensively analyzed from both a theoretical and an experimental perspective.
arXiv Detail & Related papers (2023-07-12T16:28:21Z)
- Federated Learning Aggregation: New Robust Algorithms with Guarantees [63.96013144017572]
Federated learning has been recently proposed for distributed model training at the edge.
This paper presents a complete general mathematical convergence analysis to evaluate aggregation strategies in a federated learning framework.
We derive novel aggregation algorithms which are able to modify their model architecture by differentiating client contributions according to the value of their losses.
arXiv Detail & Related papers (2022-05-22T16:37:53Z)
- Cluster-and-Conquer: A Framework For Time-Series Forecasting [94.63501563413725]
We propose a three-stage framework for forecasting high-dimensional time-series data.
Our framework is highly general, allowing for any time-series forecasting and clustering method to be used in each step.
When instantiated with simple linear autoregressive models, we are able to achieve state-of-the-art results on several benchmark datasets.
arXiv Detail & Related papers (2021-10-26T20:41:19Z)
- Deep Goal-Oriented Clustering [25.383738675621505]
Clustering and prediction are two primary tasks in the fields of unsupervised and supervised learning.
We introduce Deep Goal-Oriented Clustering (DGC), a probabilistic framework that clusters the data by jointly using supervision via side-information.
We show the effectiveness of our model on a range of datasets by achieving prediction accuracies comparable to the state-of-the-art.
arXiv Detail & Related papers (2020-06-07T20:41:08Z)
- Predicting Multidimensional Data via Tensor Learning [0.0]
We develop a model that retains the intrinsic multidimensional structure of the dataset.
To estimate the model parameters, an Alternating Least Squares algorithm is developed.
The proposed model is able to outperform benchmark models present in the forecasting literature.
arXiv Detail & Related papers (2020-02-11T11:57:07Z)
- Adaptive Discrete Smoothing for High-Dimensional and Nonlinear Panel Data [4.550919471480445]
We develop a data-driven smoothing technique for high-dimensional and non-linear panel data models.
The weights are determined in a data-driven way and depend on the similarity between the corresponding functions.
We conduct a simulation study which shows that the prediction can be greatly improved by using our estimator.
arXiv Detail & Related papers (2019-12-30T09:50:58Z)
This list is automatically generated from the titles and abstracts of the papers in this site.