Profiling Television Watching Behaviour Using Bayesian Hierarchical
Joint Models for Time-to-Event and Count Data
- URL: http://arxiv.org/abs/2209.02626v1
- Date: Tue, 6 Sep 2022 16:29:15 GMT
- Title: Profiling Television Watching Behaviour Using Bayesian Hierarchical
Joint Models for Time-to-Event and Count Data
- Authors: Rafael A. Moral, Zhi Chen, Shuai Zhang, Sally McClean, Gabriel R.
Palma, Brahim Allan, Ian Kegel
- Abstract summary: We propose a novel Bayesian hierarchical joint model that characterises customer profiles based on how many events take place within different television watching journeys.
The model drastically reduces the dimensionality of the data from thousands of observations per customer to 11 customer-level parameter estimates and random effects.
Our proposed methodology represents an efficient way of reducing the dimensionality of the data, while at the same time maintaining high descriptive and predictive capabilities.
- Score: 9.33192719245965
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Customer churn prediction is a valuable task in many industries. In
telecommunications it presents great challenges, given the high dimensionality
of the data, and how difficult it is to identify underlying frustration
signatures, which may represent an important driver regarding future churn
behaviour. Here, we propose a novel Bayesian hierarchical joint model that is
able to characterise customer profiles based on how many events take place
within different television watching journeys, and how long it takes between
events. The model drastically reduces the dimensionality of the data from
thousands of observations per customer to 11 customer-level parameter estimates
and random effects. We test our methodology using data from 40 BT customers (20
active and 20 who eventually cancelled their subscription) whose TV watching
behaviours were recorded from October to December 2019, totalling approximately
half a million observations. Employing different machine learning techniques
using the parameter estimates and random effects from the Bayesian hierarchical
model as features yielded up to 92% accuracy predicting churn, associated with
100% true positive rates and false positive rates as low as 14% on a
validation set. Our proposed methodology represents an efficient way of
reducing the dimensionality of the data, while at the same time maintaining
high descriptive and predictive capabilities. We provide code to implement the
Bayesian model at https://github.com/rafamoral/profiling_tv_watching_behaviour.
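The repository above contains the authors' implementation of the Bayesian model. As a rough illustration of the downstream classification step described in the abstract, the sketch below trains a classifier on a stand-in for the 11 customer-level parameter estimates and random effects and reports validation accuracy and true/false positive rates; the feature values and classifier choice are assumptions, not the authors' pipeline.

```python
# Minimal sketch (not the authors' code): classify churn from per-customer
# summaries produced by a hierarchical model. The feature values below are
# random placeholders for the 11 customer-level estimates described above.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score, confusion_matrix

rng = np.random.default_rng(0)

# Stand-in for 40 customers (20 churned, 20 active), 11 features each.
X = rng.normal(size=(40, 11))
y = np.array([1] * 20 + [0] * 20)  # 1 = churned, 0 = active

X_tr, X_va, y_tr, y_va = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=1
)

clf = RandomForestClassifier(n_estimators=500, random_state=1).fit(X_tr, y_tr)
pred = clf.predict(X_va)

tn, fp, fn, tp = confusion_matrix(y_va, pred).ravel()
print("accuracy:", accuracy_score(y_va, pred))
print("true positive rate:", tp / (tp + fn))
print("false positive rate:", fp / (fp + tn))
```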
Related papers
- KODA: A Data-Driven Recursive Model for Time Series Forecasting and Data Assimilation using Koopman Operators [14.429071321401953]
We propose a Koopman operator-based approach that integrates forecasting and data assimilation in nonlinear dynamical systems.
In particular, we use a Fourier domain filter to disentangle the data into a physical component whose dynamics can be accurately represented by a Koopman operator.
We show that KODA outperforms existing state-of-the-art methods on multiple time series benchmarks.
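A minimal sketch of the core Koopman idea referenced above (not KODA itself, which adds the Fourier-domain filter and data assimilation): fit a linear operator to snapshot pairs of a trajectory by least squares; the toy dynamical system is an assumption.

```python
# Minimal sketch of the Koopman idea: fit a linear operator K so that
# x_{t+1} ~ K x_t, via least squares on snapshot pairs (DMD-style).
import numpy as np

rng = np.random.default_rng(0)

# Toy trajectory of a 3-dimensional linear system with noise.
A_true = np.array([[0.9, 0.1, 0.0],
                   [0.0, 0.8, 0.2],
                   [0.0, 0.0, 0.95]])
x = rng.normal(size=3)
traj = [x]
for _ in range(200):
    x = A_true @ x + 0.01 * rng.normal(size=3)
    traj.append(x)
traj = np.array(traj)           # shape (201, 3)

X, Y = traj[:-1].T, traj[1:].T  # snapshot matrices, shape (3, 200)
K = Y @ np.linalg.pinv(X)       # least-squares estimate of the operator

x_pred = K @ traj[-1]           # one-step-ahead forecast
print("estimated operator:\n", np.round(K, 3))
print("one-step forecast:", np.round(x_pred, 3))
```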
arXiv Detail & Related papers (2024-09-29T02:25:48Z) - PeFAD: A Parameter-Efficient Federated Framework for Time Series Anomaly Detection [51.20479454379662]
We propose a parameter-efficient federated anomaly detection framework named PeFAD to address increasing privacy concerns.
We conduct extensive evaluations on four real datasets, where PeFAD outperforms existing state-of-the-art baselines by up to 28.74%.
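A minimal sketch of the federated averaging step that frameworks of this kind build on (not PeFAD's parameter-efficient variant): clients update a model locally and a server averages the resulting parameters; the synthetic client data and learning rate are assumptions.

```python
# Minimal sketch of federated averaging (FedAvg), not PeFAD itself:
# each client takes a local step, the server averages the parameters.
import numpy as np

rng = np.random.default_rng(0)
n_clients, dim = 4, 8
global_w = np.zeros(dim)

for _ in range(3):                                   # communication rounds
    client_ws = []
    for c in range(n_clients):
        # Local "training" stand-in: one gradient step on synthetic client data.
        X = rng.normal(size=(50, dim))
        y = X @ np.ones(dim) + 0.1 * rng.normal(size=50)
        grad = 2 * X.T @ (X @ global_w - y) / len(y)
        client_ws.append(global_w - 0.05 * grad)
    global_w = np.mean(client_ws, axis=0)            # server-side aggregation

print("global weights after 3 rounds:", np.round(global_w, 2))
```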
arXiv Detail & Related papers (2024-06-04T13:51:08Z) - Distance Matters For Improving Performance Estimation Under Covariate
Shift [18.68533487971233]
Under dataset shifts, confidence scores may become ill-calibrated if samples are too far from the training distribution.
We show that taking into account the distances of test samples to their expected training distribution can significantly improve performance estimation.
We demonstrate the effectiveness of this method on 13 image classification tasks, across a wide-range of natural and synthetic distribution shifts.
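A minimal sketch of the general idea (not the paper's exact estimator): measure how far each test sample lies from the training features and flag the far-away ones, whose confidence scores may be ill-calibrated; the feature dimensions and threshold rule are assumptions.

```python
# Minimal sketch: flag test samples that are far from the training
# distribution, since their confidence scores may be unreliable under shift.
import numpy as np
from sklearn.neighbors import NearestNeighbors

rng = np.random.default_rng(0)
train_feats = rng.normal(size=(500, 16))             # assumed model features
test_feats = rng.normal(loc=1.5, size=(100, 16))     # shifted test set

nn = NearestNeighbors(n_neighbors=5).fit(train_feats)
test_d, _ = nn.kneighbors(test_feats)
mean_test_d = test_d.mean(axis=1)

# Threshold from train-to-train distances (first neighbour is the point itself).
train_d, _ = nn.kneighbors(train_feats, n_neighbors=6)
threshold = np.quantile(train_d[:, 1:].mean(axis=1), 0.95)

unreliable = mean_test_d > threshold
print(f"{unreliable.mean():.0%} of test samples flagged as far from training data")
```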
arXiv Detail & Related papers (2023-08-14T15:49:19Z) - Data-OOB: Out-of-bag Estimate as a Simple and Efficient Data Value [17.340091573913316]
We propose Data-OOB, a new data valuation method for a bagging model that utilizes the out-of-bag estimate.
Data-OOB takes less than 2.25 hours on a single CPU processor when there are $10^6$ samples to evaluate and the input dimension is 100.
We demonstrate that the proposed method significantly outperforms existing state-of-the-art data valuation methods in identifying mislabeled data and finding a set of helpful (or harmful) data points.
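A minimal sketch of the out-of-bag idea behind Data-OOB (not the paper's exact estimator): value each training point by how often base learners that never saw it in their bootstrap sample predict its label correctly, which tends to rank mislabelled points lowest; the dataset and noise level are assumptions.

```python
# Minimal sketch of an out-of-bag data value: score each point with the
# base learners that did not include it in their bootstrap sample.
import numpy as np
from sklearn.ensemble import BaggingClassifier
from sklearn.datasets import make_classification

X, y = make_classification(n_samples=1000, n_features=20, random_state=0)
y_noisy = y.copy()
flip = np.random.default_rng(0).choice(len(y), size=100, replace=False)
y_noisy[flip] = 1 - y_noisy[flip]                    # inject label noise

bag = BaggingClassifier(n_estimators=200, random_state=0).fit(X, y_noisy)

values = np.zeros(len(y))
counts = np.zeros(len(y))
for est, used in zip(bag.estimators_, bag.estimators_samples_):
    oob = np.setdiff1d(np.arange(len(y)), used)      # points this learner never saw
    values[oob] += (est.predict(X[oob]) == y_noisy[oob])
    counts[oob] += 1
values = values / np.maximum(counts, 1)

# Low values should concentrate on the mislabelled points.
worst = np.argsort(values)[:100]
print("fraction of flipped labels among 100 lowest-valued points:",
      np.isin(worst, flip).mean())
```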
arXiv Detail & Related papers (2023-04-16T08:03:58Z) - A Meta-Learning Approach to Predicting Performance and Data Requirements [163.4412093478316]
We propose an approach to estimate the number of samples required for a model to reach a target performance.
We find that the power law, the de facto principle to estimate model performance, leads to large errors when extrapolating from a small dataset.
We introduce a novel piecewise power law (PPL) that handles the two data regimes differently.
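A minimal sketch of the baseline power-law extrapolation the paper critiques (not the proposed PPL): fit error = a * n^(-b) + c to a few learning-curve points and solve for the sample size that reaches a target error; the learning-curve measurements are assumptions.

```python
# Minimal sketch of power-law extrapolation of a learning curve
# (the baseline discussed above, not the paper's piecewise PPL).
import numpy as np
from scipy.optimize import curve_fit

def power_law(n, a, b, c):
    return a * n ** (-b) + c

# Assumed learning-curve measurements: (training set size, validation error).
n_obs = np.array([1e3, 2e3, 4e3, 8e3, 1.6e4])
err_obs = np.array([0.30, 0.24, 0.20, 0.17, 0.15])

(a, b, c), _ = curve_fit(power_law, n_obs, err_obs, p0=(1.0, 0.5, 0.1), maxfev=10000)

target = 0.12
if target > c:
    n_needed = (a / (target - c)) ** (1 / b)
    print(f"estimated samples for {target:.2f} error: {n_needed:,.0f}")
else:
    print("target is below the fitted asymptote; the power law says it is unreachable")
```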
arXiv Detail & Related papers (2023-03-02T21:48:22Z) - The future is different: Large pre-trained language models fail in
prediction tasks [2.9005223064604078]
We introduce four new REDDIT datasets, namely the WALLSTREETBETS, ASKSCIENCE, THE DONALD, and POLITICS sub-reddits.
First, we empirically demonstrate that large pre-trained language models (LPLMs) can display average performance drops of about 88% when predicting the popularity of future posts from sub-reddits whose topic distribution changes over time.
We then introduce a simple methodology that leverages neural variational dynamic topic models and attention mechanisms to infer temporal language model representations for regression tasks.
arXiv Detail & Related papers (2022-11-01T11:01:36Z) - FitVid: Overfitting in Pixel-Level Video Prediction [117.59339756506142]
We introduce a new architecture, named FitVid, which is capable of severe overfitting on the common benchmarks.
FitVid outperforms the current state-of-the-art models across four different video prediction benchmarks on four different metrics.
arXiv Detail & Related papers (2021-06-24T17:20:21Z) - Back2Future: Leveraging Backfill Dynamics for Improving Real-time
Predictions in Future [73.03458424369657]
In real-time forecasting in public health, data collection is a non-trivial and demanding task.
The 'backfill' phenomenon and its effect on model performance have barely been studied in the prior literature.
We formulate a novel problem and neural framework Back2Future that aims to refine a given model's predictions in real-time.
arXiv Detail & Related papers (2021-06-08T14:48:20Z) - Variational Bayes survival analysis for unemployment modelling [0.0]
The model is evaluated on a time-to-employment data set spanning from 2011 to 2020 provided by the Slovenian public employment service.
Similar models could be applied to other questions with multi-dimensional, high-cardinality categorical data including censored records.
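A minimal sketch of the core ingredient of survival modelling with censored records (not the paper's variational Bayes model): a right-censored exponential likelihood maximised on simulated time-to-event data; the simulation settings are assumptions.

```python
# Minimal sketch of a right-censored exponential survival likelihood,
# maximised on simulated time-to-event data.
import numpy as np
from scipy.optimize import minimize_scalar

rng = np.random.default_rng(0)
true_rate = 0.2
t_event = rng.exponential(1 / true_rate, size=500)   # true time to event
t_censor = rng.uniform(0, 12, size=500)              # end of observation window
t_obs = np.minimum(t_event, t_censor)
observed = (t_event <= t_censor).astype(float)       # 1 = event seen, 0 = censored

def neg_log_lik(rate):
    # log-density for observed events, log-survival for censored records
    return -(observed * (np.log(rate) - rate * t_obs)
             + (1 - observed) * (-rate * t_obs)).sum()

res = minimize_scalar(neg_log_lik, bounds=(1e-6, 10), method="bounded")
print("true rate:", true_rate, "estimated rate:", round(res.x, 3))
```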
arXiv Detail & Related papers (2021-02-03T21:06:54Z) - Meta-Learned Confidence for Few-shot Learning [60.6086305523402]
A popular transductive inference technique for few-shot metric-based approaches is to update the prototype of each class with the mean of the most confident query examples.
We propose to meta-learn the confidence for each query sample, to assign optimal weights to unlabeled queries.
We validate our few-shot learning model with meta-learned confidence on four benchmark datasets.
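A minimal sketch of transductive prototype refinement as summarised above, with the meta-learned confidence replaced by a plain softmax over negative distances; the embedding dimensions and mixing weight are assumptions.

```python
# Minimal sketch of confidence-weighted prototype refinement
# (a plain softmax stands in for the paper's meta-learned confidence).
import numpy as np

rng = np.random.default_rng(0)
n_way, n_shot, n_query, dim = 5, 5, 15, 64

support = rng.normal(size=(n_way, n_shot, dim))      # labelled support embeddings
queries = rng.normal(size=(n_way * n_query, dim))    # unlabelled query embeddings

prototypes = support.mean(axis=1)                    # (n_way, dim) class prototypes

# Confidence of each query for each class: softmax over negative distances.
d = np.linalg.norm(queries[:, None, :] - prototypes[None, :, :], axis=-1)
conf = np.exp(-d) / np.exp(-d).sum(axis=1, keepdims=True)

# Refine each prototype with the confidence-weighted mean of the queries.
weighted_queries = conf.T @ queries / conf.sum(axis=0)[:, None]
prototypes_refined = 0.5 * prototypes + 0.5 * weighted_queries
print(prototypes_refined.shape)   # (5, 64)
```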
arXiv Detail & Related papers (2020-02-27T10:22:17Z) - A Multi-Channel Neural Graphical Event Model with Negative Evidence [76.51278722190607]
Event datasets are sequences of events of various types occurring irregularly over the time-line.
We propose a non-parametric deep neural network approach in order to estimate the underlying intensity functions.
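A minimal sketch of the quantity being modelled (not the neural approach): a Gaussian-kernel estimate of the intensity function of an irregularly spaced event sequence, simulated here by thinning; the true intensity and bandwidth are assumptions.

```python
# Minimal sketch: kernel estimate of the intensity of an irregular event
# sequence, simulated from an inhomogeneous Poisson process by thinning.
import numpy as np

rng = np.random.default_rng(0)

def lam(t):                                          # assumed true intensity
    return 2.0 + 1.5 * np.sin(t)

lam_max = 3.5
candidates = rng.uniform(0, 10, size=rng.poisson(lam_max * 10))
keep = rng.uniform(0, lam_max, size=len(candidates)) < lam(candidates)
events = np.sort(candidates[keep])

# Kernel intensity estimate: sum of Gaussian bumps centred at the events.
def intensity_hat(t, bandwidth=0.5):
    t = np.atleast_1d(t)[:, None]
    k = np.exp(-0.5 * ((t - events[None, :]) / bandwidth) ** 2)
    return (k / (bandwidth * np.sqrt(2 * np.pi))).sum(axis=1)

grid = np.linspace(0, 10, 5)
print("estimated:", np.round(intensity_hat(grid), 2), "true:", np.round(lam(grid), 2))
```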
arXiv Detail & Related papers (2020-02-21T23:10:50Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of this content (including all information) and is not responsible for any consequences.