Bayesian Sample Size Prediction for Online Activity
- URL: http://arxiv.org/abs/2111.12157v1
- Date: Tue, 23 Nov 2021 21:16:17 GMT
- Title: Bayesian Sample Size Prediction for Online Activity
- Authors: Thomas Richardson, Yu Liu, James McQueen, Doug Hains
- Abstract summary: It is useful to predict the number of individuals in some population who will initiate a particular activity during a given period.
We present a simple but novel Bayesian method for predicting the number of additional individuals who will participate during a subsequent period.
- Score: 5.685803049176655
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In many contexts it is useful to predict the number of individuals in some
population who will initiate a particular activity during a given period. For
example, the number of users who will install a software update, the number of
customers who will use a new feature on a website, or who will participate in an
A/B test. In practical settings, there is heterogeneity amongst individuals
with regard to the distribution of time until they will initiate. For this
reason it is inappropriate to assume that the number of new individuals
observed on successive days will be identically distributed. Given observations
on the number of unique users participating in an initial period, we present a
simple but novel Bayesian method for predicting the number of additional
individuals who will participate during a subsequent period. We
illustrate the performance of the method in predicting sample size in online
experimentation.
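
The abstract describes the prediction task but not the underlying model, so the sketch below is only a rough illustration of the kind of posterior-predictive calculation involved: a single geometric initiation-delay model with a known eligible population, fit by grid approximation, followed by a Monte Carlo forecast of additional participants over the next period. The population size N, the toy daily counts, and the geometric model are assumptions made for this example, not the method of the paper, which in particular allows heterogeneity in initiation times.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical inputs (toy data, not taken from the paper).
N = 10_000                                   # assumed size of the eligible population
daily_counts = np.array([310, 270, 240, 205, 190, 170, 150])  # unique new users on days 1..T0
T0 = len(daily_counts)                       # observed initial period (days)
T1 = 7                                       # future period to predict (days)
n_obs = daily_counts.sum()                   # unique users seen so far

# Grid approximation to the posterior over the daily initiation probability p,
# assuming each user's delay until initiation is Geometric(p) with a flat prior on p.
p_grid = np.linspace(1e-4, 0.2, 2000)
days = np.arange(1, T0 + 1)
log_lik = (
    daily_counts @ (np.log(p_grid)[None, :] + (days[:, None] - 1) * np.log1p(-p_grid)[None, :])
    + (N - n_obs) * T0 * np.log1p(-p_grid)   # users who have not initiated by day T0
)
post = np.exp(log_lik - log_lik.max())
post /= post.sum()

# Posterior predictive: a user not seen by day T0 initiates within the next T1 days
# with probability 1 - (1 - p)**T1, so the additional count is Binomial(N - n_obs, ...).
p_samples = rng.choice(p_grid, size=5_000, p=post)
extra = rng.binomial(N - n_obs, 1.0 - (1.0 - p_samples) ** T1)

print("expected additional users:", round(extra.mean()))
print("90% predictive interval:", np.percentile(extra, [5, 95]))
```

The same posterior-predictive samples could also be inverted to estimate how many further days of data collection are needed to reach a target sample size in an online experiment, which is the use case highlighted in the abstract.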
Related papers
- DOTA: Distributional Test-Time Adaptation of Vision-Language Models [52.98590762456236]
The training-free test-time dynamic adapter (TDA) is a promising approach to test-time adaptation of vision-language models.
We propose a simple yet effective method for DistributiOnal Test-time Adaptation (Dota)
Dota continually estimates the distributions of test samples, allowing the model to continually adapt to the deployment environment.
arXiv Detail & Related papers (2024-09-28T15:03:28Z)
- New User Event Prediction Through the Lens of Causal Inference [20.676353189313737]
We propose a novel discrete event prediction framework for new users.
Our method offers an unbiased prediction for new users without needing to know their categories.
We demonstrate the superior performance of the proposed framework with a numerical simulation study and two real-world applications.
arXiv Detail & Related papers (2024-07-08T05:35:54Z)
- Probabilistic Contrastive Learning for Long-Tailed Visual Recognition [78.70453964041718]
Long-tailed distributions frequently emerge in real-world data, where a large number of minority categories contain a limited number of samples.
Recent investigations have revealed that supervised contrastive learning exhibits promising potential in alleviating the data imbalance.
We propose a novel probabilistic contrastive (ProCo) learning algorithm that estimates the data distribution of the samples from each class in the feature space.
arXiv Detail & Related papers (2024-03-11T13:44:49Z)
- Tracking Changing Probabilities via Dynamic Learners [0.18648070031379424]
We develop sparse multiclass moving average techniques to respond to non-stationarities in a timely manner.
One technique is based on the exponentiated moving average (EMA) and another is based on queuing a few count snapshots.
arXiv Detail & Related papers (2024-02-15T17:48:58Z)
- Improved prediction of future user activity in online A/B testing [9.824661943331119]
In online randomized experiments or A/B tests, accurate predictions of participant inclusion rates are of paramount importance.
We present a novel, straightforward, and scalable Bayesian nonparametric approach for predicting the rate at which individuals will be exposed to interventions.
arXiv Detail & Related papers (2024-02-05T17:44:21Z)
- A Nonparametric Bayes Approach to Online Activity Prediction [11.934335703226404]
We propose a novel approach to predict the number of users that will be active in a given time period.
We derive closed-form expressions for the number of new users expected in a given period, and a simple Monte Carlo algorithm targeting the posterior distribution of the number of days needed to attain a desired number of users.
arXiv Detail & Related papers (2024-01-26T09:11:42Z)
- ASPEST: Bridging the Gap Between Active Learning and Selective Prediction [56.001808843574395]
Selective prediction aims to learn a reliable model that abstains from making predictions when uncertain.
Active learning aims to lower the overall labeling effort, and hence human dependence, by querying the most informative examples.
In this work, we introduce a new learning paradigm, active selective prediction, which aims to query more informative samples from the shifted target domain.
arXiv Detail & Related papers (2023-04-07T23:51:07Z)
- PinnerFormer: Sequence Modeling for User Representation at Pinterest [60.335384724891746]
We introduce PinnerFormer, a user representation trained to predict a user's future long-term engagement.
Unlike prior approaches, we adapt our modeling to a batch infrastructure via our new dense all-action loss.
We show that by doing so, we significantly close the gap between batch user embeddings that are generated once a day and realtime user embeddings generated whenever a user takes an action.
arXiv Detail & Related papers (2022-05-09T18:26:51Z)
- Plinko: A Theory-Free Behavioral Measure of Priors for Statistical Learning and Mental Model Updating [62.997667081978825]
We present three experiments using "Plinko", a behavioral task in which participants estimate distributions of ball drops over all available outcomes.
We show that participant priors cluster around prototypical probability distributions and that prior cluster membership may indicate learning ability.
We verify that individual participant priors are reliable representations and that learning is not impeded when faced with a physically implausible ball drop distribution.
arXiv Detail & Related papers (2021-07-23T22:27:30Z)
- Tracking disease outbreaks from sparse data with Bayesian inference [55.82986443159948]
The COVID-19 pandemic provides new motivation for estimating the empirical rate of transmission during an outbreak.
Standard methods struggle to accommodate the partial observability and sparse data common at finer scales.
We propose a Bayesian framework which accommodates partial observability in a principled manner.
arXiv Detail & Related papers (2020-09-12T20:37:33Z)
- Characterizing Structural Regularities of Labeled Data in Overparameterized Models [45.956614301397885]
Deep neural networks can generalize across instances that share common patterns or structures.
We analyze how individual instances are treated by a model via a consistency score.
We show examples of potential applications to the analysis of deep-learning systems.
arXiv Detail & Related papers (2020-02-08T17:39:46Z)