Debiased Off-Policy Evaluation for Recommendation Systems
- URL: http://arxiv.org/abs/2002.08536v3
- Date: Mon, 2 Aug 2021 21:03:26 GMT
- Title: Debiased Off-Policy Evaluation for Recommendation Systems
- Authors: Yusuke Narita, Shota Yasui, Kohei Yata
- Abstract summary: A/B tests are reliable, but are time- and money-consuming, and entail a risk of failure.
We develop an alternative method, which predicts the performance of algorithms given historical data.
Our method produces smaller mean squared errors than state-of-the-art methods.
- Score: 8.63711086812655
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Efficient methods to evaluate new algorithms are critical for improving
interactive bandit and reinforcement learning systems such as recommendation
systems. A/B tests are reliable, but are time- and money-consuming, and entail
a risk of failure. In this paper, we develop an alternative method, which
predicts the performance of algorithms given historical data that may have been
generated by a different algorithm. Our estimator has the property that its
prediction converges in probability to the true performance of a counterfactual
algorithm at a rate of $\sqrt{N}$, as the sample size $N$ increases. We also
show a correct way to estimate the variance of our prediction, thus allowing
the analyst to quantify the uncertainty in the prediction. These properties
hold even when the analyst does not know which among a large number of
potentially important state variables are actually important. We validate our
method by a simulation experiment about reinforcement learning. We finally
apply it to improve advertisement design by a major advertisement company. We
find that our method produces smaller mean squared errors than state-of-the-art
methods.
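The abstract states the estimator's guarantees rather than its form. As background, the sketch below shows a standard inverse-propensity-score (IPS) off-policy value estimate with a plug-in standard error, assuming the logging policy's action probabilities are known; it is a minimal illustration of the general off-policy evaluation approach, not the paper's debiased estimator, and all names and numbers are hypothetical.
```python
import numpy as np

def ips_value(rewards, logging_probs, target_probs):
    """Inverse-propensity-score (IPS) estimate of a counterfactual policy's value.

    rewards       : observed rewards r_i under the logging policy, shape (N,)
    logging_probs : pi_log(a_i | x_i), probability the logger chose the logged action
    target_probs  : pi_new(a_i | x_i), probability the new policy would choose it
    """
    contributions = (target_probs / logging_probs) * rewards  # importance-weighted rewards
    value = contributions.mean()                               # estimated policy value
    # Plug-in standard error: by the CLT the IPS estimate is sqrt(N)-consistent
    # when the propensities are correct and the weights have finite variance.
    stderr = contributions.std(ddof=1) / np.sqrt(len(rewards))
    return value, stderr

# Toy logged data from a two-action logging policy (purely illustrative).
rng = np.random.default_rng(0)
N = 10_000
actions = rng.integers(0, 2, size=N)                 # logger picks each action with prob 0.5
rewards = rng.binomial(1, np.where(actions == 1, 0.6, 0.4))
logging_probs = np.full(N, 0.5)
target_probs = np.where(actions == 1, 0.8, 0.2)      # counterfactual policy favors action 1
value, stderr = ips_value(rewards, logging_probs, target_probs)
print(f"estimated value: {value:.3f} (95% CI half-width: {1.96 * stderr:.3f})")
```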
Related papers
- Online non-parametric likelihood-ratio estimation by Pearson-divergence
functional minimization [55.98760097296213]
We introduce a new framework for online non-parametric LRE (OLRE) for the setting where pairs of iid observations $(x_t \sim p, x'_t \sim q)$ are observed over time.
We provide theoretical guarantees for the performance of the OLRE method along with empirical validation in synthetic experiments.
arXiv Detail & Related papers (2023-11-03T13:20:11Z) - Distribution-free risk assessment of regression-based machine learning
algorithms [6.507711025292814]
We focus on regression algorithms and the risk-assessment task of computing the probability of the true label lying inside an interval defined around the model's prediction.
We solve the risk-assessment problem using the conformal prediction approach, which provides prediction intervals that are guaranteed to contain the true label with a given probability.
arXiv Detail & Related papers (2023-10-05T13:57:24Z) - Uncertainty Estimation based on Geometric Separation [13.588210692213568]
In machine learning, accurately estimating the probability that a model's prediction for a specific input is correct is crucial for risk management.
We put forward a novel geometric-based approach for improving uncertainty estimations in machine learning models.
arXiv Detail & Related papers (2023-01-11T13:19:24Z) - A Geometric Method for Improved Uncertainty Estimation in Real-time [13.588210692213568]
Post-hoc model calibrations can improve models' uncertainty estimations without the need for retraining.
Our work puts forward a geometric-based approach for uncertainty estimation.
We show that our method yields better uncertainty estimations than recently proposed approaches.
arXiv Detail & Related papers (2022-06-23T09:18:05Z) - Risk Preferences of Learning Algorithms [0.0]
We show that a widely used learning algorithm, $\varepsilon$-Greedy, exhibits emergent risk aversion.
We discuss two methods to correct this bias.
arXiv Detail & Related papers (2022-05-10T01:30:24Z) - Leveraging Unlabeled Data to Predict Out-of-Distribution Performance [63.740181251997306]
Real-world machine learning deployments are characterized by mismatches between the source (training) and target (test) distributions.
In this work, we investigate methods for predicting the target domain accuracy using only labeled source data and unlabeled target data.
We propose Average Thresholded Confidence (ATC), a practical method that learns a threshold on the model's confidence and predicts target accuracy as the fraction of unlabeled examples whose confidence exceeds that threshold (a minimal sketch appears after this list).
arXiv Detail & Related papers (2022-01-11T23:01:12Z) - Robustification of Online Graph Exploration Methods [59.50307752165016]
We study a learning-augmented variant of the classical, notoriously hard online graph exploration problem.
We propose an algorithm that naturally integrates predictions into the well-known Nearest Neighbor (NN) algorithm.
arXiv Detail & Related papers (2021-12-10T10:02:31Z) - SLOE: A Faster Method for Statistical Inference in High-Dimensional
Logistic Regression [68.66245730450915]
We develop an improved method for debiasing predictions and estimating frequentist uncertainty for practical datasets.
Our main contribution is SLOE, an estimator of the signal strength with convergence guarantees that reduces the computation time of estimation and inference by orders of magnitude.
arXiv Detail & Related papers (2021-03-23T17:48:56Z) - Double Coverage with Machine-Learned Advice [100.23487145400833]
We study the fundamental online $k$-server problem in a learning-augmented setting.
We show that our algorithm achieves, for any $k$, an almost optimal consistency-robustness tradeoff.
arXiv Detail & Related papers (2021-03-02T11:04:33Z) - Large-Scale Methods for Distributionally Robust Optimization [53.98643772533416]
We prove that our algorithms require a number of gradient evaluations independent of training set size and number of parameters.
Experiments on MNIST and ImageNet confirm the theoretical scaling of our algorithms, which are 9--36 times more efficient than full-batch methods.
arXiv Detail & Related papers (2020-10-12T17:41:44Z) - Scalable Approximate Inference and Some Applications [2.6541211006790983]
In this thesis, we propose a new framework for approximate inference.
Our proposed four algorithms are motivated by the recent computational progress of Stein's method.
Results on simulated and real datasets indicate the statistical efficiency and wide applicability of our algorithm.
arXiv Detail & Related papers (2020-03-07T04:33:27Z)
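For concreteness, the sketch below illustrates the Average Thresholded Confidence idea referenced above: choose a confidence threshold on labeled source data so that the fraction of source examples above it matches the source accuracy, then predict target accuracy as the fraction of unlabeled target examples above that threshold. The function and variable names are illustrative assumptions, not the authors' reference implementation.
```python
import numpy as np

def atc_predict_accuracy(source_conf, source_correct, target_conf):
    """Average Thresholded Confidence (ATC), sketched.

    source_conf    : model confidence (e.g., max softmax) on labeled source data
    source_correct : boolean array, whether the model was correct on each source example
    target_conf    : model confidence on unlabeled target data
    """
    source_acc = source_correct.mean()
    # Pick the threshold so that the fraction of source examples with
    # confidence above it equals the source accuracy.
    threshold = np.quantile(source_conf, 1.0 - source_acc)
    # Predicted target accuracy: fraction of target examples above the threshold.
    return (target_conf >= threshold).mean()

# Toy usage with synthetic confidences (illustrative only).
rng = np.random.default_rng(1)
source_conf = rng.uniform(0.5, 1.0, size=5000)
source_correct = rng.random(5000) < source_conf       # roughly calibrated source model
target_conf = rng.uniform(0.4, 1.0, size=5000)        # shifted target distribution
print(f"predicted target accuracy: {atc_predict_accuracy(source_conf, source_correct, target_conf):.3f}")
```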
This list is automatically generated from the titles and abstracts of the papers on this site.