Low-variance estimation in the Plackett-Luce model via quasi-Monte Carlo sampling
- URL: http://arxiv.org/abs/2205.06024v1
- Date: Thu, 12 May 2022 11:15:47 GMT
- Title: Low-variance estimation in the Plackett-Luce model via quasi-Monte Carlo sampling
- Authors: Alexander Buchholz, Jan Malte Lichtenberg, Giuseppe Di Benedetto,
Yannik Stein, Vito Bellini, Matteo Ruffini
- Abstract summary: We develop a novel approach to producing more sample-efficient estimators of expectations in the PL model.
We illustrate our findings both theoretically and empirically using real-world recommendation data from Amazon Music and the Yahoo learning-to-rank challenge.
- Score: 58.14878401145309
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The Plackett-Luce (PL) model is ubiquitous in learning-to-rank (LTR) because
it provides a useful and intuitive probabilistic model for sampling ranked
lists. Counterfactual offline evaluation and optimization of ranking metrics
are pivotal for using LTR methods in production. When adopting the PL model as
a ranking policy, both tasks require the computation of expectations with
respect to the model. These are usually approximated via Monte-Carlo (MC)
sampling, since the combinatorial scaling in the number of items to be ranked
makes their analytical computation intractable. Despite recent advances in
improving the computational efficiency of the sampling process via the Gumbel
top-k trick, the MC estimates can suffer from high variance. We develop a novel
approach to producing more sample-efficient estimators of expectations in the
PL model by combining the Gumbel top-k trick with quasi-Monte Carlo (QMC)
sampling, a well-established technique for variance reduction. We illustrate
our findings both theoretically and empirically using real-world recommendation
data from Amazon Music and the Yahoo learning-to-rank challenge.
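The combination the abstract describes can be sketched in a few lines. The snippet below is an illustration rather than the authors' implementation: it uses a Halton sequence as the low-discrepancy generator (the paper may use a different QMC construction, e.g. scrambled Sobol points), maps the quasi-uniform numbers to Gumbel noise via the inverse CDF, and applies the Gumbel top-k trick; all function names are ours.

```python
import numpy as np

def halton(n, dims):
    """First n points of a Halton low-discrepancy sequence in [0,1)^dims."""
    primes = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
    assert dims <= len(primes)
    out = np.empty((n, dims))
    for d in range(dims):
        base = primes[d]
        for i in range(n):
            f, r, idx = 1.0, 0.0, i + 1
            while idx > 0:           # radical-inverse in the given base
                f /= base
                r += f * (idx % base)
                idx //= base
            out[i, d] = r
    return out

def qmc_gumbel_topk(log_scores, k, n_samples):
    """Sample n_samples top-k rankings from a Plackett-Luce model.

    Gumbel top-k trick: perturb each item's log-score with Gumbel noise
    and keep the k largest.  Here the Gumbel noise is driven by
    quasi-Monte Carlo uniforms via the inverse CDF  G = -log(-log(U)).
    """
    log_scores = np.asarray(log_scores, dtype=float)
    u = halton(n_samples, len(log_scores))
    u = np.clip(u, 1e-12, 1 - 1e-12)          # avoid log(0)
    gumbel = -np.log(-np.log(u))
    perturbed = log_scores[None, :] + gumbel
    return np.argsort(-perturbed, axis=1)[:, :k]
```

Expectations of ranking metrics under the PL policy are then estimated by averaging the metric over the sampled rankings; the low-discrepancy inputs are what reduce the variance relative to plain MC.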
Related papers
- Monte Carlo Tree Search Boosts Reasoning via Iterative Preference Learning [55.96599486604344]
We introduce an approach aimed at enhancing the reasoning capabilities of Large Language Models (LLMs) through an iterative preference learning process.
We use Monte Carlo Tree Search (MCTS) to iteratively collect preference data, utilizing its look-ahead ability to break down instance-level rewards into more granular step-level signals.
The proposed algorithm employs Direct Preference Optimization (DPO) to update the LLM policy using this newly generated step-level preference data.
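For reference, the DPO objective this summary mentions has a simple closed form. A minimal sketch for a single (chosen, rejected) pair, assuming standard DPO with policy and reference log-probabilities (argument names are ours):

```python
import math

def dpo_loss(lp_chosen, lp_rejected, ref_lp_chosen, ref_lp_rejected, beta=0.1):
    """DPO loss for one preference pair: -log sigmoid of the scaled
    difference of policy-vs-reference log-ratios."""
    margin = beta * ((lp_chosen - ref_lp_chosen) - (lp_rejected - ref_lp_rejected))
    return -math.log(1.0 / (1.0 + math.exp(-margin)))
```

When the policy matches the reference the margin is zero and the loss is log 2; step-level preference data simply supplies the (chosen, rejected) pairs at finer granularity.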
arXiv Detail & Related papers (2024-05-01T11:10:24Z)
- Regression-aware Inference with LLMs [52.764328080398805]
We show that an inference strategy can be sub-optimal for common regression and scoring evaluation metrics.
We propose alternate inference strategies that estimate the Bayes-optimal solution for regression and scoring metrics in closed-form from sampled responses.
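The closed-form idea is the textbook one: for a given loss, the Bayes-optimal point prediction is a known summary statistic of the sampled responses. A hedged sketch (this is the standard plug-in estimator, not necessarily the paper's exact procedure):

```python
import numpy as np

def bayes_optimal_estimate(sampled_scores, metric="mse"):
    """Plug-in estimate of the Bayes-optimal prediction from sampled
    responses: the mean minimizes squared error, the median minimizes
    absolute error."""
    s = np.asarray(sampled_scores, dtype=float)
    if metric == "mse":
        return float(s.mean())
    if metric == "mae":
        return float(np.median(s))
    raise ValueError(f"unknown metric: {metric}")
```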
arXiv Detail & Related papers (2024-03-07T03:24:34Z)
- Fusion of Gaussian Processes Predictions with Monte Carlo Sampling [61.31380086717422]
In science and engineering, we often work with models designed for accurate prediction of variables of interest.
Recognizing that these models are approximations of reality, it becomes desirable to apply multiple models to the same data and integrate their outcomes.
arXiv Detail & Related papers (2024-03-03T04:21:21Z)
- Nearest Neighbour Score Estimators for Diffusion Generative Models [16.189734871742743]
We introduce a novel nearest neighbour score function estimator which utilizes multiple samples from the training set to dramatically decrease estimator variance.
In diffusion models, we show that our estimator can replace a learned network for probability-flow ODE integration, opening promising new avenues of future research.
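The quantity such estimators target can be written down exactly for the Gaussian-smoothed empirical distribution: the score is a responsibility-weighted average of directions toward the training points. A small sketch of that exact score (a nearest-neighbour estimator would restrict the sum to a few close neighbours; function name is ours):

```python
import numpy as np

def empirical_score(x, train, sigma):
    """Score (gradient of log density) of the Gaussian-smoothed
    empirical distribution p(x) = (1/n) sum_i N(x; y_i, sigma^2 I),
    computed from all training points y_i."""
    diffs = train - x                                  # (n, d)
    logits = -np.sum(diffs ** 2, axis=1) / (2 * sigma ** 2)
    w = np.exp(logits - logits.max())
    w /= w.sum()                                       # softmax responsibilities
    return (w[:, None] * diffs).sum(axis=0) / sigma ** 2
```

With a single training point this reduces to (y - x) / sigma^2, the score of one Gaussian.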
arXiv Detail & Related papers (2024-02-12T19:27:30Z)
- Probabilistic Unrolling: Scalable, Inverse-Free Maximum Likelihood Estimation for Latent Gaussian Models [69.22568644711113]
We introduce probabilistic unrolling, a method that combines Monte Carlo sampling with iterative linear solvers to circumvent matrix inversions.
Our theoretical analyses reveal that unrolling and backpropagation through the iterations of the solver can accelerate gradient estimation for maximum likelihood estimation.
In experiments on simulated and real data, we demonstrate that probabilistic unrolling learns latent Gaussian models up to an order of magnitude faster than gradient EM, with minimal losses in model performance.
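The "iterative linear solvers to circumvent matrix inversions" ingredient is typically a Krylov method such as conjugate gradient, which solves A x = b for symmetric positive-definite A without ever forming A^{-1}. A generic sketch (not the paper's code):

```python
import numpy as np

def conjugate_gradient(A, b, n_iters=100, tol=1e-10):
    """Solve A x = b for symmetric positive-definite A by the
    conjugate gradient iteration, using only matrix-vector products."""
    x = np.zeros_like(b)
    r = b - A @ x          # residual
    p = r.copy()           # search direction
    rs = r @ r
    for _ in range(n_iters):
        Ap = A @ p
        alpha = rs / (p @ Ap)
        x += alpha * p
        r -= alpha * Ap
        rs_new = r @ r
        if np.sqrt(rs_new) < tol:
            break
        p = r + (rs_new / rs) * p
        rs = rs_new
    return x
```

Unrolling then means backpropagating through a fixed number of these iterations instead of through an explicit inverse.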
arXiv Detail & Related papers (2023-06-05T21:08:34Z)
- Approximation of group explainers with coalition structure using Monte Carlo sampling on the product space of coalitions and features [0.11184789007828977]
We focus on a wide class of linear game values, as well as coalitional values, for the marginal game based on a given ML model and predictor vector.
We design a novel Monte Carlo sampling algorithm that estimates them at a reduced complexity that depends linearly on the size of the background dataset.
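The best-known member of this class of game values is the Shapley value, whose classic Monte Carlo estimator averages marginal contributions over random permutations. A generic sketch of that estimator (the paper's algorithm is more general and achieves better complexity; function name is ours):

```python
import numpy as np

def shapley_mc(value_fn, n_features, n_perms, rng):
    """Monte Carlo permutation estimator of Shapley values for a
    set function value_fn(S) over subsets S of features."""
    phi = np.zeros(n_features)
    for _ in range(n_perms):
        perm = rng.permutation(n_features)
        members = []
        v_prev = value_fn(frozenset())
        for j in perm:               # add features one at a time
            members.append(j)
            v_new = value_fn(frozenset(members))
            phi[j] += v_new - v_prev # marginal contribution of j
            v_prev = v_new
    return phi / n_perms
```

For an additive game the marginal contribution of each feature is constant, so the estimator recovers the exact Shapley values with any number of permutations.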
arXiv Detail & Related papers (2023-03-17T19:17:06Z)
- Efficient Propagation of Uncertainty via Reordering Monte Carlo Samples [0.7087237546722617]
Uncertainty propagation is a technique to determine model output uncertainties based on the uncertainty in its input variables.
In this work, we investigate the hypothesis that while all samples are useful on average, some samples must be more useful than others.
We introduce a methodology to adaptively reorder MC samples and show how it results in reduction of computational expense of UP processes.
arXiv Detail & Related papers (2023-02-09T21:28:15Z)
- Principled Reinforcement Learning with Human Feedback from Pairwise or $K$-wise Comparisons [79.98542868281473]
We provide a theoretical framework for Reinforcement Learning with Human Feedback (RLHF)
We show that when training a policy based on the learned reward model, MLE fails while a pessimistic MLE provides policies with improved performance under certain coverage assumptions.
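For pairwise comparisons the reward model in question is typically Bradley-Terry, P(i beats j) = sigmoid(r_i - r_j), fit by maximum likelihood. A minimal MLE sketch via gradient ascent (illustrative only; the paper's contribution is the analysis and the pessimistic variant, not this fitting code):

```python
import numpy as np

def bradley_terry_mle(comparisons, n_items, lr=0.5, n_steps=500):
    """MLE reward scores for the Bradley-Terry model from a list of
    (winner, loser) pairs, by gradient ascent on the log-likelihood."""
    r = np.zeros(n_items)
    for _ in range(n_steps):
        grad = np.zeros(n_items)
        for i, j in comparisons:
            p = 1.0 / (1.0 + np.exp(-(r[i] - r[j])))  # P(i beats j)
            grad[i] += 1.0 - p
            grad[j] -= 1.0 - p
        r += lr * grad / len(comparisons)
        r -= r.mean()          # scores are identifiable only up to a shift
    return r
```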
arXiv Detail & Related papers (2023-01-26T18:07:21Z)
- Leverage Score Sampling for Complete Mode Coverage in Generative Adversarial Networks [11.595070613477548]
A generative model may overlook underrepresented modes that are less frequent in the empirical data distribution.
We propose a sampling procedure based on ridge leverage scores which significantly improves mode coverage when compared to standard methods.
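Ridge leverage scores measure how hard each point is to reconstruct from the rest, so sampling proportionally to them upweights rare modes. A small-scale sketch of the exact scores for a PSD kernel matrix (large-scale methods approximate them; function names are ours):

```python
import numpy as np

def ridge_leverage_scores(K, lam):
    """Exact ridge leverage scores l_i = [K (K + lam*I)^{-1}]_{ii}
    of a PSD kernel matrix K; each score lies in [0, 1)."""
    n = K.shape[0]
    return np.diag(K @ np.linalg.solve(K + lam * np.eye(n), np.eye(n)))

def leverage_sample(K, lam, n_samples, rng):
    """Draw indices with probability proportional to leverage score."""
    scores = ridge_leverage_scores(K, lam)
    return rng.choice(K.shape[0], size=n_samples, p=scores / scores.sum())
```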
arXiv Detail & Related papers (2021-04-06T09:00:38Z)
- Efficient Debiased Evidence Estimation by Multilevel Monte Carlo Sampling [0.0]
We propose a new optimization algorithm for Bayesian inference based multilevel Monte Carlo (MLMC) methods.
Our numerical results confirm considerable computational savings compared to the conventional estimators.
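The MLMC idea behind such savings is a telescoping sum: estimate the coarsest level with many cheap samples and each correction E[P_l - P_{l-1}] with progressively fewer, since the corrections have shrinking variance. A generic sketch (not the paper's algorithm; the example approximation is ours):

```python
import math
import numpy as np

def mlmc(P, L, n_per_level, rng):
    """Multilevel Monte Carlo estimate of E[P_L(X)] via the telescoping
    sum E[P_0] + sum_{l=1..L} E[P_l - P_{l-1}].  P(l, x) evaluates the
    level-l approximation; each correction couples P_l and P_{l-1} on
    the SAME samples so the difference has low variance."""
    est = 0.0
    for l in range(L + 1):
        x = rng.standard_normal(n_per_level[l])
        if l == 0:
            est += P(0, x).mean()
        else:
            est += (P(l, x) - P(l - 1, x)).mean()
    return est

# Example target: E[exp(Z)], Z ~ N(0,1); the level-l approximation is a
# truncated Taylor series of exp with more terms at finer levels.
def taylor_exp(l, x):
    n_terms = 2 ** l + 1
    return sum(x ** k / math.factorial(k) for k in range(n_terms + 1))
```

The sample counts n_per_level typically decrease geometrically with the level, matching the decay of the correction variances.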
arXiv Detail & Related papers (2020-01-14T09:14:24Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information it presents and is not responsible for any consequences of its use.