Minimax Hypothesis Testing for the Bradley-Terry-Luce Model
- URL: http://arxiv.org/abs/2410.08360v1
- Date: Thu, 10 Oct 2024 20:28:05 GMT
- Title: Minimax Hypothesis Testing for the Bradley-Terry-Luce Model
- Authors: Anuran Makur, Japneet Singh
- Abstract summary: The Bradley-Terry-Luce (BTL) model is one of the most widely used models for ranking a collection of items or agents.
We propose a hypothesis test that determines whether a given pairwise comparison dataset, with $k$ comparisons per pair of agents, originates from an underlying BTL model.
- Score: 6.5990719141691825
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The Bradley-Terry-Luce (BTL) model is one of the most widely used models for ranking a collection of items or agents based on pairwise comparisons among them. Given $n$ agents, the BTL model endows each agent $i$ with a latent skill score $\alpha_i > 0$ and posits that the probability that agent $i$ is preferred over agent $j$ is $\alpha_i/(\alpha_i + \alpha_j)$. In this work, our objective is to formulate a hypothesis test that determines whether a given pairwise comparison dataset, with $k$ comparisons per pair of agents, originates from an underlying BTL model. We formalize this testing problem in the minimax sense and define the critical threshold of the problem. We then establish upper bounds on the critical threshold for general induced observation graphs (satisfying mild assumptions) and develop lower bounds for complete induced graphs. Our bounds demonstrate that for complete induced graphs, the critical threshold scales as $\Theta((nk)^{-1/2})$ in a minimax sense. In particular, our test statistic for the upper bounds is based on a new approximation we derive for the separation distance between general pairwise comparison models and the class of BTL models. To further assess the performance of our statistical test, we prove upper bounds on the type I and type II probabilities of error. Much of our analysis is conducted within the context of a fixed observation graph structure, where the graph possesses certain ``nice'' properties, such as expansion and bounded principal ratio. Additionally, we derive several auxiliary results, such as bounds on principal ratios of graphs, $\ell^2$-bounds on BTL parameter estimation under model mismatch, stability of rankings under the BTL model, etc. We validate our theoretical results through experiments on synthetic and real-world datasets and propose a data-driven permutation testing approach to determine test thresholds.
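To make the setup concrete, here is a minimal sketch (not the authors' implementation) that simulates $k$ comparisons per pair from a BTL model, fits the skill scores with the standard minorization-maximization (MM) algorithm, and calibrates a goodness-of-fit threshold in a data-driven way. The max-deviation test statistic and the parametric-bootstrap calibration are simplified stand-ins for the paper's separation-distance statistic and its permutation scheme; all function names are illustrative.

```python
# Minimal sketch, not the authors' implementation: simulate k comparisons
# per pair from a BTL model, fit the skills by MM iterations (Hunter, 2004),
# and calibrate a goodness-of-fit threshold from data. The statistic and the
# bootstrap calibration are simplified stand-ins for the paper's
# separation-distance statistic and permutation scheme.
import numpy as np

def simulate_btl(alpha, k, rng):
    """wins[i, j] = number of times agent i beats agent j in k trials."""
    n = len(alpha)
    wins = np.zeros((n, n), dtype=int)
    for i in range(n):
        for j in range(i + 1, n):
            p_ij = alpha[i] / (alpha[i] + alpha[j])  # BTL win probability
            wins[i, j] = rng.binomial(k, p_ij)
            wins[j, i] = k - wins[i, j]
    return wins

def fit_btl(wins, k, iters=200):
    """Estimate BTL skills via minorization-maximization iterations."""
    n = wins.shape[0]
    alpha = np.ones(n)
    for _ in range(iters):
        for i in range(n):
            denom = sum(k / (alpha[i] + alpha[j]) for j in range(n) if j != i)
            alpha[i] = wins[i].sum() / denom  # W_i / sum_j k/(a_i + a_j)
        alpha /= alpha.sum()  # fix the scale; BTL is scale-invariant
    return alpha

def statistic(wins, k, alpha):
    """Max deviation between empirical and fitted BTL win rates."""
    p_hat = wins / k
    p_fit = alpha[:, None] / (alpha[:, None] + alpha[None, :])
    off_diag = ~np.eye(len(alpha), dtype=bool)
    return np.abs(p_hat - p_fit)[off_diag].max()

def btl_gof_test(wins, k, level=0.05, n_boot=200, seed=1):
    """Reject 'data follows a BTL model' if the observed statistic exceeds
    the (1 - level) quantile of its resampled null distribution."""
    rng = np.random.default_rng(seed)
    alpha_hat = fit_btl(wins, k)
    t_obs = statistic(wins, k, alpha_hat)
    t_null = []
    for _ in range(n_boot):
        w = simulate_btl(alpha_hat, k, rng)
        t_null.append(statistic(w, k, fit_btl(w, k)))
    threshold = np.quantile(t_null, 1 - level)
    return t_obs > threshold, t_obs, threshold

rng = np.random.default_rng(0)
alpha_true = rng.uniform(0.5, 2.0, size=8)      # latent skills alpha_i > 0
wins = simulate_btl(alpha_true, k=50, rng=rng)  # null hypothesis true here
reject, t_obs, thr = btl_gof_test(wins, k=50)
print(f"reject BTL? {reject}  stat={t_obs:.3f}  threshold={thr:.3f}")
```

Under the null the rejection rate should stay near the nominal level; feeding in non-BTL data (e.g., strongly intransitive preferences) should push the statistic past the calibrated threshold as $k$ grows.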
Related papers
- $O(d/T)$ Convergence Theory for Diffusion Probabilistic Models under Minimal Assumptions [6.76974373198208]
We establish a fast convergence theory for a popular SDE-based sampler under minimal assumptions.
Our analysis shows that, provided $\ell_2$-accurate estimates of the score functions, the total variation distance between the target and generated distributions is upper bounded by $O(d/T)$.
This is achieved through a novel set of analytical tools that provides a fine-grained characterization of how the error propagates at each step of the reverse process.
arXiv Detail & Related papers (2024-09-27T17:59:10Z) - Computational-Statistical Gaps in Gaussian Single-Index Models [77.1473134227844]
Single-Index Models are high-dimensional regression problems with planted structure.
We show that computationally efficient algorithms, both within the Statistical Query (SQ) and the Low-Degree Polynomial (LDP) framework, necessarily require $\Omega(d^{k^\star/2})$ samples.
arXiv Detail & Related papers (2024-03-08T18:50:19Z) - Collaborative non-parametric two-sample testing [55.98760097296213]
The goal is to identify nodes where the null hypothesis $p_v = q_v$ should be rejected.
We propose the non-parametric collaborative two-sample testing (CTST) framework that efficiently leverages the graph structure.
Our methodology integrates elements from f-divergence estimation, Kernel Methods, and Multitask Learning.
arXiv Detail & Related papers (2024-02-08T14:43:56Z) - Degree Heterogeneity in Higher-Order Networks: Inference in the Hypergraph $\boldsymbol{\beta}$-Model [4.540236408836132]
We study the hypergraph $\boldsymbol{\beta}$-model with multiple layers, which allows for hyperedges of different sizes across the layers.
We derive the rates of convergence of maximum likelihood (ML) estimates and establish their minimax rate optimality.
We also consider the goodness-of-fit problem in the hypergraph $\boldsymbol{\beta}$-model.
arXiv Detail & Related papers (2023-07-06T07:23:06Z) - Towards Faster Non-Asymptotic Convergence for Diffusion-Based Generative Models [49.81937966106691]
We develop a suite of non-asymptotic theory towards understanding the data generation process of diffusion models.
In contrast to prior works, our theory is developed based on an elementary yet versatile non-asymptotic approach.
arXiv Detail & Related papers (2023-06-15T16:30:08Z) - Uncertainty Quantification of MLE for Entity Ranking with Covariates [3.2839905453386162]
This paper concerns statistical estimation and inference for ranking problems based on pairwise comparisons.
We propose a novel model, the Covariate-Assisted Ranking Estimation (CARE) model, that extends the well-known Bradley-Terry-Luce (BTL) model.
We derive the maximum likelihood estimator of $\{\alpha_i^*\}_{i=1}^n$ and $\beta^*$ under a sparse comparison graph.
We validate our theoretical results through large-scale numerical studies and an application to the mutual fund stock holding dataset.
arXiv Detail & Related papers (2022-12-20T02:28:27Z) - The Performance of the MLE in the Bradley-Terry-Luce Model in $\ell_{\infty}$-Loss and under General Graph Topologies [76.61051540383494]
We derive novel, general upper bounds on the $\ell_{\infty}$ estimation error of the Bradley-Terry-Luce model.
We demonstrate that the derived bounds perform well and in some cases are sharper compared to known results.
arXiv Detail & Related papers (2021-10-20T23:46:35Z) - Good Classifiers are Abundant in the Interpolating Regime [64.72044662855612]
We develop a methodology to compute precisely the full distribution of test errors among interpolating classifiers.
We find that test errors tend to concentrate around a small typical value $\varepsilon^*$, which deviates substantially from the test error of the worst-case interpolating model.
Our results show that the usual style of analysis in statistical learning theory may not be fine-grained enough to capture the good generalization performance observed in practice.
arXiv Detail & Related papers (2020-06-22T21:12:31Z) - On the Discrepancy between Density Estimation and Sequence Generation [92.70116082182076]
Log-likelihood is highly correlated with BLEU when we consider models within the same family.
We observe no correlation between rankings of models across different families.
arXiv Detail & Related papers (2020-02-17T20:13:35Z) - A Precise High-Dimensional Asymptotic Theory for Boosting and Minimum-$\ell_1$-Norm Interpolated Classifiers [3.167685495996986]
This paper establishes a precise high-dimensional theory for boosting on separable data.
Under a class of statistical models, we provide an exact analysis of the generalization error of boosting.
We also explicitly pin down the relation between the boosting test error and the optimal Bayes error.
arXiv Detail & Related papers (2020-02-05T00:24:53Z)