Ranking Inferences Based on the Top Choice of Multiway Comparisons
- URL: http://arxiv.org/abs/2211.11957v1
- Date: Tue, 22 Nov 2022 02:34:52 GMT
- Title: Ranking Inferences Based on the Top Choice of Multiway Comparisons
- Authors: Jianqing Fan, Zhipeng Lou, Weichen Wang, Mengxin Yu
- Abstract summary: This paper considers ranking inference of $n$ items based on the observed data on the top choice among $M$ randomly selected items at each trial.
It is a useful modification of the Plackett-Luce model for $M$-way ranking with only the top choice observed and is an extension of the celebrated Bradley-Terry-Luce model that corresponds to $M=2$.
- Score: 2.468314282946207
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper considers ranking inference of $n$ items based on the observed
data on the top choice among $M$ randomly selected items at each trial. This is
a useful modification of the Plackett-Luce model for $M$-way ranking with only
the top choice observed and is an extension of the celebrated
Bradley-Terry-Luce model that corresponds to $M=2$. Under a uniform sampling
scheme in which any $M$ distinguished items are selected for comparisons with
probability $p$ and the selected $M$ items are compared $L$ times with
multinomial outcomes, we establish the statistical rates of convergence for
underlying $n$ preference scores using both $\ell_2$-norm and
$\ell_\infty$-norm, with the minimum sampling complexity. In addition, we
establish the asymptotic normality of the maximum likelihood estimator that
allows us to construct confidence intervals for the underlying scores.
Furthermore, we propose a novel inference framework for ranking items through a
sophisticated maximum pairwise difference statistic whose distribution is
estimated via a valid Gaussian multiplier bootstrap. The estimated distribution
is then used to construct simultaneous confidence intervals for the differences
in the preference scores and the ranks of individual items. They also enable us
to address various inference questions on the ranks of these items. Extensive
simulation studies lend further support to our theoretical results. A real data
application illustrates the usefulness of the proposed methods convincingly.
Related papers
- Optimal level set estimation for non-parametric tournament and crowdsourcing problems [49.75262185577198]
Motivated by crowdsourcing, we consider a problem where we partially observe the correctness of the answers of $n$ experts on $d$ questions.
In this paper, we assume that the matrix $M$ containing the probability that expert $i$ answers correctly to question $j$ is bi-isotonic up to a permutation of it rows and columns.
We construct an efficient-time algorithm that turns out to be minimax optimal for this classification problem.
arXiv Detail & Related papers (2024-08-27T18:28:31Z) - Random pairing MLE for estimation of item parameters in Rasch model [22.32547146723177]
Rasch model is widely used in psychometrics to model the relationship between individuals' latent traits and their binary responses.
We introduce a new likelihood-based estimator that faithfully estimates the item parameters in the Rasch model.
We provide empirical evidence of the efficacy of the two new estimators using both simulated and real data.
arXiv Detail & Related papers (2024-06-20T04:32:34Z) - Variance Alignment Score: A Simple But Tough-to-Beat Data Selection
Method for Multimodal Contrastive Learning [17.40655778450583]
We propose a principled metric named Variance Alignment Score (VAS), which has the form $langle Sigma_texttest, Sigma_irangle$.
We show that applying VAS and CLIP scores together can outperform baselines by a margin of $1.3%$ on 38 evaluation sets for noisy dataset DataComp and $2.5%$ on VTAB for high-quality dataset CC12M.
arXiv Detail & Related papers (2024-02-03T06:29:04Z) - Replicable Clustering [57.19013971737493]
We propose algorithms for the statistical $k$-medians, statistical $k$-means, and statistical $k$-centers problems by utilizing approximation routines for their counterparts in a black-box manner.
We also provide experiments on synthetic distributions in 2D using the $k$-means++ implementation from sklearn as a black-box that validate our theoretical results.
arXiv Detail & Related papers (2023-02-20T23:29:43Z) - Generalized Differentiable RANSAC [95.95627475224231]
$nabla$-RANSAC is a differentiable RANSAC that allows learning the entire randomized robust estimation pipeline.
$nabla$-RANSAC is superior to the state-of-the-art in terms of accuracy while running at a similar speed to its less accurate alternatives.
arXiv Detail & Related papers (2022-12-26T15:13:13Z) - Uncertainty Quantification of MLE for Entity Ranking with Covariates [3.2839905453386162]
This paper concerns with statistical estimation and inference for the ranking problems based on pairwise comparisons.
We propose a novel model, Co-Assisted Ranking Estimation (CARE) model, that extends the well-known Bradley-Terry-Luce (BTL) model.
We derive the maximum likelihood estimator of $alpha_i*_i=1n$ and $beta*$ under a sparse comparison graph.
We validate our theoretical results through large-scale numerical studies and an application to the mutual fund stock holding dataset.
arXiv Detail & Related papers (2022-12-20T02:28:27Z) - On Top-$k$ Selection from $m$-wise Partial Rankings via Borda Counting [23.290889341552898]
We analyze the performance of the Borda counting algorithm in a non-parametric model.
We show that if $Delta_k$ is greater than certain value, then the top-$k$ items selected by the algorithm is almost surely accurate.
arXiv Detail & Related papers (2022-04-11T16:31:37Z) - Sharp Statistical Guarantees for Adversarially Robust Gaussian
Classification [54.22421582955454]
We provide the first result of the optimal minimax guarantees for the excess risk for adversarially robust classification.
Results are stated in terms of the Adversarial Signal-to-Noise Ratio (AdvSNR), which generalizes a similar notion for standard linear classification to the adversarial setting.
arXiv Detail & Related papers (2020-06-29T21:06:52Z) - Learning-to-Rank with Partitioned Preference: Fast Estimation for the
Plackett-Luce Model [24.923231199480433]
Given $N$ items with $M$ partitions, calculating the likelihood of data with partitioned preference under the PL model has a time complexity of $O(N+S!)$.
We propose an efficient numerical integration approach for calculating the likelihood and its gradients with a time complexity $O(N+S3)$.
arXiv Detail & Related papers (2020-06-09T06:11:21Z) - Ranking a set of objects: a graph based least-square approach [70.7866286425868]
We consider the problem of ranking $N$ objects starting from a set of noisy pairwise comparisons provided by a crowd of equal workers.
We propose a class of non-adaptive ranking algorithms that rely on a least-squares intrinsic optimization criterion for the estimation of qualities.
arXiv Detail & Related papers (2020-02-26T16:19:09Z) - SetRank: A Setwise Bayesian Approach for Collaborative Ranking from
Implicit Feedback [50.13745601531148]
We propose a novel setwise Bayesian approach for collaborative ranking, namely SetRank, to accommodate the characteristics of implicit feedback in recommender system.
Specifically, SetRank aims at maximizing the posterior probability of novel setwise preference comparisons.
We also present the theoretical analysis of SetRank to show that the bound of excess risk can be proportional to $sqrtM/N$.
arXiv Detail & Related papers (2020-02-23T06:40:48Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.