Spectral Ranking Inferences based on General Multiway Comparisons
- URL: http://arxiv.org/abs/2308.02918v3
- Date: Fri, 1 Mar 2024 17:24:23 GMT
- Title: Spectral Ranking Inferences based on General Multiway Comparisons
- Authors: Jianqing Fan, Zhipeng Lou, Weichen Wang, Mengxin Yu
- Abstract summary: We show that a two-step spectral method can achieve the same vanilla efficiency as the Maximum Likelihood Estor.
It is noteworthy that this is the first time effective two-sample rank testing methods have been proposed.
- Score: 7.222667862159246
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper studies the performance of the spectral method in the estimation
and uncertainty quantification of the unobserved preference scores of compared
entities in a general and more realistic setup. Specifically, the comparison
graph consists of hyper-edges of possible heterogeneous sizes, and the number
of comparisons can be as low as one for a given hyper-edge. Such a setting is
pervasive in real applications, circumventing the need to specify the graph
randomness and the restrictive homogeneous sampling assumption imposed in the
commonly used Bradley-Terry-Luce (BTL) or Plackett-Luce (PL) models.
Furthermore, in scenarios where the BTL or PL models are appropriate, we
unravel the relationship between the spectral estimator and the Maximum
Likelihood Estimator (MLE). We discover that a two-step spectral method, where
we apply the optimal weighting estimated from the equal weighting vanilla
spectral method, can achieve the same asymptotic efficiency as the MLE. Given
the asymptotic distributions of the estimated preference scores, we also
introduce a comprehensive framework to carry out both one-sample and two-sample
ranking inferences, applicable to both fixed and random graph settings. It is
noteworthy that this is the first time effective two-sample rank testing
methods have been proposed. Finally, we substantiate our findings via
comprehensive numerical simulations and subsequently apply our developed
methodologies to perform statistical inferences for statistical journals and
movie rankings.
Related papers
- Theory on Score-Mismatched Diffusion Models and Zero-Shot Conditional Samplers [49.97755400231656]
We present the first performance guarantee with explicit dimensional general score-mismatched diffusion samplers.
We show that score mismatches result in an distributional bias between the target and sampling distributions, proportional to the accumulated mismatch between the target and training distributions.
This result can be directly applied to zero-shot conditional samplers for any conditional model, irrespective of measurement noise.
arXiv Detail & Related papers (2024-10-17T16:42:12Z) - A Kernel-Based Conditional Two-Sample Test Using Nearest Neighbors (with Applications to Calibration, Regression Curves, and Simulation-Based Inference) [3.622435665395788]
We introduce a kernel-based measure for detecting differences between two conditional distributions.
When the two conditional distributions are the same, the estimate has a Gaussian limit and its variance has a simple form that can be easily estimated from the data.
We also provide a resampling based test using our estimate that applies to the conditional goodness-of-fit problem.
arXiv Detail & Related papers (2024-07-23T15:04:38Z) - Distributed Markov Chain Monte Carlo Sampling based on the Alternating
Direction Method of Multipliers [143.6249073384419]
In this paper, we propose a distributed sampling scheme based on the alternating direction method of multipliers.
We provide both theoretical guarantees of our algorithm's convergence and experimental evidence of its superiority to the state-of-the-art.
In simulation, we deploy our algorithm on linear and logistic regression tasks and illustrate its fast convergence compared to existing gradient-based methods.
arXiv Detail & Related papers (2024-01-29T02:08:40Z) - Statistical inference for pairwise comparison models [5.487882744996216]
This paper establishes a near-optimal normality for the maximum likelihood in a broad class of pairwise comparison models.
The key idea lies in identifying the Fisher information matrix as a weighted graph Laplacian, which can be studied via a meticulous spectral analysis.
arXiv Detail & Related papers (2024-01-16T16:14:09Z) - On diffusion-based generative models and their error bounds: The log-concave case with full convergence estimates [5.13323375365494]
We provide theoretical guarantees for the convergence behaviour of diffusion-based generative models under strongly log-concave data.
Our class of functions used for score estimation is made of Lipschitz continuous functions avoiding any Lipschitzness assumption on the score function.
This approach yields the best known convergence rate for our sampling algorithm.
arXiv Detail & Related papers (2023-11-22T18:40:45Z) - Statistical Efficiency of Score Matching: The View from Isoperimetry [96.65637602827942]
We show a tight connection between statistical efficiency of score matching and the isoperimetric properties of the distribution being estimated.
We formalize these results both in the sample regime and in the finite regime.
arXiv Detail & Related papers (2022-10-03T06:09:01Z) - Optimal Scaling for Locally Balanced Proposals in Discrete Spaces [65.14092237705476]
We show that efficiency of Metropolis-Hastings (M-H) algorithms in discrete spaces can be characterized by an acceptance rate that is independent of the target distribution.
Knowledge of the optimal acceptance rate allows one to automatically tune the neighborhood size of a proposal distribution in a discrete space, directly analogous to step-size control in continuous spaces.
arXiv Detail & Related papers (2022-09-16T22:09:53Z) - Tuning Stochastic Gradient Algorithms for Statistical Inference via
Large-Sample Asymptotics [18.93569692490218]
tuning of gradient algorithms often based on trial-and-error rather than generalizable theory.
We show that averaging with a large fixed step size is robust to the choice of tuning parameters.
We lay the foundation for a systematic analysis of other gradient Monte Carlo algorithms.
arXiv Detail & Related papers (2022-07-25T17:58:09Z) - A Unified Framework for Multi-distribution Density Ratio Estimation [101.67420298343512]
Binary density ratio estimation (DRE) provides the foundation for many state-of-the-art machine learning algorithms.
We develop a general framework from the perspective of Bregman minimization divergence.
We show that our framework leads to methods that strictly generalize their counterparts in binary DRE.
arXiv Detail & Related papers (2021-12-07T01:23:20Z) - Uncertainty quantification in the Bradley-Terry-Luce model [14.994932962403935]
This paper focuses on two estimators that have received much recent attention: the maximum likelihood estimator (MLE) and the spectral estimator.
Using a unified proof strategy, we derive sharp and uniform non-asymptotic expansions for both estimators in the sparsest possible regime.
Our proof is based on a self-consistent equation of the second-order vector and a novel leave-two-out analysis.
arXiv Detail & Related papers (2021-10-08T03:06:30Z) - Towards Model-Agnostic Post-Hoc Adjustment for Balancing Ranking
Fairness and Algorithm Utility [54.179859639868646]
Bipartite ranking aims to learn a scoring function that ranks positive individuals higher than negative ones from labeled data.
There have been rising concerns on whether the learned scoring function can cause systematic disparity across different protected groups.
We propose a model post-processing framework for balancing them in the bipartite ranking scenario.
arXiv Detail & Related papers (2020-06-15T10:08:39Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.