Who is the Winning Algorithm? Rank Aggregation for Comparative Studies
- URL: http://arxiv.org/abs/2601.01664v1
- Date: Sun, 04 Jan 2026 20:52:47 GMT
- Title: Who is the Winning Algorithm? Rank Aggregation for Comparative Studies
- Authors: Amichai Painsky
- Abstract summary: We introduce a novel conceptual framework for estimating the win probability for each of the m algorithms. Our proposed framework significantly improves upon currently known methods in synthetic and real-world examples.
- Score: 4.501091570166657
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Consider a collection of m competing machine learning algorithms. Given their performance on a benchmark of datasets, we would like to identify the best performing algorithm. Specifically, which algorithm is most likely to ``win'' (rank highest) on a future, unseen dataset. The standard maximum likelihood approach suggests counting the number of wins per each algorithm. In this work, we argue that there is much more information in the complete rankings. That is, the number of times that each algorithm finished second, third and so forth. Yet, it is not entirely clear how to effectively utilize this information for our purpose. In this work we introduce a novel conceptual framework for estimating the win probability for each of the m algorithms, given their complete rankings over a benchmark of datasets. Our proposed framework significantly improves upon currently known methods in synthetic and real-world examples.
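The contrast the abstract draws, counting wins versus using complete rankings, can be illustrated with a standard full-ranking model. The sketch below compares the wins-only maximum likelihood estimate with an estimate obtained by fitting a Plackett-Luce model to the complete rankings. Note that Plackett-Luce is a common textbook choice, not necessarily the paper's actual framework, and all strengths and data here are synthetic illustrations:

```python
import numpy as np

rng = np.random.default_rng(0)
m, n_datasets = 4, 30
true_strengths = np.array([1.5, 1.2, 1.0, 0.8])  # hypothetical algorithm strengths

def sample_ranking(strengths, rng):
    """Sample one full ranking (winner first) from a Plackett-Luce model."""
    remaining = list(range(len(strengths)))
    ranking = []
    while remaining:
        w = np.array([strengths[i] for i in remaining])
        pick = rng.choice(len(remaining), p=w / w.sum())
        ranking.append(remaining.pop(pick))
    return ranking

rankings = [sample_ranking(true_strengths, rng) for _ in range(n_datasets)]

# Baseline MLE on wins only: the fraction of datasets each algorithm ranked first.
wins = np.bincount([r[0] for r in rankings], minlength=m)
p_win_counts = wins / n_datasets

def fit_plackett_luce(rankings, m, iters=500):
    """MM updates for Plackett-Luce strengths (Hunter-style), using full rankings."""
    w = np.ones(m)
    # Each algorithm is "chosen" once at every position it occupies except the last.
    chosen = np.zeros(m)
    for r in rankings:
        for i in r[:-1]:
            chosen[i] += 1
    for _ in range(iters):
        denom = np.zeros(m)
        for r in rankings:
            rest = list(r)
            for _stage in range(len(r) - 1):
                denom[rest] += 1.0 / w[rest].sum()
                rest.pop(0)  # remove the item chosen at this stage
        w = chosen / denom
        w /= w.sum()
    return w

# Under Plackett-Luce, the win probability of algorithm i is w_i / sum(w),
# so the normalized strengths are themselves the estimated win probabilities.
p_win_full = fit_plackett_luce(rankings, m)
```

The point of the comparison: `p_win_counts` discards everything below first place, while `p_win_full` pools evidence from every position, which matters most when the benchmark is small.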
Related papers
- Near Optimal Inference for the Best-Performing Algorithm [6.5268245109828005]
We introduce a novel framework for the subset selection problem. We provide both high confidence and finite-sample schemes that significantly improve upon currently known methods.
arXiv Detail & Related papers (2025-08-07T09:08:06Z) - Finding a Fair Scoring Function for Top-$k$ Selection: From Hardness to Practice [0.0]
We address the problem of identifying a fair linear scoring function for top-$k$ selection. Existing algorithms do not scale efficiently, particularly in higher dimensions. We turn this potential of efficiency into an optimized algorithm delivering substantial practical performance gains.
arXiv Detail & Related papers (2025-03-14T16:40:36Z) - A Novel Ranking Scheme for the Performance Analysis of Stochastic Optimization Algorithms using the Principles of Severity [9.310464457958844]
We provide a novel ranking scheme to rank the algorithms over multiple single-objective optimization problems.
The results of the algorithms are compared using a robust bootstrapping-based hypothesis testing procedure.
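As a rough illustration of bootstrap-based comparison of stochastic optimizers (this is a generic percentile-bootstrap test, not the paper's severity-based procedure; the run results below are invented for the example):

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical final objective values from repeated runs of two stochastic
# optimizers (lower is better); purely illustrative numbers.
alg_a = rng.normal(10.0, 1.0, size=50)
alg_b = rng.normal(11.5, 1.0, size=50)

def bootstrap_mean_diff_ci(x, y, n_boot=5000, alpha=0.05, rng=None):
    """Percentile-bootstrap confidence interval for mean(x) - mean(y)."""
    rng = rng or np.random.default_rng()
    diffs = np.empty(n_boot)
    for b in range(n_boot):
        xs = rng.choice(x, size=len(x), replace=True)
        ys = rng.choice(y, size=len(y), replace=True)
        diffs[b] = xs.mean() - ys.mean()
    return np.quantile(diffs, [alpha / 2, 1 - alpha / 2])

lo, hi = bootstrap_mean_diff_ci(alg_a, alg_b, rng=rng)
# An interval excluding 0 suggests a real performance difference.
significant = not (lo <= 0.0 <= hi)
```

Repeating such a test per problem, and then ranking by the number of significant wins, is one simple way to aggregate pairwise comparisons across a benchmark.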
arXiv Detail & Related papers (2024-05-31T19:35:34Z) - Comparing Personalized Relevance Algorithms for Directed Graphs [0.34952465649465553]
We present an interactive Web platform that, given a directed graph, allows identifying the most relevant nodes related to a given query node.
We provide 50 pre-loaded datasets from Wikipedia, Twitter, and Amazon and seven algorithms.
Our tool helps to uncover hidden relationships within the data, which makes it a valuable addition to the repertoire of graph analysis algorithms.
arXiv Detail & Related papers (2024-05-03T17:24:08Z) - On Universally Optimal Algorithms for A/B Testing [49.429419538826444]
We study the problem of best-arm identification with fixed budget in multi-armed bandits with Bernoulli rewards.
For the problem with two arms, also known as the A/B testing problem, we prove that no algorithm uniformly outperforms the one that samples each arm equally.
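The equal-sampling baseline referred to above is simple to state: split the budget evenly, then recommend the empirically better arm. A minimal sketch with hypothetical Bernoulli rates:

```python
import numpy as np

rng = np.random.default_rng(2)

# Fixed-budget A/B test with two Bernoulli arms: pull each arm the same
# number of times, then recommend the arm with the higher empirical mean.
p = [0.5, 0.6]   # hypothetical true conversion rates (unknown to the tester)
budget = 2000    # total number of pulls

pulls_per_arm = budget // 2
means = [rng.binomial(pulls_per_arm, pi) / pulls_per_arm for pi in p]
best_arm = int(np.argmax(means))
```

The cited result says this uniform allocation cannot be uniformly beaten in the fixed-budget setting, which is why it remains the default for A/B tests.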
arXiv Detail & Related papers (2023-08-23T08:38:53Z) - A Gold Standard Dataset for the Reviewer Assignment Problem [70.45113777449373]
"Similarity score" is a numerical estimate of the expertise of a reviewer in reviewing a paper. A key challenge in comparing existing algorithms and developing better algorithms is the lack of publicly available gold-standard data. We collect a novel dataset of similarity scores that we release to the research community.
arXiv Detail & Related papers (2023-03-23T16:15:03Z) - HARRIS: Hybrid Ranking and Regression Forests for Algorithm Selection [75.84584400866254]
We propose a new algorithm selector leveraging hybrid ranking and regression forests, combining the strengths of both approaches while alleviating their weaknesses.
HARRIS' decisions are based on a forest model, whose trees are built by optimizing a hybrid ranking and regression loss function.
arXiv Detail & Related papers (2022-10-31T14:06:11Z) - A Framework and Benchmarking Study for Counterfactual Generating Methods on Tabular Data [0.0]
Counterfactual explanations are viewed as an effective way to explain machine learning predictions.
There are already dozens of algorithms aiming to generate such explanations.
Our benchmarking study and framework can help practitioners determine which technique and building blocks best suit their context.
arXiv Detail & Related papers (2021-07-09T21:06:03Z) - Estimating leverage scores via rank revealing methods and randomization [50.591267188664666]
We study algorithms for estimating the statistical leverage scores of rectangular dense or sparse matrices of arbitrary rank.
Our approach is based on combining rank revealing methods with compositions of dense and sparse randomized dimensionality reduction transforms.
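For context, the exact leverage scores of a tall full-rank matrix are the squared row norms of the Q factor in a thin QR factorization; randomized estimators like those in the paper approximate this quantity more cheaply for large or sparse inputs. A minimal exact-computation sketch (the matrix below is an arbitrary random example):

```python
import numpy as np

rng = np.random.default_rng(3)

# Exact statistical leverage scores via a thin QR factorization.
A = rng.standard_normal((200, 5))          # tall, full-rank (almost surely)
Q, _ = np.linalg.qr(A, mode="reduced")     # Q has orthonormal columns
leverage = (Q ** 2).sum(axis=1)            # squared row norms of Q
```

Two sanity checks: each score lies in [0, 1], and the scores sum to the rank of A (here 5), since they are the diagonal of an orthogonal projection matrix.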
arXiv Detail & Related papers (2021-05-23T19:21:55Z) - Double Coverage with Machine-Learned Advice [100.23487145400833]
We study the fundamental online $k$-server problem in a learning-augmented setting.
We show that our algorithm achieves for any k an almost optimal consistency-robustness tradeoff.
arXiv Detail & Related papers (2021-03-02T11:04:33Z) - New Oracle-Efficient Algorithms for Private Synthetic Data Release [52.33506193761153]
We present three new algorithms for constructing differentially private synthetic data.
The algorithms satisfy differential privacy even in the worst case.
Compared to the state-of-the-art High-Dimensional Matrix Mechanism (McKenna et al., 2018), our algorithms provide better accuracy on large workloads.
arXiv Detail & Related papers (2020-07-10T15:46:05Z) - Corralling Stochastic Bandit Algorithms [54.10645564702416]
We show that the regret of the corralling algorithms is no worse than that of the best algorithm containing the arm with the highest reward.
We further show that this regret depends on the gap between the highest reward and the other rewards.
arXiv Detail & Related papers (2020-06-16T15:33:12Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information (including all content) and is not responsible for any consequences of its use.