A Bayesian Bradley-Terry model to compare multiple ML algorithms on
multiple data sets
- URL: http://arxiv.org/abs/2208.04935v2
- Date: Sat, 15 Jul 2023 10:28:33 GMT
- Title: A Bayesian Bradley-Terry model to compare multiple ML algorithms on
multiple data sets
- Authors: Jacques Wainer
- Abstract summary: This paper proposes a Bayesian model to compare multiple algorithms on multiple data sets, on any metric.
The model is based on the Bradley-Terry model, which counts the number of times one algorithm performs better than another on different data sets.
- Score: 4.394728504061753
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This paper proposes a Bayesian model to compare multiple algorithms on
multiple data sets, on any metric. The model is based on the Bradley-Terry
model, which counts the number of times one algorithm performs better than
another on different data sets. Because of its Bayesian foundations, the
Bayesian Bradley-Terry model (BBT) has different characteristics than
frequentist approaches to comparing multiple algorithms on multiple data sets,
such as Demsar (2006) tests on mean rank, and Benavoli et al. (2016) multiple
pairwise Wilcoxon tests with p-adjustment procedures. In particular, a Bayesian
approach allows for more nuanced statements regarding the algorithms beyond
claiming that the difference is or is not statistically significant.
Bayesian approaches also allow one to define when two algorithms are equivalent for
practical purposes, or the region of practical equivalence (ROPE). Different
than a Bayesian signed rank comparison procedure proposed by Benavoli et al.
(2017), our approach can define a ROPE for any metric, since it is based on
probability statements, and not on differences of that metric. This paper also
proposes a local ROPE concept, which evaluates whether a positive difference
between one algorithm's mean measure across cross-validation folds and another
algorithm's mean should really be seen as the first algorithm being better than
the second, based on effect sizes. This local ROPE proposal is independent of
the Bayesian framework, and can also be used in frequentist approaches based on
ranks. An R package and a Python program that implement the BBT are available.
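As a rough illustration of the win-counting idea behind the Bradley-Terry comparison, the sketch below counts the data sets on which one algorithm beats another and places a posterior over the probability of superiority. This is a minimal simplification, not the paper's full BBT model: the accuracy numbers, the Beta-Bernoulli posterior, and the 0.45-0.55 ROPE bounds are all illustrative assumptions.

```python
import random

# Hypothetical accuracies of algorithms A and B on 10 data sets (made-up numbers).
acc_a = [0.81, 0.76, 0.90, 0.66, 0.88, 0.72, 0.95, 0.60, 0.84, 0.79]
acc_b = [0.78, 0.74, 0.91, 0.60, 0.85, 0.70, 0.93, 0.62, 0.80, 0.75]

# Bradley-Terry-style statistic: count the data sets where A beats B and vice versa.
wins_a = sum(a > b for a, b in zip(acc_a, acc_b))
wins_b = sum(b > a for a, b in zip(acc_a, acc_b))

# Beta-Bernoulli posterior over p = P(A beats B on a new data set),
# with a uniform Beta(1, 1) prior (a stand-in for the BBT's Bayesian machinery).
random.seed(0)
samples = [random.betavariate(1 + wins_a, 1 + wins_b) for _ in range(20000)]

# ROPE on the probability scale: treat 0.45 <= p <= 0.55 as "practically equivalent".
p_a_better = sum(s > 0.55 for s in samples) / len(samples)
p_rope = sum(0.45 <= s <= 0.55 for s in samples) / len(samples)
print(f"wins: A={wins_a}, B={wins_b}")
print(f"P(A better): {p_a_better:.2f}, P(practically equivalent): {p_rope:.2f}")
```

Note that the ROPE here lives on the probability scale, not on differences of the underlying metric, which is the property the abstract highlights as letting a ROPE be defined for any metric.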
Related papers
- HARRIS: Hybrid Ranking and Regression Forests for Algorithm Selection [75.84584400866254]
We propose a new algorithm selector leveraging special forests, combining the strengths of both approaches while alleviating their weaknesses.
HARRIS' decisions are based on a forest model whose trees are built by optimizing a hybrid ranking and regression loss function.
arXiv Detail & Related papers (2022-10-31T14:06:11Z)
- Efficient Approximate Kernel Based Spike Sequence Classification [56.2938724367661]
Machine learning models, such as SVM, require a definition of distance/similarity between pairs of sequences.
Exact methods yield better classification performance, but they pose high computational costs.
We propose a series of ways to improve the performance of the approximate kernel in order to enhance its predictive performance.
arXiv Detail & Related papers (2022-09-11T22:44:19Z)
- Efficient computation of rankings from pairwise comparisons [0.0]
We describe an alternative and similarly simple iteration that provably returns identical results but does so much faster.
We demonstrate this algorithm with applications to a range of example data sets and derive a number of results regarding its convergence.
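For context, rankings from pairwise comparisons are typically maximum-likelihood Bradley-Terry fits. Below is a minimal sketch of the classic Zermelo/Ford iteration, the standard baseline that faster iterations of this kind improve upon (this is not the paper's algorithm, and the win matrix is made up for illustration).

```python
# wins[i][j] = number of times item i beat item j (illustrative data).
wins = [
    [0, 4, 3],
    [1, 0, 5],
    [2, 1, 0],
]
n = len(wins)
pi = [1.0] * n  # initial Bradley-Terry strengths

for _ in range(200):
    new_pi = []
    for i in range(n):
        w_i = sum(wins[i])  # total wins of item i
        # Sum over opponents of (games played against j) / (pi_i + pi_j).
        denom = sum((wins[i][j] + wins[j][i]) / (pi[i] + pi[j])
                    for j in range(n) if j != i)
        new_pi.append(w_i / denom)
    s = sum(new_pi)
    pi = [p / s for p in new_pi]  # normalize: strengths are scale-invariant

ranking = sorted(range(n), key=lambda i: -pi[i])
print("strengths:", [round(p, 3) for p in pi])
print("ranking (best first):", ranking)
```

The iteration converges whenever the comparison graph is strongly connected, as it is for this toy win matrix.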
arXiv Detail & Related papers (2022-06-30T19:39:09Z)
- Ranking with Confidence for Large Scale Comparison Data [1.2183405753834562]
In this work, we leverage a generative data model considering comparison noise to develop a fast, precise, and informative ranking from pairwise comparisons.
On real data, PD-Rank requires less computational time than active learning methods to achieve the same Kendall tau.
arXiv Detail & Related papers (2022-02-03T16:36:37Z)
- Bayesian Algorithm Execution: Estimating Computable Properties of Black-box Functions Using Mutual Information [78.78486761923855]
In many real world problems, we want to infer some property of an expensive black-box function f, given a budget of T function evaluations.
We present a procedure, InfoBAX, that sequentially chooses queries that maximize mutual information with respect to the algorithm's output.
On these problems, InfoBAX uses up to 500 times fewer queries to f than required by the original algorithm.
arXiv Detail & Related papers (2021-04-19T17:22:11Z)
- The FMRIB Variational Bayesian Inference Tutorial II: Stochastic Variational Bayes [1.827510863075184]
This tutorial revisits the original FMRIB Variational Bayes tutorial.
This new approach bears a lot of similarity to, and has benefited from, computational methods applied to machine learning algorithms.
arXiv Detail & Related papers (2020-07-03T11:31:52Z)
- Approximating a Target Distribution using Weight Queries [25.392248158616862]
We propose an interactive algorithm that iteratively selects data set examples and performs corresponding weight queries.
We derive an approximation bound on the total variation distance between the reweighting found by the algorithm and the best achievable reweighting.
arXiv Detail & Related papers (2020-06-24T11:17:43Z)
- Towards Model-Agnostic Post-Hoc Adjustment for Balancing Ranking Fairness and Algorithm Utility [54.179859639868646]
Bipartite ranking aims to learn a scoring function that ranks positive individuals higher than negative ones from labeled data.
There have been rising concerns on whether the learned scoring function can cause systematic disparity across different protected groups.
We propose a model post-processing framework for balancing them in the bipartite ranking scenario.
arXiv Detail & Related papers (2020-06-15T10:08:39Z)
- Active Sampling for Pairwise Comparisons via Approximate Message Passing and Information Gain Maximization [5.771869590520189]
We propose ASAP, an active sampling algorithm based on approximate message passing and expected information gain.
We show that ASAP offers the highest accuracy of inferred scores compared to the existing methods.
arXiv Detail & Related papers (2020-04-12T20:48:10Z)
- LSF-Join: Locality Sensitive Filtering for Distributed All-Pairs Set Similarity Under Skew [58.21885402826496]
All-pairs set similarity is a widely used data mining task, even for large and high-dimensional datasets.
We present a new distributed algorithm, LSF-Join, for approximate all-pairs set similarity.
We show that LSF-Join efficiently finds most close pairs, even for small similarity thresholds and for skewed input sets.
arXiv Detail & Related papers (2020-03-06T00:06:20Z)
- Ranking a set of objects: a graph based least-square approach [70.7866286425868]
We consider the problem of ranking $N$ objects starting from a set of noisy pairwise comparisons provided by a crowd of equal workers.
We propose a class of non-adaptive ranking algorithms that rely on a least-squares intrinsic optimization criterion for the estimation of qualities.
arXiv Detail & Related papers (2020-02-26T16:19:09Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.