Assumption-free stability for ranking problems
- URL: http://arxiv.org/abs/2506.02257v1
- Date: Mon, 02 Jun 2025 21:02:13 GMT
- Title: Assumption-free stability for ranking problems
- Authors: Ruiting Liang, Jake A. Soloff, Rina Foygel Barber, Rebecca Willett
- Abstract summary: We consider ranking problems among a finite set of candidates, for instance, selecting the top-$k$ items among a larger list of candidates or obtaining the full ranking of all items in the set. These problems are often unstable, in the sense that estimating a ranking from noisy data can exhibit high sensitivity to small perturbations. We propose two novel ranking operators for achieving stable ranking: the \emph{inflated top-$k$} for the top-$k$ selection problem and the \emph{inflated full ranking} for ranking the full list.
- Score: 10.487111643210365
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this work, we consider ranking problems among a finite set of candidates: for instance, selecting the top-$k$ items among a larger list of candidates or obtaining the full ranking of all items in the set. These problems are often unstable, in the sense that estimating a ranking from noisy data can exhibit high sensitivity to small perturbations. Concretely, if we use data to provide a score for each item (say, by aggregating preference data over a sample of users), then for two items with similar scores, small fluctuations in the data can alter the relative ranking of those items. Many existing theoretical results for ranking problems assume a separation condition to avoid this challenge, but real-world data often contains items whose scores are approximately tied, limiting the applicability of existing theory. To address this gap, we develop a new algorithmic stability framework for ranking problems, and propose two novel ranking operators for achieving stable ranking: the \emph{inflated top-$k$} for the top-$k$ selection problem and the \emph{inflated full ranking} for ranking the full list. To enable stability, each method allows for expressing some uncertainty in the output. For both problems, our proposed methods provide guaranteed stability, with no assumptions on data distributions and no dependence on the total number of candidates to be ranked. Experiments on real-world data confirm that the proposed methods offer stability without compromising the informativeness of the output.
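The abstract does not give the construction, but the intuition behind an inflated top-$k$ operator can be sketched: rather than returning exactly $k$ items, return every item whose score falls within some margin of the $k$-th largest score, so near-ties are surfaced as uncertainty instead of being broken arbitrarily. The margin parameter `eps` below is a hypothetical stand-in, not the paper's actual rule.

```python
import numpy as np

def inflated_top_k(scores, k, eps):
    """Return the indices of every item whose score is within eps of
    the k-th largest score. A minimal sketch of the idea only; the
    paper's actual inflated top-k operator may be constructed differently.
    """
    scores = np.asarray(scores, dtype=float)
    if not 1 <= k <= len(scores):
        raise ValueError("k must be between 1 and the number of items")
    kth_largest = np.sort(scores)[-k]
    return np.flatnonzero(scores >= kth_largest - eps)

# Items 1 and 2 are nearly tied at the k-th position, so both are
# returned rather than arbitrarily picking one of them.
scores = [0.90, 0.70, 0.69, 0.10]
print(inflated_top_k(scores, k=2, eps=0.05))  # -> [0 1 2]
```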
Related papers
- A comparative analysis of rank aggregation methods for the partial label ranking problem [10.994154016400147]
The label ranking problem is a supervised learning scenario in which the learner predicts a total order of the class labels for a given input instance. This paper explores several alternative aggregation methods for this critical step, including scoring-based and non-parametric probabilistic rank aggregation approaches.
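As a concrete instance of the scoring-based aggregation the summary mentions, here is a minimal Borda-count sketch; Borda is one standard scoring rule among the alternatives such methods compare, and this is an illustration rather than the paper's specific method.

```python
from collections import defaultdict

def borda_aggregate(rankings):
    """Aggregate several total orders of the same label set by Borda
    count: a label at position i in a ranking of n labels earns n-1-i
    points. A generic illustration, not the paper's specific method.
    """
    points = defaultdict(int)
    for ranking in rankings:
        n = len(ranking)
        for pos, label in enumerate(ranking):
            points[label] += n - 1 - pos
    # Highest total first; break ties alphabetically for determinism.
    return sorted(points, key=lambda lbl: (-points[lbl], lbl))

votes = [["a", "b", "c"], ["a", "c", "b"], ["b", "a", "c"]]
print(borda_aggregate(votes))  # -> ['a', 'b', 'c']
```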
arXiv Detail & Related papers (2025-02-24T11:44:43Z) - Rank Aggregation in Crowdsourcing for Listwise Annotations [37.0231203825557]
We propose a listwise rank aggregation method for crowdsourcing in which global position information is carefully measured and incorporated.
In our design, a purpose-built annotation quality indicator is employed to measure the discrepancy between the annotated rank and the true rank.
We also take the difficulty of the ranking problem itself into consideration, as it directly impacts the performance of annotators.
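The summary does not spell out the quality indicator itself; one common measure of the discrepancy between an annotated rank and a true rank is the Kendall tau distance (the number of discordant pairs), sketched below as a plausible stand-in rather than the paper's exact indicator.

```python
from itertools import combinations

def kendall_tau_distance(rank_a, rank_b):
    """Count the item pairs that the two rankings order differently.
    A plausible discrepancy measure; the paper's indicator may differ.
    """
    pos_a = {item: i for i, item in enumerate(rank_a)}
    pos_b = {item: i for i, item in enumerate(rank_b)}
    discordant = 0
    for x, y in combinations(rank_a, 2):
        if (pos_a[x] - pos_a[y]) * (pos_b[x] - pos_b[y]) < 0:
            discordant += 1
    return discordant

# The two rankings disagree only on the pair (a, b).
print(kendall_tau_distance(["a", "b", "c"], ["b", "a", "c"]))  # -> 1
```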
arXiv Detail & Related papers (2024-10-10T02:13:21Z) - Stability and Multigroup Fairness in Ranking with Uncertain Predictions [61.76378420347408]
Our work considers ranking functions: maps from individual predictions for a classification task to distributions over rankings.
We focus on two aspects of ranking functions: stability to perturbations in predictions and fairness towards both individuals and subgroups.
Our work demonstrates that uncertainty-aware rankings naturally interpolate between group-level and individual-level fairness guarantees.
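One simple way to realize a ranking function that outputs a distribution over rankings, as described above, is to perturb each prediction with Gumbel noise and sort; this construction is a generic illustration and is not necessarily the one studied in the paper.

```python
import numpy as np

def sample_ranking(predictions, temperature=1.0, rng=None):
    """Sample a ranking by adding Gumbel noise to each prediction and
    sorting ("Gumbel trick"). Larger temperature spreads probability
    mass over more rankings; temperature -> 0 recovers the
    deterministic sort. A generic construction for illustration only.
    """
    rng = np.random.default_rng() if rng is None else rng
    predictions = np.asarray(predictions, dtype=float)
    noisy = predictions + temperature * rng.gumbel(size=predictions.shape)
    return np.argsort(-noisy)  # item indices from highest to lowest

preds = [0.9, 0.5, 0.51]  # items 1 and 2 are nearly tied
rng = np.random.default_rng(0)
print([sample_ranking(preds, 0.1, rng).tolist() for _ in range(3)])
```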
arXiv Detail & Related papers (2024-02-14T17:17:05Z) - An information theoretic approach to quantify the stability of feature selection and ranking algorithms [0.0]
We propose an information theoretic approach based on the Jensen-Shannon divergence to quantify this robustness.
Unlike other stability measures, this metric is suitable for different algorithm outcomes: full ranked lists, feature subsets, as well as the less-studied partial ranked lists.
We illustrate the use of this stability metric with data generated in a fully controlled way and compare it with popular metrics, including Spearman's rank correlation and Kuncheva's index, on feature ranking and selection outcomes, respectively.
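To make the idea concrete, the sketch below compares two rankings by mapping each to a distribution over items and computing the Jensen-Shannon divergence between them; the 1/position weighting used for the mapping is an assumption for illustration, as the paper defines its own encoding.

```python
import numpy as np

def js_divergence(p, q, eps=1e-12):
    """Jensen-Shannon divergence between two discrete distributions."""
    p, q = np.asarray(p, float), np.asarray(q, float)
    m = 0.5 * (p + q)
    kl = lambda a, b: float(np.sum(a * np.log((a + eps) / (b + eps))))
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

def ranking_to_distribution(ranking):
    """Map a ranking of items 0..n-1 to a distribution that puts more
    mass on higher-ranked items (weight 1/position). This encoding is
    an assumption for illustration; the paper defines its own.
    """
    weights = np.zeros(len(ranking))
    for pos, item in enumerate(ranking):
        weights[item] = 1.0 / (pos + 1)
    return weights / weights.sum()

r1, r2 = [0, 1, 2, 3], [1, 0, 2, 3]  # two similar rankings of items 0..3
p, q = ranking_to_distribution(r1), ranking_to_distribution(r2)
print(round(js_divergence(p, q), 4))  # small value: rankings are close
```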
arXiv Detail & Related papers (2024-02-07T22:17:37Z) - Replace Scoring with Arrangement: A Contextual Set-to-Arrangement Framework for Learning-to-Rank [40.81502990315285]
Learning-to-rank is a core technique in the top-N recommendation task, where an ideal ranker would be a mapping from an item set to an arrangement.
Most existing solutions fall in the paradigm of probabilistic ranking principle (PRP), i.e., first score each item in the candidate set and then perform a sort operation to generate the top ranking list.
We propose Set-To-Arrangement Ranking (STARank), a new framework that directly generates permutations of the candidate items without the need for individual scoring and sort operations.
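For contrast, a minimal sketch of the PRP baseline described above (score each candidate independently, then sort) might look as follows; the relevance scores here are a hypothetical stand-in for a learned model, and STARank itself avoids this two-step pipeline.

```python
def prp_rank(candidates, score_fn, top_n):
    """Probabilistic ranking principle baseline: score each candidate
    independently, then sort to produce the top-n list. STARank instead
    generates the arrangement directly, skipping this two-step pipeline.
    """
    scored = [(score_fn(c), c) for c in candidates]
    scored.sort(reverse=True)  # highest score first
    return [c for _, c in scored[:top_n]]

# Hypothetical relevance scores standing in for a learned scoring model.
relevance = {"item_a": 0.2, "item_b": 0.9, "item_c": 0.5}
print(prp_rank(relevance, relevance.get, top_n=2))  # -> ['item_b', 'item_c']
```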
arXiv Detail & Related papers (2023-08-05T12:22:26Z) - Bipartite Ranking Fairness through a Model Agnostic Ordering Adjustment [54.179859639868646]
We propose xOrder, a model-agnostic post-processing framework for achieving fairness in bipartite ranking.
xOrder is compatible with various classification models and ranking fairness metrics, including supervised and unsupervised fairness metrics.
We evaluate our proposed algorithm on four benchmark data sets and two real-world patient electronic health record repositories.
arXiv Detail & Related papers (2023-07-27T07:42:44Z) - Subset Selection Based On Multiple Rankings in the Presence of Bias: Effectiveness of Fairness Constraints for Multiwinner Voting Score Functions [42.23968578772441]
We consider the problem of subset selection where one is given multiple rankings of items and the goal is to select the highest-quality subset.
We show that for fairness constraints to be effective, different multiwinner score functions may require a drastically different number of rankings.
arXiv Detail & Related papers (2023-06-16T13:25:27Z) - Integrating Rankings into Quantized Scores in Peer Review [61.27794774537103]
In peer review, reviewers are usually asked to provide scores for the papers.
To mitigate this issue, conferences have started to ask reviewers to additionally provide a ranking of the papers they have reviewed.
There is no standard procedure for using this ranking information, and Area Chairs may use it in different ways.
We take a principled approach to integrate the ranking information into the scores.
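The paper's own method is not described in the summary; one generic way to reconcile a reviewer's scores with their stated ranking is to project the scores onto the order constraints via isotonic (pool-adjacent-violators) regression, sketched below purely as an illustrative heuristic.

```python
def isotonic_decreasing(values):
    """Pool-adjacent-violators: least-squares fit of a non-increasing
    sequence to `values`.
    """
    blocks = []  # each block is [mean, count]
    for v in values:
        blocks.append([float(v), 1])
        # Merge adjacent blocks while the sequence increases.
        while len(blocks) > 1 and blocks[-2][0] < blocks[-1][0]:
            m2, c2 = blocks.pop()
            m1, c1 = blocks.pop()
            blocks.append([(m1 * c1 + m2 * c2) / (c1 + c2), c1 + c2])
    return [m for m, c in blocks for _ in range(c)]

def reconcile(scores, ranking):
    """Minimally adjust scores (in least squares) so they respect the
    reviewer's ranking, best paper first. A generic heuristic for
    illustration; the paper develops its own principled integration.
    """
    fitted = isotonic_decreasing([scores[p] for p in ranking])
    return dict(zip(ranking, fitted))

raw = {"paper_A": 6.0, "paper_B": 7.0, "paper_C": 4.0}
# The reviewer ranks A above B, but gave B the higher score.
print(reconcile(raw, ["paper_A", "paper_B", "paper_C"]))
# -> {'paper_A': 6.5, 'paper_B': 6.5, 'paper_C': 4.0}
```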
arXiv Detail & Related papers (2022-04-05T19:39:13Z) - Adaptive Sampling for Heterogeneous Rank Aggregation from Noisy Pairwise Comparisons [85.5955376526419]
In rank aggregation problems, users exhibit various accuracy levels when comparing pairs of items.
We propose an elimination-based active sampling strategy, which estimates the ranking of items via noisy pairwise comparisons.
We prove that our algorithm can return the true ranking of items with high probability.
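The elimination-based strategy itself is not reproducible from the summary, but the object it estimates can be illustrated with a naive win-count aggregator over noisy pairwise comparisons; the paper's algorithm additionally chooses which pairs to sample adaptively.

```python
import numpy as np

def rank_by_wins(n_items, comparisons):
    """Rank items by empirical win counts from noisy pairwise
    comparisons given as (winner, loser) pairs. A naive baseline for
    illustration; the paper's elimination-based strategy also decides
    adaptively which pairs to query.
    """
    wins = np.zeros(n_items)
    for winner, _loser in comparisons:
        wins[winner] += 1
    return np.argsort(-wins)  # best item first

# Noisy comparisons among 3 items where item 0 is truly best.
data = [(0, 1), (0, 2), (0, 1), (1, 2), (1, 2), (2, 1)]
print(rank_by_wins(3, data))  # -> [0 1 2]
```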
arXiv Detail & Related papers (2021-10-08T13:51:55Z) - Towards Model-Agnostic Post-Hoc Adjustment for Balancing Ranking Fairness and Algorithm Utility [54.179859639868646]
Bipartite ranking aims to learn a scoring function that ranks positive individuals higher than negative ones from labeled data.
There are rising concerns about whether the learned scoring function can cause systematic disparity across different protected groups.
We propose a model post-processing framework for balancing ranking fairness and utility in the bipartite ranking scenario.
arXiv Detail & Related papers (2020-06-15T10:08:39Z) - SetRank: A Setwise Bayesian Approach for Collaborative Ranking from
Implicit Feedback [50.13745601531148]
We propose SetRank, a novel setwise Bayesian approach for collaborative ranking that accommodates the characteristics of implicit feedback in recommender systems.
Specifically, SetRank aims at maximizing the posterior probability of novel setwise preference comparisons.
We also present a theoretical analysis of SetRank showing that the bound on the excess risk can be proportional to $\sqrt{M/N}$.
arXiv Detail & Related papers (2020-02-23T06:40:48Z)
This list is automatically generated from the titles and abstracts of the papers on this site.