Properties of Group Fairness Metrics for Rankings
- URL: http://arxiv.org/abs/2212.14351v1
- Date: Thu, 29 Dec 2022 15:50:18 GMT
- Title: Properties of Group Fairness Metrics for Rankings
- Authors: Tobias Schumacher, Marlene Lutz, Sandipan Sikdar, Markus Strohmaier
- Abstract summary: We perform a comparative analysis of existing group fairness metrics developed in the context of fair ranking.
We take an axiomatic approach whereby we design a set of thirteen properties for group fairness metrics.
We demonstrate that most of these metrics satisfy only a small subset of the proposed properties.
- Score: 4.479834103607384
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In recent years, several metrics have been developed for evaluating group
fairness of rankings. Given that these metrics were developed with different
application contexts and ranking algorithms in mind, it is not straightforward
which metric to choose for a given scenario. In this paper, we perform a
comprehensive comparative analysis of existing group fairness metrics developed
in the context of fair ranking. Given their diverse application
contexts, we argue that such a comparative analysis is not straightforward.
Hence, we take an axiomatic approach whereby we design a set of thirteen
properties for group fairness metrics that consider different ranking settings.
A metric can then be selected depending on whether it satisfies all or a subset
of these properties. We apply these properties to eleven existing group
fairness metrics, and through both empirical and theoretical results we
demonstrate that most of these metrics satisfy only a small subset of the
proposed properties. These findings highlight limitations of existing metrics,
and provide insights into how to evaluate and interpret different fairness
metrics in practical deployment. The proposed properties can also assist
practitioners in selecting appropriate metrics for evaluating fairness in a
specific application.
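To make the ranking setting concrete, below is a minimal sketch of one common style of group fairness metric, an exposure-based disparity score, together with a check of one plausible property (invariance under swapping the two group labels). The abstract does not enumerate the paper's thirteen properties, so both the metric and the property here are illustrative assumptions, not the paper's definitions.

```python
# Minimal sketch of an exposure-based group fairness metric for rankings.
# Assumes a logarithmic position discount (as in DCG); the paper's actual
# metrics and its thirteen properties may differ from this illustration.
import math

def exposure_disparity(ranking, groups):
    """Absolute difference in mean exposure between two groups.

    ranking: list of item ids, best rank first.
    groups:  dict mapping item id -> group label (exactly two labels).
    """
    # Position bias: items ranked higher receive more exposure.
    exposure = {item: 1.0 / math.log2(rank + 1)
                for rank, item in enumerate(ranking, start=1)}
    labels = sorted(set(groups.values()))
    assert len(labels) == 2, "this sketch covers the binary-group case only"
    means = [sum(exposure[i] for i in ranking if groups[i] == g)
             / sum(1 for i in ranking if groups[i] == g)
             for g in labels]
    return abs(means[0] - means[1])

# Illustrative property check: the score should not change when the two
# group labels are swapped (a plausible, hypothetical axiom, not
# necessarily one of the paper's thirteen properties).
ranking = ["a", "b", "c", "d"]
groups = {"a": "g1", "b": "g2", "c": "g1", "d": "g2"}
swapped = {i: ("g2" if g == "g1" else "g1") for i, g in groups.items()}
assert math.isclose(exposure_disparity(ranking, groups),
                    exposure_disparity(ranking, swapped))
```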
Related papers
- FSDEM: Feature Selection Dynamic Evaluation Metric [1.54369283425087]
The proposed metric is a dynamic metric with two properties that can be used to evaluate both the performance and the stability of a feature selection algorithm.
We conduct several empirical experiments to illustrate the use of the proposed metric in the successful evaluation of feature selection algorithms.
arXiv Detail & Related papers (2024-08-26T12:49:41Z)
- On the Intrinsic and Extrinsic Fairness Evaluation Metrics for Contextualized Language Representations [74.70957445600936]
Multiple metrics have been introduced to measure fairness in various natural language processing tasks.
These metrics fall roughly into two categories: 1) extrinsic metrics for evaluating fairness in downstream applications and 2) intrinsic metrics for estimating fairness in upstream language representation models.
arXiv Detail & Related papers (2022-03-25T22:17:43Z)
- Measuring Fairness of Text Classifiers via Prediction Sensitivity [63.56554964580627]
ACCUMULATED PREDICTION SENSITIVITY measures fairness in machine learning models based on the model's prediction sensitivity to perturbations in input features.
We show that the metric can be theoretically linked with a specific notion of group fairness (statistical parity) and individual fairness.
arXiv Detail & Related papers (2022-03-16T15:00:33Z)
- Measuring Disparate Outcomes of Content Recommendation Algorithms with Distributional Inequality Metrics [5.74271110290378]
We evaluate a set of distributional inequality metrics originating from economics and their ability to measure disparities in content exposure in the Twitter algorithmic timeline (a minimal Gini sketch follows this list).
We show that these metrics can identify content suggestion algorithms that contribute more strongly to skewed outcomes between users.
arXiv Detail & Related papers (2022-02-03T14:41:39Z)
- QAFactEval: Improved QA-Based Factual Consistency Evaluation for Summarization [116.56171113972944]
We show that carefully choosing the components of a QA-based metric is critical to performance.
Our solution improves upon the best-performing entailment-based metric and achieves state-of-the-art performance.
arXiv Detail & Related papers (2021-12-16T00:38:35Z)
- Estimation of Fair Ranking Metrics with Incomplete Judgments [70.37717864975387]
We propose a sampling strategy and estimation technique for four fair ranking metrics.
We formulate a robust and unbiased estimator which can operate even with a very limited number of labeled items.
arXiv Detail & Related papers (2021-08-11T10:57:00Z)
- Fair Performance Metric Elicitation [29.785862520452955]
We consider the choice of fairness metrics through the lens of metric elicitation.
We propose a novel strategy to elicit group-fair performance metrics for multiclass classification problems.
arXiv Detail & Related papers (2020-06-23T04:03:24Z)
- Towards Model-Agnostic Post-Hoc Adjustment for Balancing Ranking Fairness and Algorithm Utility [54.179859639868646]
Bipartite ranking aims to learn a scoring function that ranks positive individuals higher than negative ones from labeled data.
There have been rising concerns about whether the learned scoring function can cause systematic disparity across different protected groups.
We propose a model post-processing framework for balancing them in the bipartite ranking scenario.
arXiv Detail & Related papers (2020-06-15T10:08:39Z)
- Overview of the TREC 2019 Fair Ranking Track [65.15263872493799]
The goal of the TREC Fair Ranking track was to develop a benchmark for evaluating retrieval systems in terms of fairness to different content providers.
This paper presents an overview of the track, including the task definition, descriptions of the data and the annotation process.
arXiv Detail & Related papers (2020-03-25T21:34:58Z)
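As a concrete companion to the distributional inequality entry above, here is a minimal sketch of the Gini coefficient applied to a per-user content-exposure distribution. The Gini coefficient is one standard inequality metric from economics; the exact metric set and data used in that paper may differ, and the inputs below are hypothetical.

```python
# Minimal sketch: Gini coefficient of a content-exposure distribution,
# one standard distributional inequality metric from economics.
# 0.0 means perfectly equal exposure across users; values near 1.0 mean
# exposure concentrated on very few users. All inputs are hypothetical.
def gini(exposures):
    xs = sorted(exposures)
    n, total = len(xs), sum(xs)
    if n == 0 or total == 0:
        return 0.0
    # G = 2 * sum(i * x_i) / (n * total) - (n + 1) / n, with x sorted
    weighted = sum(i * x for i, x in enumerate(xs, start=1))
    return 2.0 * weighted / (n * total) - (n + 1.0) / n

print(gini([1, 1, 1, 1]))   # 0.0  -> equal exposure
print(gini([0, 0, 0, 10]))  # 0.75 -> highly skewed exposure
```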
This list is automatically generated from the titles and abstracts of the papers on this site.