On (assessing) the fairness of risk score models
- URL: http://arxiv.org/abs/2302.08851v1
- Date: Fri, 17 Feb 2023 12:45:51 GMT
- Title: On (assessing) the fairness of risk score models
- Authors: Eike Petersen, Melanie Ganz, Sune Hannibal Holm, Aasa Feragen
- Abstract summary: Risk models are of interest for a number of reasons, including the fact that they communicate uncertainty about the potential outcomes to users.
We identify the provision of similar value to different groups as a key desideratum for risk score fairness.
We introduce a novel calibration error metric that is less sample size-biased than previously proposed metrics.
- Score: 2.0646127669654826
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Recent work on algorithmic fairness has largely focused on the fairness of
discrete decisions, or classifications. While such decisions are often based on
risk score models, the fairness of the risk models themselves has received
considerably less attention. Risk models are of interest for a number of
reasons, including the fact that they communicate uncertainty about the
potential outcomes to users, thus representing a way to enable meaningful human
oversight. Here, we address fairness desiderata for risk score models. We
identify the provision of similar epistemic value to different groups as a key
desideratum for risk score fairness. Further, we address how to assess the
fairness of risk score models quantitatively, including a discussion of metric
choices and meaningful statistical comparisons between groups. In this context,
we also introduce a novel calibration error metric that is less sample
size-biased than previously proposed metrics, enabling meaningful comparisons
between groups of different sizes. We illustrate our methodology - which is
widely applicable in many other settings - in two case studies, one in
recidivism risk prediction, and one in risk of major depressive disorder (MDD)
prediction.
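To make the idea of comparing calibration between groups concrete, here is a minimal Python sketch. It uses the standard equal-width binned expected calibration error (ECE), which is not the novel metric introduced in the paper. The function names `binned_ece` and `ece_by_group`, the bin count, and the simulated data are all assumptions made for illustration only.

```python
import numpy as np


def binned_ece(y_true, y_prob, n_bins=10):
    """Standard equal-width binned ECE (an illustrative stand-in,
    NOT the calibration error metric proposed in the paper)."""
    y_true = np.asarray(y_true, dtype=float)
    y_prob = np.asarray(y_prob, dtype=float)
    # Assign each score to one of n_bins equal-width bins on [0, 1];
    # a score of exactly 1.0 goes into the last bin.
    bin_ids = np.minimum((y_prob * n_bins).astype(int), n_bins - 1)
    ece = 0.0
    for b in range(n_bins):
        mask = bin_ids == b
        if mask.any():
            # Gap between observed event rate and mean predicted risk,
            # weighted by the fraction of samples in the bin.
            gap = abs(y_true[mask].mean() - y_prob[mask].mean())
            ece += mask.mean() * gap
    return ece


def ece_by_group(y_true, y_prob, groups, n_bins=10):
    """Compute the binned ECE separately for each group label."""
    y_true, y_prob, groups = map(np.asarray, (y_true, y_prob, groups))
    return {
        g: binned_ece(y_true[groups == g], y_prob[groups == g], n_bins)
        for g in np.unique(groups)
    }


if __name__ == "__main__":
    # Hypothetical data: both groups are perfectly calibrated by
    # construction, but the groups differ strongly in size.
    rng = np.random.default_rng(0)
    groups = np.array(["large"] * 20000 + ["small"] * 200)
    y_prob = rng.uniform(size=groups.size)
    y_true = rng.uniform(size=groups.size) < y_prob
    print(ece_by_group(y_true, y_prob, groups))
```

Because the simulated scores are calibrated by construction in both groups, any difference between the two printed values comes from estimation noise alone. Binned estimators of this kind tend to report larger errors for smaller samples, which is the sort of sample-size bias the abstract says complicates comparisons between groups of different sizes.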
Related papers
- Data-driven decision-making under uncertainty with entropic risk measure [5.407319151576265]
The entropic risk measure is widely used in high-stakes decision making to account for tail risks associated with an uncertain loss.
To debias the empirical entropic risk estimator, we propose a strongly consistent bootstrapping procedure.
We show that cross validation methods can result in significantly higher out-of-sample risk for the insurer if the bias in validation performance is not corrected for.
arXiv Detail & Related papers (2024-09-30T04:02:52Z)
- Data-Adaptive Tradeoffs among Multiple Risks in Distribution-Free Prediction [55.77015419028725]
We develop methods that permit valid control of risk when threshold and tradeoff parameters are chosen adaptively.
Our methodology supports monotone and nearly-monotone risks, but otherwise makes no distributional assumptions.
arXiv Detail & Related papers (2024-03-28T17:28:06Z)
- On the Societal Impact of Open Foundation Models [93.67389739906561]
We focus on open foundation models, defined here as those with broadly available model weights.
We identify five distinctive properties of open foundation models that lead to both their benefits and risks.
arXiv Detail & Related papers (2024-02-27T16:49:53Z)
- Risk Aware Benchmarking of Large Language Models [36.95053112313244]
We propose a distributional framework for benchmarking socio-technical risks of foundation models with quantified statistical significance.
We show that the second order statistics in this test are linked to mean-risk models commonly used in econometrics and mathematical finance.
We use our framework to compare various large language models regarding risks related to drifting from instructions and outputting toxic content.
arXiv Detail & Related papers (2023-10-11T02:08:37Z)
- In Search of Insights, Not Magic Bullets: Towards Demystification of the Model Selection Dilemma in Heterogeneous Treatment Effect Estimation [92.51773744318119]
This paper empirically investigates the strengths and weaknesses of different model selection criteria.
We highlight that there is a complex interplay between selection strategies, candidate estimators and the data used for comparing them.
arXiv Detail & Related papers (2023-02-06T16:55:37Z)
- Mitigating multiple descents: A model-agnostic framework for risk monotonization [84.6382406922369]
We develop a general framework for risk monotonization based on cross-validation.
We propose two data-driven methodologies, namely zero- and one-step, that are akin to bagging and boosting.
arXiv Detail & Related papers (2022-05-25T17:41:40Z)
- Two steps to risk sensitivity [4.974890682815778]
Conditional value-at-risk (CVaR) is a risk measure for modeling human and animal planning.
We adopt a conventional distributional approach to CVaR in a sequential setting and reanalyze the choices of human decision-makers.
We then consider a further critical property of risk sensitivity, namely time consistency, showing alternatives to this form of CVaR.
arXiv Detail & Related papers (2021-11-12T16:27:47Z)
- Characterizing Fairness Over the Set of Good Models Under Selective Labels [69.64662540443162]
We develop a framework for characterizing predictive fairness properties over the set of models that deliver similar overall performance.
We provide tractable algorithms to compute the range of attainable group-level predictive disparities.
We extend our framework to address the empirically relevant challenge of selectively labelled data.
arXiv Detail & Related papers (2021-01-02T02:11:37Z)
- Feedback Effects in Repeat-Use Criminal Risk Assessments [0.0]
We show that risk can propagate over sequential decisions in ways that are not captured by one-shot tests.
Risk assessment tools operate in a highly complex and path-dependent process, fraught with historical inequity.
arXiv Detail & Related papers (2020-11-28T06:40:05Z)
- Decision-Making with Auto-Encoding Variational Bayes [71.44735417472043]
We show that a posterior approximation distinct from the variational distribution should be used for making decisions.
Motivated by these theoretical results, we propose learning several approximate proposals for the best model.
In addition to toy examples, we present a full-fledged case study of single-cell RNA sequencing.
arXiv Detail & Related papers (2020-02-17T19:23:36Z)