Related papers: A Brief Tutorial on Sample Size Calculations for Fairness Audits

URL: http://arxiv.org/abs/2312.04745v1
Date: Thu, 7 Dec 2023 22:59:12 GMT
Title: A Brief Tutorial on Sample Size Calculations for Fairness Audits
Authors: Harvineet Singh, Fan Xia, Mi-Ok Kim, Romain Pirracchio, Rumi Chunara, Jean Feng
Abstract summary: This tutorial provides guidance on how to determine the required subgroup sample sizes for a fairness audit. Our findings are applicable to audits of binary classification models and multiple fairness metrics derived as summaries of the confusion matrix.
Score: 6.66743248310448
License: http://creativecommons.org/licenses/by/4.0/
Abstract: In fairness audits, a standard objective is to detect whether a given algorithm performs substantially differently between subgroups. Properly powering the statistical analysis of such audits is crucial for obtaining informative fairness assessments, as it ensures a high probability of detecting unfairness when it exists. However, limited guidance is available on the amount of data necessary for a fairness audit, and directly applicable results for commonly used fairness metrics are lacking. The consideration of unequal subgroup sample sizes is also missing. In this tutorial, we address these issues by providing guidance on how to determine the required subgroup sample sizes to maximize the statistical power of hypothesis tests for detecting unfairness. Our findings are applicable to audits of binary classification models and multiple fairness metrics derived as summaries of the confusion matrix. Furthermore, we discuss other aspects of audit study designs that can increase the reliability of audit results.
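
To give a rough sense of the kind of calculation the abstract refers to, here is a minimal sketch of the textbook sample size formula for a two-sided two-proportion z-test with unequal group sizes. It is not taken from the paper and may differ from the tutorial's own derivations. The function name `subgroup_sample_sizes`, its parameters, and the example rates are assumptions chosen for illustration. For a metric such as the true positive rate, the returned counts would refer to positive-label cases within each subgroup, not to all audited cases.

```python
from math import ceil
from statistics import NormalDist


def subgroup_sample_sizes(p1, p2, alpha=0.05, power=0.8, ratio=1.0):
    """Return (n1, n2) for a two-sided two-proportion z-test.

    Hypothetical helper, not the paper's method.
    p1, p2: assumed metric values (e.g. true positive rates) per subgroup.
    ratio:  n2 / n1, the assumed imbalance between subgroup sample sizes.
    """
    if p1 == p2:
        raise ValueError("p1 and p2 must differ to define a detectable gap")
    if ratio <= 0:
        raise ValueError("ratio must be positive")

    # Standard normal quantiles for the significance level and target power.
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)

    # Unpooled variance of the difference in proportions, scaled by n1.
    variance = p1 * (1 - p1) + p2 * (1 - p2) / ratio

    n1 = (z_alpha + z_beta) ** 2 * variance / (p1 - p2) ** 2
    return ceil(n1), ceil(ratio * n1)


if __name__ == "__main__":
    # Assumed example: detect a 0.80 vs 0.70 gap with balanced subgroups,
    # then with the second subgroup half the size of the first.
    print(subgroup_sample_sizes(0.80, 0.70))
    print(subgroup_sample_sizes(0.80, 0.70, ratio=0.5))
```

Under these assumptions, a smaller gap between subgroups or a more imbalanced allocation increases the required size of the larger subgroup, which is why unequal subgroup sizes matter when planning an audit.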

Related papers

Whence Is A Model Fair? Fixing Fairness Bugs via Propensity Score Matching [0.49157446832511503]
We investigate whether the way training and testing data are sampled affects the reliability of fairness metrics. Since training and test sets are often randomly sampled from the same population, bias present in the training data may still exist in the test data. We propose FairMatch, a post-processing method that applies propensity score matching to evaluate and mitigate bias.
arXiv Detail & Related papers (2025-04-23T19:28:30Z)
Auditing for Bias in Ad Delivery Using Inferred Demographic Attributes [50.37313459134418]
We study the effects of inference error on auditing for bias in one prominent application: black-box audit of ad delivery using paired ads. We propose a way to mitigate the inference error when evaluating skew in ad delivery algorithms.
arXiv Detail & Related papers (2024-10-30T18:57:03Z)
Finite-Sample and Distribution-Free Fair Classification: Optimal Trade-off Between Excess Risk and Fairness, and the Cost of Group-Blindness [14.421493372559762]
We quantify the impact of enforcing algorithmic fairness and group-blindness in binary classification under group fairness constraints. We propose a unified framework for fair classification that provides distribution-free and finite-sample fairness guarantees with controlled excess risk.
arXiv Detail & Related papers (2024-10-21T20:04:17Z)
Sampling Audit Evidence Using a Naive Bayes Classifier [0.0]
This study advances sampling techniques by integrating machine learning with sampling. Machine learning integration helps avoid sampling bias, keep randomness and variability, and target riskier samples.
arXiv Detail & Related papers (2024-03-21T01:35:03Z)
A structured regression approach for evaluating model performance across intersectional subgroups [53.91682617836498]
Disaggregated evaluation is a central task in AI fairness assessment, where the goal is to measure an AI system's performance across different subgroups. We introduce a structured regression approach to disaggregated evaluation that we demonstrate can yield reliable system performance estimates even for very small subgroups.
arXiv Detail & Related papers (2024-01-26T14:21:45Z)
A Universal Unbiased Method for Classification from Aggregate Observations [115.20235020903992]
This paper presents a novel universal method for classification from aggregate observations (CFAO), which provides an unbiased estimator of the classification risk for arbitrary losses. The proposed method not only guarantees risk consistency through the unbiased risk estimator but is also compatible with arbitrary losses.
arXiv Detail & Related papers (2023-06-20T07:22:01Z)
Correcting Underrepresentation and Intersectional Bias for Classification [49.1574468325115]
We consider the problem of learning from data corrupted by underrepresentation bias. We show that with a small amount of unbiased data, we can efficiently estimate the group-wise drop-out rates. We show that our algorithm permits efficient learning for model classes of finite VC dimension.
arXiv Detail & Related papers (2023-06-19T18:25:44Z)
Statistical Inference for Fairness Auditing [4.318555434063274]
We frame this task as "fairness auditing," in terms of multiple hypothesis testing. We show how the bootstrap can be used to simultaneously bound performance disparities over a collection of groups. Our methods can be used to flag subpopulations affected by model underperformance, and certify subpopulations for which the model performs adequately.
arXiv Detail & Related papers (2023-05-05T17:54:22Z)
Error Parity Fairness: Testing for Group Fairness in Regression Tasks [5.076419064097733]
This work presents error parity as a regression fairness notion and introduces a testing methodology to assess group fairness. It is followed by a suitable permutation test to compare groups on several statistics to explore disparities and identify impacted groups. Overall, the proposed regression fairness testing methodology fills a gap in the fair machine learning literature and may serve as a part of larger accountability assessments and algorithm audits.
arXiv Detail & Related papers (2022-08-16T17:47:20Z)
Measuring Fairness Under Unawareness of Sensitive Attributes: A Quantification-Based Approach [131.20444904674494]
We tackle the problem of measuring group fairness under unawareness of sensitive attributes. We show that quantification approaches are particularly suited to tackle the fairness-under-unawareness problem.
arXiv Detail & Related papers (2021-09-17T13:45:46Z)
Testing Group Fairness via Optimal Transport Projections [12.972104025246091]
The proposed test is a flexible, interpretable, and statistically rigorous tool for auditing whether exhibited biases are intrinsic to the algorithm or due to the randomness in the data. The statistical challenges, which may arise from multiple impact criteria that define group fairness, are conveniently tackled by projecting the empirical measure onto the set of group-fair probability models. The proposed framework can also be used to test composite fairness hypotheses and fairness with multiple sensitive attributes.
arXiv Detail & Related papers (2021-06-02T10:51:39Z)
Causal Feature Selection for Algorithmic Fairness [61.767399505764736]
We consider fairness in the integration component of data management. We propose an approach to identify a sub-collection of features that ensure the fairness of the dataset.
arXiv Detail & Related papers (2020-06-10T20:20:10Z)

This list is automatically generated from the titles and abstracts of the papers on this site.