fairmetrics: An R package for group fairness evaluation
- URL: http://arxiv.org/abs/2506.06243v2
- Date: Thu, 19 Jun 2025 00:00:46 GMT
- Title: fairmetrics: An R package for group fairness evaluation
- Authors: Benjamin Smith, Jianhui Gao, Jessica Gronsbell
- Abstract summary: The fairmetrics R package offers a user-friendly framework for rigorously evaluating group-based fairness criteria. Group-based fairness criteria assess whether a model is equally accurate or well-calibrated across a set of predefined groups. fairmetrics provides both point and interval estimates for multiple metrics through a convenient wrapper function.
- Score: 0.40964539027092906
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Fairness is a growing area of machine learning (ML) that focuses on ensuring models do not produce systematically biased outcomes for specific groups, particularly those defined by protected attributes such as race, gender, or age. Evaluating fairness is a critical aspect of ML model development, as biased models can perpetuate structural inequalities. The fairmetrics R package offers a user-friendly framework for rigorously evaluating numerous group-based fairness criteria, including metrics based on independence (e.g., statistical parity), separation (e.g., equalized odds), and sufficiency (e.g., predictive parity). Group-based fairness criteria assess whether a model is equally accurate or well-calibrated across a set of predefined groups so that appropriate bias mitigation strategies can be implemented. fairmetrics provides both point and interval estimates for multiple metrics through a convenient wrapper function and includes an example dataset derived from the Medical Information Mart for Intensive Care, version II (MIMIC-II) database (Goldberger et al., 2000; Raffa, 2016).
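As a rough illustration of the quantities such group-based criteria compare, the base-R sketch below hand-computes per-group selection rates (independence), true/false positive rates (separation), and positive predictive values (sufficiency) on simulated data. The data frame, column names, and the 0.5 threshold are assumptions made for the example; this is not the fairmetrics API, whose wrapper function additionally reports interval estimates per the abstract.

```r
# Minimal sketch (not the fairmetrics API): hand-computing the three families
# of group fairness criteria on simulated data. All names and the 0.5
# threshold below are illustrative assumptions.
set.seed(1)
n  <- 1000
df <- data.frame(
  group   = sample(c("A", "B"), n, replace = TRUE),  # protected attribute
  outcome = rbinom(n, 1, 0.3),                       # observed binary label
  prob    = runif(n)                                 # model risk score
)
df$pred <- as.integer(df$prob >= 0.5)                # thresholded prediction

by_group <- split(df, df$group)

# Independence (statistical parity): P(pred = 1 | group)
sapply(by_group, function(g) mean(g$pred))

# Separation (equalized odds): TPR and FPR within each group
sapply(by_group, function(g) c(TPR = mean(g$pred[g$outcome == 1]),
                               FPR = mean(g$pred[g$outcome == 0])))

# Sufficiency (predictive parity): PPV = P(outcome = 1 | pred = 1, group)
sapply(by_group, function(g) mean(g$outcome[g$pred == 1]))
```

A fairness evaluation would then compare these quantities across groups (e.g., as differences or ratios); per the abstract, the package's wrapper function attaches confidence intervals to such estimates.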
Related papers
- Quantifying Query Fairness Under Unawareness [82.33181164973365]
We introduce a robust fairness estimator based on quantification that effectively handles multiple sensitive attributes beyond binary classifications. Our method outperforms existing baselines across various sensitive attributes and is the first to establish a reliable protocol for measuring fairness under unawareness.
arXiv Detail & Related papers (2025-06-04T16:31:44Z)
- Is Your Model Fairly Certain? Uncertainty-Aware Fairness Evaluation for LLMs [7.197702136906138]
We propose an uncertainty-aware fairness metric, UCerF, to enable a fine-grained evaluation of model fairness. Observing data size, diversity, and clarity issues in current datasets, we introduce a new gender-occupation fairness evaluation dataset. We establish a benchmark, using our metric and dataset, and apply it to evaluate the behavior of ten open-source AI systems.
arXiv Detail & Related papers (2025-05-29T20:45:18Z)
- Looking Beyond What You See: An Empirical Analysis on Subgroup Intersectional Fairness for Multi-label Chest X-ray Classification Using Social Determinants of Racial Health Inequities [4.351859373879489]
Inherited biases in deep learning models can lead to disparities in prediction accuracy across protected groups.
We propose a framework to achieve accurate diagnostic outcomes and ensure fairness across intersectional groups.
arXiv Detail & Related papers (2024-03-27T02:13:20Z)
- A structured regression approach for evaluating model performance across intersectional subgroups [53.91682617836498]
Disaggregated evaluation is a central task in AI fairness assessment, where the goal is to measure an AI system's performance across different subgroups.
We introduce a structured regression approach to disaggregated evaluation that we demonstrate can yield reliable system performance estimates even for very small subgroups.
arXiv Detail & Related papers (2024-01-26T14:21:45Z)
- DualFair: Fair Representation Learning at Both Group and Individual Levels via Contrastive Self-supervision [73.80009454050858]
This work presents a self-supervised model, called DualFair, that can debias sensitive attributes like gender and race from learned representations.
Our model jointly optimizes for two fairness criteria: group fairness and counterfactual fairness.
arXiv Detail & Related papers (2023-03-15T07:13:54Z)
- The Unbearable Weight of Massive Privilege: Revisiting Bias-Variance Trade-Offs in the Context of Fair Prediction [7.975779552420981]
We propose a conditional-iid (ciid) model that seeks to improve on the trade-offs made by a single model.
We empirically test our setup on the COMPAS and folktables datasets.
Our analysis suggests that there might be principled procedures and concrete real-world use cases under which conditional models are preferred.
arXiv Detail & Related papers (2023-02-17T05:34:35Z)
- Learning Informative Representation for Fairness-aware Multivariate Time-series Forecasting: A Group-based Perspective [50.093280002375984]
Performance unfairness among variables widely exists in multivariate time series (MTS) forecasting models.
We propose a novel framework, named FairFor, for fairness-aware MTS forecasting.
arXiv Detail & Related papers (2023-01-27T04:54:12Z)
- Measuring Fairness of Text Classifiers via Prediction Sensitivity [63.56554964580627]
ACCUMULATED PREDICTION SENSITIVITY measures fairness in machine learning models based on the model's prediction sensitivity to perturbations in input features.
We show that the metric can be theoretically linked with a specific notion of group fairness (statistical parity) and individual fairness.
arXiv Detail & Related papers (2022-03-16T15:00:33Z)
- Causal Feature Selection for Algorithmic Fairness [61.767399505764736]
We consider fairness in the integration component of data management.
We propose an approach to identify a sub-collection of features that ensure the fairness of the dataset.
arXiv Detail & Related papers (2020-06-10T20:20:10Z)
- Fairness by Explicability and Adversarial SHAP Learning [0.0]
We propose a new definition of fairness that emphasises the role of an external auditor and model explicability.
We develop a framework for mitigating model bias using regularizations constructed from the SHAP values of an adversarial surrogate model.
We demonstrate our approaches using gradient and adaptive boosting on a synthetic dataset, the UCI Adult (Census) dataset, and a real-world credit scoring dataset.
arXiv Detail & Related papers (2020-03-11T14:36:34Z)
This list is automatically generated from the titles and abstracts of the papers on this site.