Comprehensive Equity Index (CEI): Definition and Application to Bias Evaluation in Biometrics
- URL: http://arxiv.org/abs/2409.01928v1
- Date: Tue, 3 Sep 2024 14:19:38 GMT
- Title: Comprehensive Equity Index (CEI): Definition and Application to Bias Evaluation in Biometrics
- Authors: Imanol Solano, Alejandro Peña, Aythami Morales, Julian Fierrez, Ruben Tolosana, Francisco Zamora-Martinez, Javier San Agustin
- Abstract summary: We present a novel metric to quantify biased behaviors of machine learning models.
We focus on and apply it to the operational evaluation of face recognition systems.
- Score: 47.762333925222926
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: We present a novel metric designed, among other applications, to quantify biased behaviors of machine learning models. At its core, the metric consists of a new similarity metric between score distributions that balances both their general shapes and tails' probabilities. In that sense, our proposed metric may be useful in many application areas. Here we focus on and apply it to the operational evaluation of face recognition systems, with special attention to quantifying demographic biases, an application where our metric is especially useful. The topic of demographic bias and fairness in biometric recognition systems has gained major attention in recent years. The usage of these systems has spread in society, raising concerns about the extent to which these systems treat different population groups. A relevant step to prevent and mitigate demographic biases is first to detect and quantify them. Traditionally, two approaches have been studied to quantify differences between population groups in machine learning literature: 1) measuring differences in error rates, and 2) measuring differences in recognition score distributions. Our proposed Comprehensive Equity Index (CEI) trades off both approaches, combining errors from distribution tails with general distribution shapes. This new metric is well suited to real-world scenarios, as measured on NIST FRVT evaluations, involving high-performance systems and realistic face databases including a wide range of covariates and demographic groups. We first show the limitations of existing metrics to correctly assess the presence of biases in realistic setups and then propose our new metric to tackle these limitations. We tested the proposed metric with two state-of-the-art models and four widely used databases, showing its capacity to overcome the main flaws of previous bias metrics.
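The abstract does not reproduce the exact CEI formulation, but it names its two ingredients: a tail-error term (operational error rates) and a distribution-shape term, compared across demographic groups. The following is a minimal Python sketch of that idea under stated assumptions; the tail measure (false match rate at a fixed threshold), the shape measure (Wasserstein distance), the weight `alpha`, and all function names are illustrative choices, not the authors' definition.

```python
# Illustrative sketch only: combines a tail-probability term and a
# distribution-shape term in the spirit described by the abstract.
# All weights and distance choices are assumptions, not the published CEI.
import numpy as np
from scipy.stats import wasserstein_distance


def tail_error(impostor_scores, threshold):
    """Fraction of impostor comparisons at or above the decision threshold (FMR)."""
    return float(np.mean(np.asarray(impostor_scores) >= threshold))


def cei_like_index(scores_group_a, scores_group_b, threshold, alpha=0.5):
    """Toy equity index between two demographic groups.

    Combines the gap in tail errors (operational error rates) with the
    Wasserstein distance between the full score distributions (general
    shape), weighted by `alpha`.
    """
    tail_gap = abs(tail_error(scores_group_a, threshold)
                   - tail_error(scores_group_b, threshold))
    shape_gap = wasserstein_distance(scores_group_a, scores_group_b)
    return alpha * tail_gap + (1.0 - alpha) * shape_gap


# Example with synthetic impostor scores for two demographic groups.
rng = np.random.default_rng(0)
group_a = rng.normal(0.30, 0.05, 10_000)  # impostor scores, group A
group_b = rng.normal(0.33, 0.07, 10_000)  # impostor scores, group B
print(cei_like_index(group_a, group_b, threshold=0.45))
```

With `alpha` close to 1 this toy index reduces to a pure error-rate comparison, and with `alpha` close to 0 to a pure distribution-shape comparison, which mirrors the trade-off between the two traditional approaches described in the abstract.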
Related papers
- Bias and Fairness in Large Language Models: A Survey [73.87651986156006]
We present a comprehensive survey of bias evaluation and mitigation techniques for large language models (LLMs).
We first consolidate, formalize, and expand notions of social bias and fairness in natural language processing.
We then unify the literature by proposing three intuitive taxonomies: two for bias evaluation and one for mitigation.
arXiv Detail & Related papers (2023-09-02T00:32:55Z)
- Fairness Index Measures to Evaluate Bias in Biometric Recognition [0.0]
A quantitative evaluation of demographic fairness is an important step towards understanding, assessment, and mitigation of demographic bias in biometric applications.
We introduce multiple measures, based on the statistical characteristics of score distributions, for the evaluation of demographic fairness of a generic biometric verification system.
arXiv Detail & Related papers (2023-06-19T13:28:37Z)
- Metrics for Dataset Demographic Bias: A Case Study on Facial Expression Recognition [4.336779198334903]
One of the most prominent types of demographic bias are statistical imbalances in the representation of demographic groups in the datasets.
We develop a taxonomy for the classification of these metrics, providing a practical guide for the selection of appropriate metrics.
The paper provides valuable insights for researchers in AI and related fields to mitigate dataset bias and improve the fairness and accuracy of AI models.
arXiv Detail & Related papers (2023-03-28T11:04:18Z)
- Fairness meets Cross-Domain Learning: a new perspective on Models and Metrics [80.07271410743806]
We study the relationship between cross-domain learning (CD) and model fairness.
We introduce a benchmark on face and medical images spanning several demographic groups as well as classification and localization tasks.
Our study covers 14 CD approaches alongside three state-of-the-art fairness algorithms and shows how the former can outperform the latter.
arXiv Detail & Related papers (2023-03-25T09:34:05Z)
- Stable Bias: Analyzing Societal Representations in Diffusion Models [72.27121528451528]
We propose a new method for exploring the social biases in Text-to-Image (TTI) systems.
Our approach relies on characterizing the variation in generated images triggered by enumerating gender and ethnicity markers in the prompts.
We leverage this method to analyze images generated by 3 popular TTI systems and find that while all of their outputs show correlations with US labor demographics, they also consistently under-represent marginalized identities to different extents.
arXiv Detail & Related papers (2023-03-20T19:32:49Z)
- Analysis and Comparison of Classification Metrics [12.092755413404245]
Metrics for measuring the quality of system scores include the area under the ROC curve, equal error rate, cross-entropy, Brier score, and Bayes EC or Bayes risk.
We show how to use these metrics to compute a system's calibration loss and compare it with the widely used expected calibration error (ECE); a minimal ECE sketch is included after this list.
arXiv Detail & Related papers (2022-09-12T16:06:10Z)
- D-BIAS: A Causality-Based Human-in-the-Loop System for Tackling Algorithmic Bias [57.87117733071416]
We propose D-BIAS, a visual interactive tool that embodies a human-in-the-loop AI approach for auditing and mitigating social biases.
A user can detect the presence of bias against a group by identifying unfair causal relationships in the causal network.
For each interaction, say weakening/deleting a biased causal edge, the system uses a novel method to simulate a new (debiased) dataset.
arXiv Detail & Related papers (2022-08-10T03:41:48Z)
- Measuring Fairness of Text Classifiers via Prediction Sensitivity [63.56554964580627]
ACCUMULATED PREDICTION SENSITIVITY measures fairness in machine learning models based on the model's prediction sensitivity to perturbations in input features.
We show that the metric can be theoretically linked with a specific notion of group fairness (statistical parity) and individual fairness.
arXiv Detail & Related papers (2022-03-16T15:00:33Z)
- Measure Twice, Cut Once: Quantifying Bias and Fairness in Deep Neural Networks [7.763173131630868]
We propose two metrics to quantitatively evaluate the class-wise bias of two models in comparison to one another.
By evaluating the performance of these new metrics and by demonstrating their practical application, we show that they can be used to measure fairness as well as bias.
arXiv Detail & Related papers (2021-10-08T22:35:34Z)
- Domain-Incremental Continual Learning for Mitigating Bias in Facial Expression and Action Unit Recognition [5.478764356647437]
We propose the novel usage of Continual Learning (CL) as a potent bias mitigation method to enhance the fairness of facial expression recognition (FER) systems.
We compare different non-CL-based and CL-based methods for their classification accuracy and fairness scores on expression recognition and Action Unit (AU) detection tasks.
Our experimental results show that CL-based methods, on average, outperform other popular bias mitigation techniques on both accuracy and fairness metrics.
arXiv Detail & Related papers (2021-03-15T18:22:17Z)
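The entry on "Analysis and Comparison of Classification Metrics" above compares a calibration-loss metric against the expected calibration error (ECE). As a self-contained point of reference, here is a minimal sketch of the standard binned ECE; the equal-width binning and the bin count are conventional choices and are not taken from that paper.

```python
# Minimal sketch of the standard binned ECE (equal-width bins assumed).
import numpy as np


def expected_calibration_error(confidences, correct, n_bins=10):
    """Weighted average of |bin accuracy - bin confidence| over confidence bins.

    confidences: predicted probability of the predicted class, shape (N,)
    correct:     1 if the prediction was correct, else 0, shape (N,)
    """
    confidences = np.asarray(confidences, dtype=float)
    correct = np.asarray(correct, dtype=float)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        in_bin = (confidences > lo) & (confidences <= hi)
        if not np.any(in_bin):
            continue
        weight = np.mean(in_bin)                  # fraction of samples in this bin
        bin_conf = np.mean(confidences[in_bin])   # average predicted confidence
        bin_acc = np.mean(correct[in_bin])        # empirical accuracy in the bin
        ece += weight * abs(bin_acc - bin_conf)
    return ece


# Example with toy predictions.
print(expected_calibration_error([0.9, 0.8, 0.6, 0.55], [1, 1, 0, 1]))
```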