Balancing Tails when Comparing Distributions: Comprehensive Equity Index (CEI) with Application to Bias Evaluation in Operational Face Biometrics
- URL: http://arxiv.org/abs/2506.10564v1
- Date: Thu, 12 Jun 2025 10:43:31 GMT
- Title: Balancing Tails when Comparing Distributions: Comprehensive Equity Index (CEI) with Application to Bias Evaluation in Operational Face Biometrics
- Authors: Imanol Solano, Julian Fierrez, Aythami Morales, Alejandro Peña, Ruben Tolosana, Francisco Zamora-Martinez, Javier San Agustin
- Abstract summary: Comprehensive Equity Index (CEI) is a novel metric designed to detect demographic bias in face recognition systems. Our experiments confirm CEI's superior ability to detect nuanced biases where previous methods fall short. CEI provides a robust and sensitive tool for operational fairness assessment.
- Score: 47.762333925222926
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Demographic bias in high-performance face recognition (FR) systems often eludes detection by existing metrics, especially with respect to subtle disparities in the tails of the score distribution. We introduce the Comprehensive Equity Index (CEI), a novel metric designed to address this limitation. CEI uniquely analyzes genuine and impostor score distributions separately, enabling a configurable focus on tail probabilities while also considering overall distribution shapes. Our extensive experiments (evaluating state-of-the-art FR systems, intentionally biased models, and diverse datasets) confirm CEI's superior ability to detect nuanced biases where previous methods fall short. Furthermore, we present CEI^A, an automated version of the metric that enhances objectivity and simplifies practical application. CEI provides a robust and sensitive tool for operational FR fairness assessment. The proposed methods have been developed particularly for bias evaluation in face biometrics but, in general, they are applicable for comparing statistical distributions in any problem where one is interested in analyzing the distribution tails.
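The abstract does not give the closed-form definition of CEI, but its named ingredients (separate analysis of genuine and impostor score distributions, a configurable focus on tail probabilities, and attention to overall distribution shape) can be sketched in code. The following Python is a minimal illustration under stated assumptions: the function names, the quantile-based tail measure, and the `tail_q` and `alpha` parameters are illustrative choices, not the published CEI formula.

```python
import numpy as np

def tail_weighted_gap(scores, reference, tail_q=0.05, alpha=0.5, lower_tail=True):
    """Illustrative tail-weighted gap between two score samples.

    Mixes (i) the gap between tail quantiles and (ii) the gap between
    means, traded off by `alpha`. This mirrors the abstract's idea of a
    configurable tail focus combined with overall distribution shape;
    it is NOT the published CEI definition.
    """
    q = tail_q if lower_tail else 1.0 - tail_q
    tail_gap = abs(np.quantile(scores, q) - np.quantile(reference, q))
    body_gap = abs(np.mean(scores) - np.mean(reference))
    return alpha * tail_gap + (1.0 - alpha) * body_gap

def cei_like_index(genuine_by_group, impostor_by_group, tail_q=0.05, alpha=0.5):
    """Worst-case per-group deviation from the pooled score distributions.

    genuine_by_group / impostor_by_group: dict mapping demographic group
    -> 1-D array of comparison scores. Genuine tails are taken at the low
    end (false non-matches), impostor tails at the high end (false
    matches), and the two distributions are analyzed separately, as the
    abstract describes.
    """
    pooled_gen = np.concatenate(list(genuine_by_group.values()))
    pooled_imp = np.concatenate(list(impostor_by_group.values()))
    gen_dev = max(tail_weighted_gap(g, pooled_gen, tail_q, alpha, lower_tail=True)
                  for g in genuine_by_group.values())
    imp_dev = max(tail_weighted_gap(i, pooled_imp, tail_q, alpha, lower_tail=False)
                  for i in impostor_by_group.values())
    return gen_dev + imp_dev

# Toy usage with synthetic scores: group "B" is given a wider score spread
# than group "A", so its tails deviate more from the pooled distributions.
rng = np.random.default_rng(0)
genuine = {"A": rng.normal(0.80, 0.05, 10_000), "B": rng.normal(0.78, 0.08, 10_000)}
impostor = {"A": rng.normal(0.20, 0.05, 10_000), "B": rng.normal(0.22, 0.08, 10_000)}
print(cei_like_index(genuine, impostor, tail_q=0.02, alpha=0.7))
```

In this sketch, lowering `tail_q` concentrates the comparison on the distribution tails, where the abstract argues existing metrics miss subtle disparities; setting `alpha` near 0 recovers a purely shape-level (mean) comparison.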
Related papers
- Fair Deepfake Detectors Can Generalize [51.21167546843708]
We show that controlling for confounders (data distribution and model capacity) enables improved generalization via fairness interventions. Motivated by this insight, we propose Demographic Attribute-insensitive Intervention Detection (DAID), a plug-and-play framework composed of: i) Demographic-aware data rebalancing, which employs inverse-propensity weighting and subgroup-wise feature normalization to neutralize distributional biases; and ii) Demographic-agnostic feature aggregation, which uses a novel alignment loss to suppress sensitive-attribute signals. DAID consistently achieves superior performance in both fairness and generalization compared to several state-of-the-art detectors.
arXiv Detail & Related papers (2025-07-03T14:10:02Z) - Mitigating Bias in Facial Recognition Systems: Centroid Fairness Loss Optimization [9.537960917804993]
The societal demand for fair AI systems has put pressure on the research community to develop predictive models that meet new fairness criteria. In particular, the variability of the errors made by certain Facial Recognition (FR) systems across specific segments of the population compromises the deployment of the latter. We propose a novel post-processing approach to improve the fairness of pre-trained FR models by optimizing a regression loss which acts on centroid-based scores.
arXiv Detail & Related papers (2025-04-27T22:17:44Z) - Comprehensive Equity Index (CEI): Definition and Application to Bias Evaluation in Biometrics [47.762333925222926]
We present a novel metric to quantify biased behaviors of machine learning models.
We focus on and apply it to the operational evaluation of face recognition systems.
arXiv Detail & Related papers (2024-09-03T14:19:38Z) - FedAD-Bench: A Unified Benchmark for Federated Unsupervised Anomaly Detection in Tabular Data [11.42231457116486]
FedAD-Bench is a benchmark for evaluating unsupervised anomaly detection algorithms within the context of federated learning.
We identify key challenges such as model aggregation inefficiencies and metric unreliability.
Our work aims to establish a standardized benchmark to guide future research and development in federated anomaly detection.
arXiv Detail & Related papers (2024-08-08T13:14:19Z) - Identifying and Mitigating Social Bias Knowledge in Language Models [52.52955281662332]
We propose a novel debiasing approach, Fairness Stamp (FAST), which enables fine-grained calibration of individual social biases. FAST surpasses state-of-the-art baselines with superior debiasing performance. This highlights the potential of fine-grained debiasing strategies to achieve fairness in large language models.
arXiv Detail & Related papers (2024-08-07T17:14:58Z) - Individual Fairness Through Reweighting and Tuning [0.23395944472515745]
Inherent bias within society can be amplified and perpetuated by artificial intelligence (AI) systems.
Recently, Graph Laplacian Regularizer (GLR) has been used as a substitute for the common Lipschitz condition to enhance individual fairness.
In this work, we investigated whether defining a GLR independently on the train and target data could maintain similar accuracy.
arXiv Detail & Related papers (2024-05-02T20:15:25Z) - Evaluating Probabilistic Classifiers: The Triptych [62.997667081978825]
We propose and study a triptych of diagnostic graphics that focus on distinct and complementary aspects of forecast performance.
The reliability diagram addresses calibration, the receiver operating characteristic (ROC) curve diagnoses discrimination ability, and the Murphy diagram visualizes overall predictive performance and value.
arXiv Detail & Related papers (2023-01-25T19:35:23Z) - Free Lunch for Generating Effective Outlier Supervision [46.37464572099351]
We propose an ultra-effective method to generate near-realistic outlier supervision.
Our proposed BayesAug significantly reduces the false positive rate by over 12.50% compared with previous schemes.
arXiv Detail & Related papers (2023-01-17T01:46:45Z) - General Greedy De-bias Learning [163.65789778416172]
We propose a General Greedy De-bias learning framework (GGD), which greedily trains the biased models and the base model like gradient descent in functional space.
GGD can learn a more robust base model under the settings of both task-specific biased models with prior knowledge and self-ensemble biased model without prior knowledge.
arXiv Detail & Related papers (2021-12-20T14:47:32Z) - Domain-Incremental Continual Learning for Mitigating Bias in Facial Expression and Action Unit Recognition [5.478764356647437]
We propose the novel usage of Continual Learning (CL) as a potent bias mitigation method to enhance the fairness of FER systems.
We compare different non-CL-based and CL-based methods for their classification accuracy and fairness scores on expression recognition and Action Unit (AU) detection tasks.
Our experimental results show that CL-based methods, on average, outperform other popular bias mitigation techniques on both accuracy and fairness metrics.
arXiv Detail & Related papers (2021-03-15T18:22:17Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.