Measure Twice, Cut Once: Quantifying Bias and Fairness in Deep Neural
Networks
- URL: http://arxiv.org/abs/2110.04397v1
- Date: Fri, 8 Oct 2021 22:35:34 GMT
- Title: Measure Twice, Cut Once: Quantifying Bias and Fairness in Deep Neural
Networks
- Authors: Cody Blakeney, Gentry Atkinson, Nathaniel Huish, Yan Yan, Vangelis
Metsis, Ziliang Zong
- Abstract summary: We propose two metrics to quantitatively evaluate the class-wise bias of two models in comparison to one another.
By evaluating the performance of these new metrics and by demonstrating their practical application, we show that they can be used to measure fairness as well as bias.
- Score: 7.763173131630868
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Algorithmic bias is of increasing concern, both to the research
community and to society at large. Bias in AI is more abstract and unintuitive than
traditional forms of discrimination and can be more difficult to detect and
mitigate. A clear gap exists in the current literature on evaluating the
relative bias in the performance of multi-class classifiers. In this work, we
propose two simple yet effective metrics, Combined Error Variance (CEV) and
Symmetric Distance Error (SDE), to quantitatively evaluate the class-wise bias
of two models in comparison to one another. By evaluating the performance of
these new metrics and by demonstrating their practical application, we show
that they can be used to measure fairness as well as bias. These demonstrations
show that our metrics can address specific needs for measuring bias in
multi-class classification.
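The abstract names CEV and SDE but does not spell out their formulas. As a rough, hypothetical sketch only: one plausible reading is to compare per-class relative changes in false-positive and false-negative rates between a baseline model and a modified one, take a CEV-like value as the spread (combined variance) of those per-class changes, and take an SDE-like value as their mean FP/FN asymmetry. The function names and the exact normalization below are illustrative assumptions, not the paper's definitions.

```python
# Hypothetical CEV/SDE-style comparison of two multi-class models.
# The exact formulas are assumptions based on the abstract, not the paper.
import numpy as np

def per_class_error_rates(y_true, y_pred, num_classes):
    """One-vs-rest false-positive and false-negative rates per class."""
    fpr = np.zeros(num_classes)
    fnr = np.zeros(num_classes)
    for c in range(num_classes):
        pos = y_true == c
        neg = ~pos
        fpr[c] = np.mean(y_pred[neg] == c) if neg.any() else 0.0
        fnr[c] = np.mean(y_pred[pos] != c) if pos.any() else 0.0
    return fpr, fnr

def cev_sde(y_true, pred_base, pred_new, num_classes, eps=1e-8):
    """Summarize how a new model's class-wise errors shift vs. a baseline.

    CEV-like value: spread of per-class (dFPR, dFNR) points around their
    centroid; low when error changes are uniform across classes.
    SDE-like value: mean |dFNR - dFPR|; low when false-positive and
    false-negative changes stay balanced within each class.
    """
    fpr0, fnr0 = per_class_error_rates(y_true, pred_base, num_classes)
    fpr1, fnr1 = per_class_error_rates(y_true, pred_new, num_classes)
    d_fpr = (fpr1 - fpr0) / (fpr0 + eps)  # relative change; eps avoids /0
    d_fnr = (fnr1 - fnr0) / (fnr0 + eps)
    pts = np.stack([d_fpr, d_fnr], axis=1)
    cev = float(np.mean(np.sum((pts - pts.mean(axis=0)) ** 2, axis=1)))
    sde = float(np.mean(np.abs(d_fnr - d_fpr)))
    return cev, sde
```

Under this reading, a large CEV-like value would flag that a compressed or debiased model shifts its errors unevenly across classes, while a large SDE-like value would flag that the shift is skewed toward false positives or false negatives.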
Related papers
- Comprehensive Equity Index (CEI): Definition and Application to Bias Evaluation in Biometrics [47.762333925222926]
We present a novel metric to quantify biased behaviors of machine learning models.
We focus on and apply it to the operational evaluation of face recognition systems.
arXiv Detail & Related papers (2024-09-03T14:19:38Z)
- Does Machine Bring in Extra Bias in Learning? Approximating Fairness in Models Promptly [2.002741592555996]
Existing techniques for assessing the discrimination level of machine learning models include commonly used group and individual fairness measures.
We propose a "harmonic fairness measure via manifold (HFM)" based on distances between sets.
Empirical results indicate that the proposed fairness measure HFM is valid and that the proposed ApproxDist is effective and efficient.
arXiv Detail & Related papers (2024-05-15T11:07:40Z)
- Fair Enough: Standardizing Evaluation and Model Selection for Fairness Research in NLP [64.45845091719002]
Modern NLP systems exhibit a range of biases, which a growing literature on model debiasing attempts to correct.
This paper seeks to clarify the current situation and plot a course for meaningful progress in fair learning.
arXiv Detail & Related papers (2023-02-11T14:54:00Z)
- Parametric Classification for Generalized Category Discovery: A Baseline Study [70.73212959385387]
Generalized Category Discovery (GCD) aims to discover novel categories in unlabelled datasets using knowledge learned from labelled samples.
We investigate the failure of parametric classifiers, verify the effectiveness of previous design choices when high-quality supervision is available, and identify unreliable pseudo-labels as a key problem.
We propose a simple yet effective parametric classification method that benefits from entropy regularisation, achieves state-of-the-art performance on multiple GCD benchmarks and shows strong robustness to unknown class numbers.
arXiv Detail & Related papers (2022-11-21T18:47:11Z)
- Ensembling over Classifiers: a Bias-Variance Perspective [13.006468721874372]
We build upon the extension to the bias-variance decomposition by Pfau (2013) in order to gain crucial insights into the behavior of ensembles of classifiers.
We show that conditional estimates necessarily incur an irreducible error.
Empirically, standard ensembling reduces the bias, leading us to hypothesize that ensembles of classifiers may perform well in part because of this unexpected reduction.
arXiv Detail & Related papers (2022-06-21T17:46:35Z)
- The SAME score: Improved cosine based bias score for word embeddings [49.75878234192369]
We introduce SAME, a novel bias score for semantic bias in embeddings.
We show that SAME is capable of measuring semantic bias and identify potential causes for social bias in downstream tasks.
arXiv Detail & Related papers (2022-03-28T09:28:13Z)
- Information-Theoretic Bias Reduction via Causal View of Spurious Correlation [71.9123886505321]
We propose an information-theoretic bias measurement technique through a causal interpretation of spurious correlation.
We present a novel debiasing framework against algorithmic bias that incorporates a bias regularization loss.
The proposed bias measurement and debiasing approaches are validated in diverse realistic scenarios.
arXiv Detail & Related papers (2022-01-10T01:19:31Z)
- Information-Theoretic Bias Assessment Of Learned Representations Of Pretrained Face Recognition [18.07966649678408]
We propose an information-theoretic, independent bias assessment metric to identify the degree of bias against protected demographic attributes.
Our metric differs from methods that rely on classification accuracy or that compare ground-truth protected attributes with labels predicted by a shallow network.
arXiv Detail & Related papers (2021-11-08T17:41:17Z)
- Fairness-aware Class Imbalanced Learning [57.45784950421179]
We evaluate long-tail learning methods for tweet sentiment and occupation classification.
We extend a margin-loss based approach with methods to enforce fairness.
arXiv Detail & Related papers (2021-09-21T22:16:30Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.