Bias in Machine Learning Software: Why? How? What to do?
- URL: http://arxiv.org/abs/2105.12195v1
- Date: Tue, 25 May 2021 20:15:50 GMT
- Title: Bias in Machine Learning Software: Why? How? What to do?
- Authors: Joymallya Chakraborty, Suvodeep Majumder, Tim Menzies
- Abstract summary: This paper postulates that the root causes of bias are the prior decisions that affect (a) what data was selected and (b) the labels assigned to those examples.
Our Fair-SMOTE algorithm removes biased labels and rebalances internal distributions so that, for each value of the sensitive attribute, examples are equally represented in the positive and negative classes.
- Score: 15.525314212209564
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Increasingly, software is making autonomous decisions in areas such as
criminal sentencing, credit card approval, hiring, and so on. Some of these
decisions show bias and adversely affect certain social groups (e.g. those
defined by sex, race, age, marital status). Many prior works on bias mitigation
take the following form: change the data or learners in multiple ways, then see
if any of that improves fairness. Perhaps a better approach is to postulate
root causes of bias and then apply some resolution strategy. This paper
postulates that the root causes of bias are the prior decisions that affect
(a) what data was selected and (b) the labels assigned to those examples. Our
Fair-SMOTE algorithm removes biased labels and rebalances internal
distributions so that, for each value of the sensitive attribute, examples are
equally represented in the positive and negative classes. On testing, it was seen that this method
was just as effective at reducing bias as prior approaches. Further, models
generated via Fair-SMOTE achieve higher performance (measured in terms of
recall and F1) than other state-of-the-art fairness improvement algorithms. To
the best of our knowledge, measured in terms of number of analyzed learners and
datasets, this study is one of the largest studies on bias mitigation yet
presented in the literature.
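The rebalancing idea in the abstract can be illustrated with a minimal sketch. This is not the authors' Fair-SMOTE implementation: it substitutes plain random duplication for synthetic example generation, it omits the biased-label removal step entirely, and the names `rebalance_subgroups`, `subgroup_counts`, `sex`, and `label` are hypothetical, chosen only for illustration. It assumes a single sensitive attribute and that every (sensitive value, label) subgroup has at least one example.

```python
import random
from collections import defaultdict


def subgroup_counts(rows, sensitive_key, label_key):
    """Count rows in each (sensitive value, label) subgroup."""
    counts = defaultdict(int)
    for row in rows:
        counts[(row[sensitive_key], row[label_key])] += 1
    return dict(counts)


def rebalance_subgroups(rows, sensitive_key, label_key, seed=0):
    """Oversample so every observed subgroup reaches the largest subgroup's size.

    Assumption: new rows are copies of randomly chosen existing rows.
    A SMOTE-style method would generate synthetic rows instead.
    """
    rng = random.Random(seed)
    groups = defaultdict(list)
    for row in rows:
        groups[(row[sensitive_key], row[label_key])].append(row)
    if not groups:
        return []
    target = max(len(members) for members in groups.values())
    balanced = []
    for members in groups.values():
        balanced.extend(members)
        # Top up smaller subgroups with copies until they reach the target size.
        extra = target - len(members)
        balanced.extend(dict(rng.choice(members)) for _ in range(extra))
    return balanced


# Toy data (invented for this sketch): subgroup sizes are 5, 3, 1, and 4.
toy_rows = (
    [{"sex": "M", "label": 1} for _ in range(5)]
    + [{"sex": "M", "label": 0} for _ in range(3)]
    + [{"sex": "F", "label": 1} for _ in range(1)]
    + [{"sex": "F", "label": 0} for _ in range(4)]
)
balanced_rows = rebalance_subgroups(toy_rows, "sex", "label")
print(subgroup_counts(balanced_rows, "sex", "label"))
```

After rebalancing, each of the four subgroups in the toy data contains five rows, so the ratio of positive to negative labels is the same for both values of the sensitive attribute. A subgroup that is completely absent from the input cannot be topped up this way and stays absent.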
Related papers
- A Principled Approach for a New Bias Measure [7.352247786388098]
We propose the definition of Uniform Bias (UB), the first bias measure with a clear and simple interpretation in the full range of bias values.
Our results are validated experimentally on nine publicly available datasets and analyzed theoretically, which provides novel insights about the problem.
Based on our approach, we also design a bias mitigation model that might be useful to policymakers.
arXiv Detail & Related papers (2024-05-20T18:14:33Z)
- How to be fair? A study of label and selection bias [3.018638214344819]
It is widely accepted that biased data leads to biased and potentially unfair models.
Several measures for bias in data and model predictions have been proposed, as well as bias mitigation techniques.
Despite the myriad of mitigation techniques developed in the past decade, it is still poorly understood under what circumstances which methods work.
arXiv Detail & Related papers (2024-03-21T10:43:55Z)
- Classes Are Not Equal: An Empirical Study on Image Recognition Fairness [100.36114135663836]
We experimentally demonstrate that classes are not equal and the fairness issue is prevalent for image classification models across various datasets.
Our findings reveal that models tend to exhibit greater prediction biases for classes that are more challenging to recognize.
Data augmentation and representation learning algorithms improve overall performance by promoting fairness to some degree in image classification.
arXiv Detail & Related papers (2024-02-28T07:54:50Z)
- Dissecting Causal Biases [0.0]
This paper focuses on a class of bias originating in the way training data is generated and/or collected.
Four sources of bias are considered, namely, confounding, selection, measurement, and interaction.
arXiv Detail & Related papers (2023-10-20T09:12:10Z)
- Mitigating Bias for Question Answering Models by Tracking Bias Influence [84.66462028537475]
We propose BMBI, an approach to mitigate the bias of multiple-choice QA models.
Based on the intuition that a model would lean to be more biased if it learns from a biased example, we measure the bias level of a query instance.
We show that our method could be applied to multiple QA formulations across multiple bias categories.
arXiv Detail & Related papers (2023-10-13T00:49:09Z)
- Fairness and Bias in Truth Discovery Algorithms: An Experimental Analysis [7.575734557466221]
Crowd workers may sometimes provide unreliable labels.
Truth discovery (TD) algorithms are applied to determine the consensus labels from conflicting worker responses.
We conduct a systematic study of the bias and fairness of TD algorithms.
arXiv Detail & Related papers (2023-04-25T04:56:35Z)
- When mitigating bias is unfair: multiplicity and arbitrariness in algorithmic group fairness [8.367620276482056]
We introduce the FRAME (FaiRness Arbitrariness and Multiplicity Evaluation) framework, which evaluates bias mitigation through five dimensions.
Applying FRAME to various bias mitigation approaches across key datasets allows us to exhibit significant differences in the behaviors of debiasing methods.
These findings highlight the limitations of current fairness criteria and the inherent arbitrariness in the debiasing process.
arXiv Detail & Related papers (2023-02-14T16:53:52Z)
- D-BIAS: A Causality-Based Human-in-the-Loop System for Tackling Algorithmic Bias [57.87117733071416]
We propose D-BIAS, a visual interactive tool that embodies a human-in-the-loop AI approach for auditing and mitigating social biases.
A user can detect the presence of bias against a group by identifying unfair causal relationships in the causal network.
For each interaction, say weakening/deleting a biased causal edge, the system uses a novel method to simulate a new (debiased) dataset.
arXiv Detail & Related papers (2022-08-10T03:41:48Z)
- The SAME score: Improved cosine based bias score for word embeddings [49.75878234192369]
We introduce SAME, a novel bias score for semantic bias in embeddings.
We show that SAME is capable of measuring semantic bias and identifying potential causes of social bias in downstream tasks.
arXiv Detail & Related papers (2022-03-28T09:28:13Z)
- Pseudo Bias-Balanced Learning for Debiased Chest X-ray Classification [57.53567756716656]
We study the problem of developing debiased chest X-ray diagnosis models without knowing exactly the bias labels.
We propose a novel algorithm, pseudo bias-balanced learning, which first captures and predicts per-sample bias labels.
Our proposed method achieved consistent improvements over other state-of-the-art approaches.
arXiv Detail & Related papers (2022-03-18T11:02:18Z)
- Statistical discrimination in learning agents [64.78141757063142]
Statistical discrimination emerges in agent policies as a function of both the bias in the training population and of agent architecture.
We show that less discrimination emerges with agents that use recurrent neural networks, and when their training environment has less bias.
arXiv Detail & Related papers (2021-10-21T18:28:57Z)
This list is automatically generated from the titles and abstracts of the papers on this site.