Algorithmic Factors Influencing Bias in Machine Learning
- URL: http://arxiv.org/abs/2104.14014v1
- Date: Wed, 28 Apr 2021 20:45:41 GMT
- Title: Algorithmic Factors Influencing Bias in Machine Learning
- Authors: William Blanzeisky, Pádraig Cunningham
- Abstract summary: We show how irreducible error, regularization and feature and class imbalance can contribute to this underestimation.
The paper concludes with a demonstration of how the careful management of synthetic counterfactuals can ameliorate the impact of this underestimation bias.
- Score: 2.055949720959582
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: It is fair to say that many of the prominent examples of bias in
Machine Learning (ML) arise from bias already present in the training data. In
fact, some would argue that supervised ML algorithms cannot be biased; they
simply reflect the data on which they are trained. In this paper we demonstrate
how ML algorithms can misrepresent the training data through underestimation.
We show how irreducible error, regularization, and feature and class imbalance
can contribute to this underestimation. The paper concludes with a
demonstration of how the careful management of synthetic counterfactuals can
ameliorate the impact of this underestimation bias.
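The mechanism the abstract describes can be illustrated with a minimal sketch, assuming nothing from the paper's own experiments: a heavily L2-regularized logistic regression, fit by plain gradient descent on synthetic class-imbalanced data with a weak (partly irreducible) signal, predicts the minority class less often than it actually occurs in its own training data. All names and parameter values here are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_imbalanced(n=2000, pos_frac=0.1):
    """One informative feature; positives are rare and only mildly separated,
    so some error is irreducible."""
    y = (rng.random(n) < pos_frac).astype(float)
    x = rng.normal(loc=y * 1.0, scale=1.0)       # weak signal
    return x.reshape(-1, 1), y

def fit_logreg(X, y, l2=0.0, lr=0.1, steps=2000):
    """Logistic regression by gradient descent with an L2 penalty."""
    Xb = np.hstack([X, np.ones((len(X), 1))])    # append bias column
    w = np.zeros(Xb.shape[1])
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-Xb @ w))
        grad = Xb.T @ (p - y) / len(y) + l2 * w
        w -= lr * grad
    return w

X, y = make_imbalanced()
w = fit_logreg(X, y, l2=5.0)                     # heavy regularization
Xb = np.hstack([X, np.ones((len(X), 1))])
pred_pos_rate = (1.0 / (1.0 + np.exp(-Xb @ w)) > 0.5).mean()
actual_pos_rate = y.mean()

# Underestimation: the model predicts the positive class less often than it
# appears in the training data (often not at all under heavy regularization).
print(actual_pos_rate, pred_pos_rate)
```

With the regularizer shrinking the weights toward zero and positives in the minority, the decision threshold is essentially never crossed, so the predicted positive rate falls well below the empirical one.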
Related papers
- Debiasing Machine Unlearning with Counterfactual Examples [31.931056076782202]
We analyze the causal factors behind the unlearning process and mitigate biases at both data and algorithmic levels.
We introduce an intervention-based approach, where knowledge to forget is erased with a debiased dataset.
Our method outperforms existing machine unlearning baselines on evaluation metrics.
arXiv Detail & Related papers (2024-04-24T09:33:10Z)
- Correcting Underrepresentation and Intersectional Bias for Classification [49.1574468325115]
We consider the problem of learning from data corrupted by underrepresentation bias.
We show that with a small amount of unbiased data, we can efficiently estimate the group-wise drop-out rates.
We show that our algorithm permits efficient learning for model classes of finite VC dimension.
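The drop-out estimation idea above can be sketched under a deliberately simplified censoring model of our own (not the paper's actual estimator): assume each example from group g survives into the biased dataset with unknown probability r_g, and compare group frequencies in the biased data against a small unbiased sample. Rates are then identifiable up to a common scale.

```python
import numpy as np

rng = np.random.default_rng(1)

true_retention = {"A": 0.9, "B": 0.4}   # group B is heavily under-represented

# Draw a large population, then censor it to form the biased training set.
population = rng.choice(["A", "B"], size=50_000, p=[0.5, 0.5])
keep = np.array([rng.random() < true_retention[g] for g in population])
biased = population[keep]

# A small unbiased sample reveals the true group proportions.
unbiased = rng.choice(["A", "B"], size=1_000, p=[0.5, 0.5])

def estimate_retention(biased, unbiased, groups=("A", "B")):
    """retention_g is proportional to (biased frequency of g) / (true
    proportion of g); normalize so the largest estimate is 1, since without
    the pre-censoring sample size the rates are only known up to scale."""
    est = {}
    for g in groups:
        true_prop = (unbiased == g).mean()
        est[g] = (biased == g).mean() / true_prop
    scale = max(est.values())
    return {g: v / scale for g, v in est.items()}

est = estimate_retention(biased, unbiased)
print(est)  # B's relative retention should come out near 0.4 / 0.9 ≈ 0.44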
arXiv Detail & Related papers (2023-06-19T18:25:44Z)
- D-BIAS: A Causality-Based Human-in-the-Loop System for Tackling Algorithmic Bias [57.87117733071416]
We propose D-BIAS, a visual interactive tool that embodies human-in-the-loop AI approach for auditing and mitigating social biases.
A user can detect the presence of bias against a group by identifying unfair causal relationships in the causal network.
For each interaction, say weakening/deleting a biased causal edge, the system uses a novel method to simulate a new (debiased) dataset.
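The edge-editing idea can be sketched on a toy linear structural causal model of our own: an outcome depends on a sensitive attribute both directly (the biased edge) and through a legitimate mediator. "Deleting" the direct edge means regenerating the outcome with that coefficient set to zero. This illustrates the concept only; it is not the D-BIAS simulation method.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 10_000

sensitive = rng.integers(0, 2, size=n)            # group membership
mediator = 0.5 * sensitive + rng.normal(size=n)   # legitimate pathway

def simulate_outcome(direct_edge_weight):
    """Outcome = direct (biased) effect + mediated effect + noise."""
    return direct_edge_weight * sensitive + 1.0 * mediator + rng.normal(size=n)

biased_outcome = simulate_outcome(direct_edge_weight=2.0)
debiased_outcome = simulate_outcome(direct_edge_weight=0.0)  # edge deleted

def group_gap(outcome):
    return outcome[sensitive == 1].mean() - outcome[sensitive == 0].mean()

print(group_gap(biased_outcome))    # ≈ 2.5 (direct 2.0 + mediated 0.5)
print(group_gap(debiased_outcome))  # ≈ 0.5 (only the mediated path remains)
```

Deleting the edge removes the direct disparity while leaving the mediated effect intact, which is the kind of targeted intervention the tool exposes interactively.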
arXiv Detail & Related papers (2022-08-10T03:41:48Z)
- Understanding Unfairness in Fraud Detection through Model and Data Bias Interactions [4.159343412286401]
We argue that algorithmic unfairness stems from interactions between models and biases in the data.
We study a set of hypotheses regarding the fairness-accuracy trade-offs that fairness-blind ML algorithms exhibit under different data bias settings.
arXiv Detail & Related papers (2022-07-13T15:18:30Z)
- Fair Group-Shared Representations with Normalizing Flows [68.29997072804537]
We develop a fair representation learning algorithm which is able to map individuals belonging to different groups in a single group.
We show experimentally that our methodology is competitive with other fair representation learning algorithms.
arXiv Detail & Related papers (2022-01-17T10:49:49Z)
- Can Active Learning Preemptively Mitigate Fairness Issues? [66.84854430781097]
Dataset bias is one of the prevailing causes of unfairness in machine learning.
We study whether models trained with uncertainty-based active learning are fairer in their decisions with respect to a protected class.
We also explore the interaction of algorithmic fairness methods such as gradient reversal (GRAD) and BALD.
arXiv Detail & Related papers (2021-04-14T14:20:22Z)
- On Statistical Bias In Active Learning: How and When To Fix It [42.768124675364376]
Active learning is a powerful tool when labelling data is expensive.
It introduces a bias because the training data no longer follows the population distribution.
We formalize this bias and investigate the situations in which it can be harmful and sometimes even helpful.
arXiv Detail & Related papers (2021-01-27T19:52:24Z)
- Fairness Constraints in Semi-supervised Learning [56.48626493765908]
We develop a framework for fair semi-supervised learning, which is formulated as an optimization problem.
We theoretically analyze the source of discrimination in semi-supervised learning via bias, variance and noise decomposition.
Our method is able to achieve fair semi-supervised learning, and reach a better trade-off between accuracy and fairness than fair supervised learning.
arXiv Detail & Related papers (2020-09-14T04:25:59Z)
- Underestimation Bias and Underfitting in Machine Learning [2.639737913330821]
Often what is termed algorithmic bias in machine learning is due to historic bias in the training data.
Sometimes the bias may be introduced (or at least exacerbated) by the algorithm itself.
In this paper we report on initial research to understand the factors that contribute to bias in classification algorithms.
arXiv Detail & Related papers (2020-05-18T20:01:56Z)
- Leveraging Semi-Supervised Learning for Fairness using Neural Networks [49.604038072384995]
There has been a growing concern about the fairness of decision-making systems based on machine learning.
In this paper, we propose a semi-supervised algorithm using neural networks benefiting from unlabeled data.
The proposed model, called SSFair, exploits the information in the unlabeled data to mitigate the bias in the training data.
arXiv Detail & Related papers (2019-12-31T09:11:26Z)
- Recovering from Biased Data: Can Fairness Constraints Improve Accuracy? [11.435833538081557]
Empirical Risk Minimization (ERM) may produce a classifier that not only is biased but also has suboptimal accuracy on the true data distribution.
We examine the ability of fairness-constrained ERM to correct this problem.
We also consider other recovery methods including reweighting the training data, Equalized Odds, and Demographic Parity.
arXiv Detail & Related papers (2019-12-02T22:00:14Z)
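Two of the recovery criteria named in the last entry, Demographic Parity and Equalized Odds, have standard gap formulations that can be computed directly from held-out predictions; the function names below are our own. Demographic parity compares positive-prediction rates across groups, while equalized odds compares true- and false-positive rates.

```python
import numpy as np

def demographic_parity_gap(y_pred, group):
    """Largest difference in positive-prediction rate between any two groups."""
    rates = [y_pred[group == g].mean() for g in np.unique(group)]
    return max(rates) - min(rates)

def equalized_odds_gap(y_true, y_pred, group):
    """Worst of the FPR gap (over true label 0) and TPR gap (over true label 1)."""
    gaps = []
    for label in (0, 1):
        mask = y_true == label
        rates = [y_pred[mask & (group == g)].mean() for g in np.unique(group)]
        gaps.append(max(rates) - min(rates))
    return max(gaps)

# Toy predictions: group 1 receives positive predictions more often.
y_true = np.array([1, 1, 0, 0, 1, 1, 0, 0])
group  = np.array([0, 0, 0, 0, 1, 1, 1, 1])
y_pred = np.array([1, 0, 0, 0, 1, 1, 1, 0])

print(demographic_parity_gap(y_pred, group))      # → 0.5 (rates 0.75 vs 0.25)
print(equalized_odds_gap(y_true, y_pred, group))  # → 0.5 (TPR 1.0 vs 0.5)
```

A constrained or reweighted learner drives one of these gaps toward zero during training; whether that also recovers accuracy on the true distribution is exactly the question that paper examines.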
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.