Algorithmic Factors Influencing Bias in Machine Learning
- URL: http://arxiv.org/abs/2104.14014v1
- Date: Wed, 28 Apr 2021 20:45:41 GMT
- Title: Algorithmic Factors Influencing Bias in Machine Learning
- Authors: William Blanzeisky, Pádraig Cunningham
- Abstract summary: We show how irreducible error, regularization and feature and class imbalance can contribute to this underestimation.
The paper concludes with a demonstration of how the careful management of synthetic counterfactuals can ameliorate the impact of this underestimation bias.
- Score: 2.055949720959582
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: It is fair to say that many of the prominent examples of bias in
Machine Learning (ML) arise from bias already present in the training data. In
fact, some would argue that supervised ML algorithms cannot be biased; they
simply reflect the data on which they are trained. In this paper we demonstrate
how ML algorithms can misrepresent the training data through underestimation.
We show how irreducible error, regularization, and feature and class imbalance
can contribute to this underestimation. The paper concludes with a
demonstration of how the careful management of synthetic counterfactuals can
ameliorate the impact of this underestimation bias.
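The mechanism the abstract describes can be illustrated with a minimal sketch, assuming nothing from the paper's own experiments: a heavily L2-regularized logistic regression, fit by plain gradient descent on synthetic class-imbalanced data with a weak (partly irreducible) signal, predicts the minority class less often than it actually occurs in its own training data. All names and parameter values here are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_imbalanced(n=2000, pos_frac=0.1):
    """One informative feature; positives are rare and only mildly separated,
    so some error is irreducible."""
    y = (rng.random(n) < pos_frac).astype(float)
    x = rng.normal(loc=y * 1.0, scale=1.0)       # weak signal
    return x.reshape(-1, 1), y

def fit_logreg(X, y, l2=0.0, lr=0.1, steps=2000):
    """Logistic regression by gradient descent with an L2 penalty."""
    Xb = np.hstack([X, np.ones((len(X), 1))])    # append bias column
    w = np.zeros(Xb.shape[1])
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-Xb @ w))
        grad = Xb.T @ (p - y) / len(y) + l2 * w
        w -= lr * grad
    return w

X, y = make_imbalanced()
w = fit_logreg(X, y, l2=5.0)                     # heavy regularization
Xb = np.hstack([X, np.ones((len(X), 1))])
pred_pos_rate = (1.0 / (1.0 + np.exp(-Xb @ w)) > 0.5).mean()
actual_pos_rate = y.mean()

# Underestimation: the model predicts the positive class less often than it
# appears in the training data (often not at all under heavy regularization).
print(actual_pos_rate, pred_pos_rate)
```

With the regularizer shrinking the weights toward zero and positives in the minority, the decision threshold is essentially never crossed, so the predicted positive rate falls well below the empirical one.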
Related papers
- Debiasing Machine Unlearning with Counterfactual Examples [31.931056076782202]
We analyze the causal factors behind the unlearning process and mitigate biases at both data and algorithmic levels.
We introduce an intervention-based approach, where knowledge to forget is erased with a debiased dataset.
Our method outperforms existing machine unlearning baselines on evaluation metrics.
arXiv Detail & Related papers (2024-04-24T09:33:10Z)
- Correcting Underrepresentation and Intersectional Bias for Classification [49.1574468325115]
We consider the problem of learning from data corrupted by underrepresentation bias.
We show that with a small amount of unbiased data, we can efficiently estimate the group-wise drop-out rates.
We show that our algorithm permits efficient learning for model classes of finite VC dimension.
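The drop-out estimation idea above can be sketched under a deliberately simplified censoring model of our own (not the paper's actual estimator): assume each example from group g survives into the biased dataset with unknown probability r_g, and compare group frequencies in the biased data against a small unbiased sample. Rates are then identifiable up to a common scale.

```python
import numpy as np

rng = np.random.default_rng(1)

true_retention = {"A": 0.9, "B": 0.4}   # group B is heavily under-represented

# Draw a large population, then censor it to form the biased training set.
population = rng.choice(["A", "B"], size=50_000, p=[0.5, 0.5])
keep = np.array([rng.random() < true_retention[g] for g in population])
biased = population[keep]

# A small unbiased sample reveals the true group proportions.
unbiased = rng.choice(["A", "B"], size=1_000, p=[0.5, 0.5])

def estimate_retention(biased, unbiased, groups=("A", "B")):
    """retention_g is proportional to (biased frequency of g) / (true
    proportion of g); normalize so the largest estimate is 1, since without
    the pre-censoring sample size the rates are only known up to scale."""
    est = {}
    for g in groups:
        true_prop = (unbiased == g).mean()
        est[g] = (biased == g).mean() / true_prop
    scale = max(est.values())
    return {g: v / scale for g, v in est.items()}

est = estimate_retention(biased, unbiased)
print(est)  # B's relative retention should come out near 0.4 / 0.9 ≈ 0.44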
arXiv Detail & Related papers (2023-06-19T18:25:44Z)
- D-BIAS: A Causality-Based Human-in-the-Loop System for Tackling Algorithmic Bias [57.87117733071416]
We propose D-BIAS, a visual interactive tool that embodies human-in-the-loop AI approach for auditing and mitigating social biases.
A user can detect the presence of bias against a group by identifying unfair causal relationships in the causal network.
For each interaction, say weakening/deleting a biased causal edge, the system uses a novel method to simulate a new (debiased) dataset.
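The edge-editing idea can be sketched on a toy linear structural causal model of our own: an outcome depends on a sensitive attribute both directly (the biased edge) and through a legitimate mediator. "Deleting" the direct edge means regenerating the outcome with that coefficient set to zero. This illustrates the concept only; it is not the D-BIAS simulation method.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 10_000

sensitive = rng.integers(0, 2, size=n)            # group membership
mediator = 0.5 * sensitive + rng.normal(size=n)   # legitimate pathway

def simulate_outcome(direct_edge_weight):
    """Outcome = direct (biased) effect + mediated effect + noise."""
    return direct_edge_weight * sensitive + 1.0 * mediator + rng.normal(size=n)

biased_outcome = simulate_outcome(direct_edge_weight=2.0)
debiased_outcome = simulate_outcome(direct_edge_weight=0.0)  # edge deleted

def group_gap(outcome):
    return outcome[sensitive == 1].mean() - outcome[sensitive == 0].mean()

print(group_gap(biased_outcome))    # ≈ 2.5 (direct 2.0 + mediated 0.5)
print(group_gap(debiased_outcome))  # ≈ 0.5 (only the mediated path remains)
```

Deleting the edge removes the direct disparity while leaving the mediated effect intact, which is the kind of targeted intervention the tool exposes interactively.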
arXiv Detail & Related papers (2022-08-10T03:41:48Z)
- Understanding Unfairness in Fraud Detection through Model and Data Bias Interactions [4.159343412286401]
We argue that algorithmic unfairness stems from interactions between models and biases in the data.
We study a set of hypotheses regarding the fairness-accuracy trade-offs that fairness-blind ML algorithms exhibit under different data bias settings.
arXiv Detail & Related papers (2022-07-13T15:18:30Z)
- Fair Group-Shared Representations with Normalizing Flows [68.29997072804537]
We develop a fair representation learning algorithm which is able to map individuals belonging to different groups in a single group.
We show experimentally that our methodology is competitive with other fair representation learning algorithms.
arXiv Detail & Related papers (2022-01-17T10:49:49Z)
- Can Active Learning Preemptively Mitigate Fairness Issues? [66.84854430781097]
Dataset bias is one of the prevailing causes of unfairness in machine learning.
We study whether models trained with uncertainty-based active learning are fairer in their decisions with respect to a protected class.
We also explore the interaction of algorithmic fairness methods such as gradient reversal (GRAD) and BALD.
arXiv Detail & Related papers (2021-04-14T14:20:22Z)
- On Statistical Bias In Active Learning: How and When To Fix It [42.768124675364376]
Active learning is a powerful tool when labelling data is expensive.
It introduces a bias because the training data no longer follows the population distribution.
We formalize this bias and investigate the situations in which it can be harmful and sometimes even helpful.
arXiv Detail & Related papers (2021-01-27T19:52:24Z)
- Fairness Constraints in Semi-supervised Learning [56.48626493765908]
We develop a framework for fair semi-supervised learning, which is formulated as an optimization problem.
We theoretically analyze the source of discrimination in semi-supervised learning via bias, variance and noise decomposition.
Our method is able to achieve fair semi-supervised learning, and reach a better trade-off between accuracy and fairness than fair supervised learning.
arXiv Detail & Related papers (2020-09-14T04:25:59Z)
- Underestimation Bias and Underfitting in Machine Learning [2.639737913330821]
Often what is termed algorithmic bias in machine learning is due to historic bias in the training data.
Sometimes the bias may be introduced (or at least exacerbated) by the algorithm itself.
In this paper we report on initial research to understand the factors that contribute to bias in classification algorithms.
arXiv Detail & Related papers (2020-05-18T20:01:56Z)
- Leveraging Semi-Supervised Learning for Fairness using Neural Networks [49.604038072384995]
There has been a growing concern about the fairness of decision-making systems based on machine learning.
In this paper, we propose a semi-supervised algorithm using neural networks benefiting from unlabeled data.
The proposed model, called SSFair, exploits the information in the unlabeled data to mitigate the bias in the training data.
arXiv Detail & Related papers (2019-12-31T09:11:26Z)
- Recovering from Biased Data: Can Fairness Constraints Improve Accuracy? [11.435833538081557]
Empirical Risk Minimization (ERM) may produce a classifier that not only is biased but also has suboptimal accuracy on the true data distribution.
We examine the ability of fairness-constrained ERM to correct this problem.
We also consider other recovery methods including reweighting the training data, Equalized Odds, and Demographic Parity.
arXiv Detail & Related papers (2019-12-02T22:00:14Z)
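Two of the recovery criteria named in the last entry, Demographic Parity and Equalized Odds, have standard gap formulations that can be computed directly from held-out predictions; the function names below are our own. Demographic parity compares positive-prediction rates across groups, while equalized odds compares true- and false-positive rates.

```python
import numpy as np

def demographic_parity_gap(y_pred, group):
    """Largest difference in positive-prediction rate between any two groups."""
    rates = [y_pred[group == g].mean() for g in np.unique(group)]
    return max(rates) - min(rates)

def equalized_odds_gap(y_true, y_pred, group):
    """Worst of the FPR gap (over true label 0) and TPR gap (over true label 1)."""
    gaps = []
    for label in (0, 1):
        mask = y_true == label
        rates = [y_pred[mask & (group == g)].mean() for g in np.unique(group)]
        gaps.append(max(rates) - min(rates))
    return max(gaps)

# Toy predictions: group 1 receives positive predictions more often.
y_true = np.array([1, 1, 0, 0, 1, 1, 0, 0])
group  = np.array([0, 0, 0, 0, 1, 1, 1, 1])
y_pred = np.array([1, 0, 0, 0, 1, 1, 1, 0])

print(demographic_parity_gap(y_pred, group))      # → 0.5 (rates 0.75 vs 0.25)
print(equalized_odds_gap(y_true, y_pred, group))  # → 0.5 (TPR 1.0 vs 0.5)
```

A constrained or reweighted learner drives one of these gaps toward zero during training; whether that also recovers accuracy on the true distribution is exactly the question that paper examines.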
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.