Dissecting Causal Biases
- URL: http://arxiv.org/abs/2310.13364v1
- Date: Fri, 20 Oct 2023 09:12:10 GMT
- Title: Dissecting Causal Biases
- Authors: Rūta Binkytė, Sami Zhioua, Yassine Turki
- Abstract summary: This paper focuses on a class of bias originating in the way training data is generated and/or collected.
Four sources of bias are considered, namely, confounding, selection, measurement, and interaction.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Accurately measuring discrimination in machine learning-based automated
decision systems is required to address the vital issue of fairness between
subpopulations and/or individuals. Any bias in measuring discrimination can
lead to either amplification or underestimation of the true value of
discrimination. This paper focuses on a class of bias originating in the way
training data is generated and/or collected. We call such class causal biases
and use tools from the field of causality to formally define and analyze such
biases. Four sources of bias are considered, namely, confounding, selection,
measurement, and interaction. The main contribution of this paper is to
provide, for each source of bias, a closed-form expression in terms of the
model parameters. This makes it possible to analyze the behavior of each source
of bias, in particular, in which cases they are absent and in which other cases
they are maximized. We hope that the provided characterizations help the
community better understand the sources of bias in machine learning
applications.
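As a concrete illustration of one of the four sources, the sketch below simulates confounding bias in a simple linear Gaussian structural causal model (Z → X, Z → Y, X → Y). The coefficients, the model, and the resulting closed-form expression c·a/(a² + 1) are assumptions for this sketch, not formulas taken from the paper; they only show how a bias term can be written in terms of model parameters and when it vanishes (a = 0 or c = 0, i.e. no backdoor path).

```python
import numpy as np

# Hypothetical linear SCM (assumed for this sketch, not the paper's model):
#   Z ~ N(0, 1),  X = a*Z + noise,  Y = b*X + c*Z + noise
rng = np.random.default_rng(0)
n = 200_000
a, b, c = 1.5, 2.0, 3.0  # structural coefficients (illustrative values)

Z = rng.normal(size=n)
X = a * Z + rng.normal(size=n)          # X is confounded by Z
Y = b * X + c * Z + rng.normal(size=n)  # true causal effect of X on Y is b

# Naive observational slope of Y on X: picks up the backdoor path X <- Z -> Y.
naive = np.cov(X, Y)[0, 1] / np.var(X)

# Adjusted slope: regress Y on both X and Z, which blocks the backdoor path.
coef, *_ = np.linalg.lstsq(np.column_stack([X, Z]), Y, rcond=None)
adjusted = coef[0]

# Closed-form confounding bias for this particular linear model:
# naive - b = c*a / (a**2 + 1); zero exactly when a = 0 or c = 0.
closed_form = c * a / (a ** 2 + 1)
print(f"naive={naive:.3f} adjusted={adjusted:.3f} "
      f"empirical bias={naive - b:.3f} closed-form bias={closed_form:.3f}")
```

With the values above, the naive estimate overshoots the true effect by roughly c·a/(a² + 1) ≈ 1.38, while the adjusted estimate recovers b; this mirrors the paper's theme that each bias source admits a parameter-level expression predicting when it is absent or large.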
Related papers
- Is There a One-Model-Fits-All Approach to Information Extraction? Revisiting Task Definition Biases [62.806300074459116]
Definition bias is a negative phenomenon that can mislead models.
We identify two types of definition bias in IE: bias among information extraction datasets and bias between information extraction datasets and instruction tuning datasets.
We propose a multi-stage framework consisting of definition bias measurement, bias-aware fine-tuning, and task-specific bias mitigation.
arXiv Detail & Related papers (2024-03-25T03:19:20Z) - A survey on bias in machine learning research [0.0]
Current research on bias in machine learning often focuses on fairness, while overlooking the roots or causes of bias.
This article aims to bridge the gap in past literature on bias in research by providing a taxonomy of potential sources of bias and errors in data and models.
arXiv Detail & Related papers (2023-08-22T07:56:57Z) - Shedding light on underrepresentation and Sampling Bias in machine learning [0.0]
We show how discrimination can be decomposed into variance, bias, and noise.
We challenge the commonly accepted mitigation approach that discrimination can be addressed by collecting more samples of the underrepresented group.
arXiv Detail & Related papers (2023-06-08T09:34:20Z) - Data Bias Management [17.067962372238135]
We show how bias in data affects end users, where bias is originated, and provide a viewpoint about what we should do about it.
We argue that data bias is not something that should necessarily be removed in all cases, and that research attention should instead shift from bias removal to bias management.
arXiv Detail & Related papers (2023-05-15T10:07:27Z) - Fair Enough: Standardizing Evaluation and Model Selection for Fairness Research in NLP [64.45845091719002]
Modern NLP systems exhibit a range of biases, which a growing literature on model debiasing attempts to correct.
This paper seeks to clarify the current situation and plot a course for meaningful progress in fair learning.
arXiv Detail & Related papers (2023-02-11T14:54:00Z) - Prisoners of Their Own Devices: How Models Induce Data Bias in Performative Prediction [4.874780144224057]
A biased model can make decisions that disproportionately harm certain groups in society.
Much work has been devoted to measuring unfairness in static ML environments, but not in dynamic, performative prediction ones.
We propose a taxonomy to characterize bias in the data, and study cases where it is shaped by model behaviour.
arXiv Detail & Related papers (2022-06-27T10:56:04Z) - The SAME score: Improved cosine based bias score for word embeddings [49.75878234192369]
We introduce SAME, a novel bias score for semantic bias in embeddings.
We show that SAME is capable of measuring semantic bias and identify potential causes for social bias in downstream tasks.
arXiv Detail & Related papers (2022-03-28T09:28:13Z) - Pseudo Bias-Balanced Learning for Debiased Chest X-ray Classification [57.53567756716656]
We study the problem of developing debiased chest X-ray diagnosis models without knowing exactly the bias labels.
We propose a novel algorithm, pseudo bias-balanced learning, which first captures and predicts per-sample bias labels.
Our proposed method achieved consistent improvements over other state-of-the-art approaches.
arXiv Detail & Related papers (2022-03-18T11:02:18Z) - Gradient Based Activations for Accurate Bias-Free Learning [22.264226961225003]
We show that a biased discriminator can actually be used to improve this bias-accuracy tradeoff.
Specifically, this is achieved by using a feature masking approach using the discriminator's gradients.
We show that this simple approach works well to reduce bias as well as improve accuracy significantly.
arXiv Detail & Related papers (2022-02-17T00:30:40Z) - Statistical discrimination in learning agents [64.78141757063142]
Statistical discrimination emerges in agent policies as a function of both the bias in the training population and of agent architecture.
We show that less discrimination emerges with agents that use recurrent neural networks, and when their training environment has less bias.
arXiv Detail & Related papers (2021-10-21T18:28:57Z) - Balancing out Bias: Achieving Fairness Through Training Reweighting [58.201275105195485]
Bias in natural language processing arises from models learning characteristics of the author such as gender and race.
Existing methods for mitigating and measuring bias do not directly account for correlations between author demographics and linguistic variables.
This paper introduces a very simple but highly effective method for countering bias using instance reweighting.
arXiv Detail & Related papers (2021-09-16T23:40:28Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this list (including all information) and is not responsible for any consequences of its use.