Fairness Evaluation in Presence of Biased Noisy Labels
- URL: http://arxiv.org/abs/2003.13808v1
- Date: Mon, 30 Mar 2020 20:47:00 GMT
- Title: Fairness Evaluation in Presence of Biased Noisy Labels
- Authors: Riccardo Fogliato, Max G'Sell, Alexandra Chouldechova
- Abstract summary: We propose a sensitivity analysis framework for assessing how assumptions on the noise across groups affect the predictive bias properties of the risk assessment model.
Our experimental results on two real world criminal justice data sets demonstrate how even small biases in the observed labels may call into question the conclusions of an analysis based on the noisy outcome.
- Score: 84.12514975093826
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Risk assessment tools are widely used around the country to inform decision making within the criminal justice system. Recently, considerable attention has been devoted to the question of whether such tools may suffer from racial bias. In this type of assessment, a fundamental issue is that the training and evaluation of the model is based on a variable (arrest) that may represent a noisy version of an unobserved outcome of more central interest (offense). We propose a sensitivity analysis framework for assessing how assumptions on the noise across groups affect the predictive bias properties of the risk assessment model as a predictor of reoffense. Our experimental results on two real world criminal justice data sets demonstrate how even small biases in the observed labels may call into question the conclusions of an analysis based on the noisy outcome.
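A minimal simulation can make the abstract's warning concrete. The sketch below is an illustration of the general phenomenon, not the authors' implementation: the group sizes, base rates, decision threshold, and noise model are all invented for the example. It builds a population in which a fixed risk tool has the same true false positive rate (FPR) in two groups, then relabels some true negatives as positives (arrests without an underlying offense) at a rate that is higher in one group and correlated with the risk feature. Even modest noise opens a gap between the disparity measured against the noisy label and the disparity measured against the true outcome.

```python
import numpy as np

rng = np.random.default_rng(0)

def fpr_gaps(noise_a, noise_b, n=400_000):
    """Observed vs. true FPR gap (group A minus group B) when true
    negatives are relabeled positive at group-dependent, feature-correlated
    rates (a toy model of arrests without an underlying offense)."""
    group = rng.integers(0, 2, size=n)        # 0 = group A, 1 = group B
    y_true = rng.binomial(1, 0.3, size=n)     # unobserved outcome (offense)
    x = rng.normal(size=n) + y_true           # risk feature, same law in both groups
    y_pred = (x > 0.5).astype(int)            # fixed risk tool

    # Label noise: true negatives get an observed positive label more often
    # when they "look risky" (high x), at a rate that depends on the group.
    base = np.where(group == 0, noise_a, noise_b)
    p_flip = base * (1.0 / (1.0 + np.exp(-x)))
    y_obs = np.where((y_true == 0) & (rng.random(n) < p_flip), 1, y_true)

    def fpr(y, g):
        mask = (y == 0) & (group == g)        # negatives under label y, group g
        return y_pred[mask].mean()

    return (fpr(y_obs, 0) - fpr(y_obs, 1),    # gap under the noisy label
            fpr(y_true, 0) - fpr(y_true, 1))  # gap under the true outcome

# Sensitivity sweep: vary group B's noise rate and compare the two gaps.
for nb in [0.0, 0.05, 0.10, 0.20]:
    obs, true = fpr_gaps(noise_a=0.0, noise_b=nb)
    print(f"noise_B={nb:.2f}  observed gap={obs:+.4f}  true gap={true:+.4f}")
```

In this toy setup the true FPR gap stays near zero by construction, yet as group B's noise rate grows the tool appears to treat group A worse under the observed labels, because risky-looking true negatives in group B are removed from its observed-negative pool. Sweeping assumed noise rates and reporting how the fairness conclusion changes is, in spirit, what a sensitivity analysis of this kind does.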
Related papers
- Achieving Fairness in Predictive Process Analytics via Adversarial Learning [50.31323204077591]
This paper addresses the challenge of integrating a debiasing phase into predictive business process analytics.
Our framework, which leverages adversarial debiasing, is evaluated on four case studies, showing a significant reduction in the contribution of biased variables to the predicted value.
arXiv Detail & Related papers (2024-10-03T15:56:03Z) - Thinking Racial Bias in Fair Forgery Detection: Models, Datasets and Evaluations [63.52709761339949]
We first contribute a dedicated dataset called the Fair Forgery Detection (FairFD) dataset, where we prove the racial bias of public state-of-the-art (SOTA) methods.
We design novel metrics including Approach Averaged Metric and Utility Regularized Metric, which can avoid deceptive results.
We also present an effective and robust post-processing technique, Bias Pruning with Fair Activations (BPFA), which improves fairness without requiring retraining or weight updates.
arXiv Detail & Related papers (2024-07-19T14:53:18Z) - OffsetBias: Leveraging Debiased Data for Tuning Evaluators [1.5790747258969664]
We qualitatively identify six types of biases inherent in various judge models.
Fine-tuning on our dataset significantly enhances the robustness of judge models against biases.
arXiv Detail & Related papers (2024-07-09T05:16:22Z) - Gender Biases in Automatic Evaluation Metrics for Image Captioning [87.15170977240643]
We conduct a systematic study of gender biases in model-based evaluation metrics for image captioning tasks.
We demonstrate the negative consequences of using these biased metrics, including the inability to differentiate between biased and unbiased generations.
We present a simple and effective way to mitigate the metric bias without hurting the correlations with human judgments.
arXiv Detail & Related papers (2023-05-24T04:27:40Z) - Robust Design and Evaluation of Predictive Algorithms under Unobserved Confounding [2.8498944632323755]
We propose a unified framework for the robust design and evaluation of predictive algorithms in selectively observed data.
We impose general assumptions on how much the outcome may vary on average between unselected and selected units.
We develop debiased machine learning estimators for the bounds on a large class of predictive performance estimands.
arXiv Detail & Related papers (2022-12-19T20:41:44Z) - Systematic Evaluation of Predictive Fairness [60.0947291284978]
Mitigating bias in training on biased datasets is an important open problem.
We examine the performance of various debiasing methods across multiple tasks.
We find that data conditions have a strong influence on relative model performance.
arXiv Detail & Related papers (2022-10-17T05:40:13Z) - Independent Ethical Assessment of Text Classification Models: A Hate
Speech Detection Case Study [0.5541644538483947]
An independent ethical assessment of an artificial intelligence system is an impartial examination of the system's development, deployment, and use in alignment with ethical values.
This study designs a holistic independent ethical assessment process for a text classification model, with a special focus on the task of hate speech detection.
arXiv Detail & Related papers (2021-07-19T23:03:36Z) - Through the Data Management Lens: Experimental Analysis and Evaluation
of Fair Classification [75.49600684537117]
Data management research is showing an increasing presence and interest in topics related to data and algorithmic fairness.
We contribute a broad analysis of 13 fair classification approaches and additional variants, over their correctness, fairness, efficiency, scalability, and stability.
Our analysis highlights novel insights on the impact of different metrics and high-level approach characteristics on different aspects of performance.
arXiv Detail & Related papers (2021-01-18T22:55:40Z) - Feedback Effects in Repeat-Use Criminal Risk Assessments [0.0]
We show that risk can propagate over sequential decisions in ways that are not captured by one-shot tests.
Risk assessment tools operate in a highly complex and path-dependent process, fraught with historical inequity.
arXiv Detail & Related papers (2020-11-28T06:40:05Z)