Unbiasing on the Fly: Explanation-Guided Human Oversight of Machine Learning System Decisions
- URL: http://arxiv.org/abs/2406.17906v1
- Date: Tue, 25 Jun 2024 19:40:55 GMT
- Title: Unbiasing on the Fly: Explanation-Guided Human Oversight of Machine Learning System Decisions
- Authors: Hussaini Mamman, Shuib Basri, Abdullateef Balogun, Abubakar Abdullahi Imam, Ganesh Kumar, Luiz Fernando Capretz
- Abstract summary: We propose a novel framework for on-the-fly tracking and correction of discrimination in deployed ML systems.
The framework continuously monitors the predictions made by an ML system and flags discriminatory outcomes.
This human-in-the-loop approach empowers reviewers to accept or override the ML system decision.
- Score: 4.24106429730184
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The widespread adoption of ML systems across critical domains like hiring, finance, and healthcare raises growing concerns about their potential for discriminatory decision-making based on protected attributes. While efforts to ensure fairness during development are crucial, they leave deployed ML systems vulnerable to exhibiting discrimination during operation. To address this gap, we propose a novel framework for on-the-fly tracking and correction of discrimination in deployed ML systems. Leveraging counterfactual explanations, the framework continuously monitors the predictions made by an ML system and flags discriminatory outcomes. When flagged, post-hoc explanations related to the original prediction and the counterfactual alternatives are presented to a human reviewer for real-time intervention. This human-in-the-loop approach empowers reviewers to accept or override the ML system decision, enabling fair and responsible ML operation under dynamic settings. While further work is needed for validation and refinement, this framework offers a promising avenue for mitigating discrimination and building trust in ML systems deployed in a wide range of domains.
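The monitoring loop described in the abstract can be pictured roughly as follows. This is a minimal illustrative sketch, not the authors' implementation: it assumes a classifier exposed as a callable `model`, a single protected attribute, and uses a simple attribute-flip (situation-testing) check as a stand-in for the paper's counterfactual-explanation step; `monitor_prediction`, `human_review`, and all other names are hypothetical.

```python
# Sketch under assumptions: flag a prediction as potentially discriminatory if
# flipping only the protected attribute changes the model's decision, then
# defer the final call to a human reviewer who may accept or override it.
from dataclasses import dataclass
from typing import Any, Callable, Mapping


@dataclass
class Decision:
    accepted_label: int      # label that is ultimately acted upon
    flagged: bool            # True if the counterfactual check fired
    reviewer_override: bool  # True if the reviewer changed the system's output


def counterfactual_flag(model: Callable[[Mapping[str, Any]], int],
                        instance: Mapping[str, Any],
                        protected_attr: str,
                        alt_value: Any) -> bool:
    """Does the decision change when only the protected attribute is altered?"""
    counterfactual = dict(instance)
    counterfactual[protected_attr] = alt_value
    return model(counterfactual) != model(instance)


def monitor_prediction(model: Callable[[Mapping[str, Any]], int],
                       instance: Mapping[str, Any],
                       protected_attr: str,
                       alt_value: Any,
                       human_review: Callable[[Mapping[str, Any], int, int], int]) -> Decision:
    pred = model(instance)
    if not counterfactual_flag(model, instance, protected_attr, alt_value):
        return Decision(accepted_label=pred, flagged=False, reviewer_override=False)
    # Flagged: present the original prediction and the counterfactual alternative
    # to the reviewer, who returns the label to act on.
    cf_pred = model({**instance, protected_attr: alt_value})
    final = human_review(instance, pred, cf_pred)
    return Decision(accepted_label=final, flagged=True, reviewer_override=(final != pred))


# Toy usage: a deliberately biased rule-based "model" and a reviewer callback
# that simply adopts the counterfactual outcome when a case is flagged.
toy_model = lambda x: int(x["income"] > 50_000 or x["gender"] == "male")
decision = monitor_prediction(toy_model,
                              {"income": 40_000, "gender": "female"},
                              protected_attr="gender", alt_value="male",
                              human_review=lambda inst, pred, cf: cf)
```

In a deployed setting, the `human_review` callback would surface post-hoc explanations for both the original and counterfactual predictions (rather than raw labels), which is the role the framework assigns to explanation-guided oversight.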
Related papers
- Drawing the Line: Enhancing Trustworthiness of MLLMs Through the Power of Refusal [21.342265570934995]
Existing methods have largely overlooked the importance of refusal responses as a means of enhancing MLLM reliability.
We present the Information Boundary-aware Learning Framework (InBoL), a novel approach that empowers MLLMs to refuse to answer user queries when encountering insufficient information.
This framework introduces a comprehensive data generation pipeline and tailored training strategies to improve the model's ability to deliver appropriate refusal responses.
arXiv Detail & Related papers (2024-12-15T14:17:14Z)
- Self-Healing Machine Learning: A Framework for Autonomous Adaptation in Real-World Environments [50.310636905746975]
Real-world machine learning systems often encounter model performance degradation due to distributional shifts in the underlying data generating process.
Existing approaches to addressing shifts, such as concept drift adaptation, are limited by their reason-agnostic nature.
We propose self-healing machine learning (SHML) to overcome these limitations.
arXiv Detail & Related papers (2024-10-31T20:05:51Z)
- Preemptive Detection and Correction of Misaligned Actions in LLM Agents [70.54226917774933]
InferAct is a novel approach to detect misaligned actions before execution.
It alerts users for timely correction, preventing adverse outcomes.
InferAct achieves up to 20% improvements on Marco-F1 against baselines in misaligned action detection.
arXiv Detail & Related papers (2024-07-16T15:24:44Z)
- Formalising Anti-Discrimination Law in Automated Decision Systems [1.560976479364936]
We introduce a novel decision-theoretic framework grounded in anti-discrimination law of the United Kingdom.
We propose the 'conditional estimation parity' metric, which accounts for estimation error and the underlying data-generating process.
Our approach bridges the divide between machine learning fairness metrics and anti-discrimination law, offering a legally grounded framework for developing non-discriminatory automated decision systems.
arXiv Detail & Related papers (2024-06-29T10:59:21Z)
- Evaluating Interventional Reasoning Capabilities of Large Language Models [58.52919374786108]
Large language models (LLMs) are used to automate decision-making tasks.
In this paper, we evaluate whether LLMs can accurately update their knowledge of a data-generating process in response to an intervention.
We create benchmarks that span diverse causal graphs (e.g., confounding, mediation) and variable types.
These benchmarks allow us to isolate the ability of LLMs to accurately predict changes resulting from interventions, separately from their ability to memorize facts or find other shortcuts.
arXiv Detail & Related papers (2024-04-08T14:15:56Z)
- Uncertainty-aware predictive modeling for fair data-driven decisions [5.371337604556311]
We show how fairML systems can be safeML systems.
For fair decisions, we argue that a safe fail option should be used for individuals with uncertain categorization.
arXiv Detail & Related papers (2022-11-04T20:04:39Z)
- Causal Fairness Analysis [68.12191782657437]
We introduce a framework for understanding, modeling, and possibly solving issues of fairness in decision-making settings.
The main insight of our approach will be to link the quantification of the disparities present on the observed data with the underlying, and often unobserved, collection of causal mechanisms.
Our effort culminates in the Fairness Map, which is the first systematic attempt to organize and explain the relationship between different criteria found in the literature.
arXiv Detail & Related papers (2022-07-23T01:06:34Z)
- Fairness-aware Adversarial Perturbation Towards Bias Mitigation for Deployed Deep Models [32.39167033858135]
Prioritizing fairness is of central importance in artificial intelligence (AI) systems.
We propose a more flexible approach, i.e., fairness-aware adversarial perturbation (FAAP).
FAAP learns to perturb input data to blind deployed models on fairness-related features.
arXiv Detail & Related papers (2022-03-03T09:26:00Z)
- Characterizing and Detecting Mismatch in Machine-Learning-Enabled Systems [1.4695979686066065]
Development and deployment of machine learning systems remain a challenge.
In this paper, we report our findings and their implications for improving end-to-end ML-enabled system development.
arXiv Detail & Related papers (2021-03-25T19:40:29Z)
- Leveraging Expert Consistency to Improve Algorithmic Decision Support [62.61153549123407]
We explore the use of historical expert decisions as a rich source of information that can be combined with observed outcomes to narrow the construct gap.
We propose an influence function-based methodology to estimate expert consistency indirectly when each case in the data is assessed by a single expert.
Our empirical evaluation, using simulations in a clinical setting and real-world data from the child welfare domain, indicates that the proposed approach successfully narrows the construct gap.
arXiv Detail & Related papers (2021-01-24T05:40:29Z)
- When Does Uncertainty Matter?: Understanding the Impact of Predictive Uncertainty in ML Assisted Decision Making [68.19284302320146]
We carry out user studies to assess how people with differing levels of expertise respond to different types of predictive uncertainty.
We found that showing posterior predictive distributions led to smaller disagreements with the ML model's predictions.
This suggests that posterior predictive distributions can serve as useful decision aids, though they should be used with caution, taking into account the type of distribution and the expertise of the human.
arXiv Detail & Related papers (2020-11-12T02:23:53Z)