Model updating after interventions paradoxically introduces bias
- URL: http://arxiv.org/abs/2010.11530v2
- Date: Mon, 22 Feb 2021 08:55:14 GMT
- Title: Model updating after interventions paradoxically introduces bias
- Authors: James Liley and Samuel R Emerson and Bilal A Mateen and Catalina A
Vallejos and Louis J M Aslett and Sebastian J Vollmer
- Abstract summary: Recent discussions have highlighted potential problems in the updating of a predictive score for a binary outcome.
In this setting, the existing score induces an additional causative pathway which leads to miscalibration when the original score is replaced.
We propose a general causal framework to describe and address this problem, and demonstrate an equivalent formulation as a partially observed Markov decision process.
- Score: 2.089458396525051
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Machine learning is increasingly being used to generate prediction models for
use in a number of real-world settings, from credit risk assessment to clinical
decision support. Recent discussions have highlighted potential problems in the
updating of a predictive score for a binary outcome when an existing predictive
score forms part of the standard workflow, driving interventions. In this
setting, the existing score induces an additional causative pathway which leads
to miscalibration when the original score is replaced. We propose a general
causal framework to describe and address this problem, and demonstrate an
equivalent formulation as a partially observed Markov decision process. We use
this model to demonstrate the impact of such `naive updating' when performed
repeatedly. Namely, we show that successive predictive scores may converge to a
point where they predict their own effect, or may eventually tend toward a
stable oscillation between two values, and we argue that neither outcome is
desirable. Furthermore, we demonstrate that even if model-fitting procedures
improve, actual performance may worsen. We complement these findings with a
discussion of several potential routes to overcome these issues.
Related papers
- Predictive Churn with the Set of Good Models [64.05949860750235]
We study the effect of conflicting predictions over the set of near-optimal machine learning models.
We present theoretical results on the expected churn between models within the Rashomon set.
We show how our approach can be used to better anticipate, reduce, and avoid churn in consumer-facing applications.
arXiv Detail & Related papers (2024-02-12T16:15:25Z) - Measuring the Stability of Process Outcome Predictions in Online
Settings [4.599862571197789]
This paper proposes an evaluation framework for assessing the stability of models for online predictive process monitoring.
The framework introduces four performance meta-measures: the frequency of significant performance drops, the magnitude of such drops, the recovery rate, and the volatility of performance.
The results demonstrate that these meta-measures facilitate the comparison and selection of predictive models for different risk-taking scenarios.
arXiv Detail & Related papers (2023-10-13T10:37:46Z) - Advancing Counterfactual Inference through Nonlinear Quantile Regression [77.28323341329461]
We propose a framework for efficient and effective counterfactual inference implemented with neural networks.
The proposed approach enhances the capacity to generalize estimated counterfactual outcomes to unseen data.
Empirical results conducted on multiple datasets offer compelling support for our theoretical assertions.
arXiv Detail & Related papers (2023-06-09T08:30:51Z) - A Closer Look at the Intervention Procedure of Concept Bottleneck Models [18.222350428973343]
Concept bottleneck models (CBMs) are a class of interpretable neural network models that predict the target response of a given input based on its high-level concepts.
CBMs enable domain experts to intervene on the predicted concepts and rectify any mistakes at test time, so that more accurate task predictions can be made at the end.
We develop various ways of selecting intervening concepts to improve the intervention effectiveness and conduct an array of in-depth analyses as to how they evolve under different circumstances.
arXiv Detail & Related papers (2023-02-28T02:37:24Z) - Avoiding Inference Heuristics in Few-shot Prompt-based Finetuning [57.4036085386653]
We show that prompt-based models for sentence pair classification tasks still suffer from a common pitfall of adopting inferences based on lexical overlap.
We then show that adding a regularization that preserves pretraining weights is effective in mitigating this destructive tendency of few-shot finetuning.
arXiv Detail & Related papers (2021-09-09T10:10:29Z) - Deconfounding Scores: Feature Representations for Causal Effect
Estimation with Weak Overlap [140.98628848491146]
We introduce deconfounding scores, which induce better overlap without biasing the target of estimation.
We show that deconfounding scores satisfy a zero-covariance condition that is identifiable in observed data.
In particular, we show that this technique could be an attractive alternative to standard regularizations.
arXiv Detail & Related papers (2021-04-12T18:50:11Z) - Towards Robust and Reliable Algorithmic Recourse [11.887537452826624]
We propose a novel framework, RObust Algorithmic Recourse (ROAR), that leverages adversarial training for finding recourses that are robust to model shifts.
We also carry out detailed theoretical analysis which underscores the importance of constructing recourses that are robust to model shifts.
arXiv Detail & Related papers (2021-02-26T17:38:52Z) - Counterfactual Predictions under Runtime Confounding [74.90756694584839]
We study the counterfactual prediction task in the setting where all relevant factors are captured in the historical data.
We propose a doubly-robust procedure for learning counterfactual prediction models in this setting.
arXiv Detail & Related papers (2020-06-30T15:49:05Z) - Estimating the Effects of Continuous-valued Interventions using
Generative Adversarial Networks [103.14809802212535]
We build on the generative adversarial networks (GANs) framework to address the problem of estimating the effect of continuous-valued interventions.
Our model, SCIGAN, is flexible and capable of simultaneously estimating counterfactual outcomes for several different continuous interventions.
To address the challenges presented by shifting to continuous interventions, we propose a novel architecture for our discriminator.
arXiv Detail & Related papers (2020-02-27T18:46:21Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.