A Causal Lens for Peeking into Black Box Predictive Models: Predictive
Model Interpretation via Causal Attribution
- URL: http://arxiv.org/abs/2008.00357v1
- Date: Sat, 1 Aug 2020 23:20:57 GMT
- Title: A Causal Lens for Peeking into Black Box Predictive Models: Predictive
Model Interpretation via Causal Attribution
- Authors: Aria Khademi, Vasant Honavar
- Abstract summary: We aim to address the problem of explaining predictive models and their predictions in settings where the predictive model is a black box.
We reduce the problem of interpreting a black box predictive model to that of estimating the causal effects of each of the model inputs on the model output.
We show how the resulting causal attribution of responsibility for model output to the different model inputs can be used to interpret the predictive model and to explain its predictions.
- Score: 3.3758186776249928
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: With the increasing adoption of predictive models trained using machine
learning across a wide range of high-stakes applications, e.g., health care,
security, criminal justice, finance, and education, there is a growing need for
effective techniques for explaining such models and their predictions. We aim
to address this problem in settings where the predictive model is a black box;
that is, we can only observe the response of the model to various inputs, but
have no knowledge of the internal structure of the predictive model, its
parameters, the objective function, or the algorithm used to optimize the
model. We reduce the problem of interpreting a black box predictive model to
that of estimating the causal effects of each of the model inputs on the model
output, from observations of the model inputs and the corresponding outputs. We
estimate the causal effects of model inputs on model output using variants of
the Rubin Neyman potential outcomes framework for estimating causal effects
from observational data. We show how the resulting causal attribution of
responsibility for model output to the different model inputs can be used to
interpret the predictive model and to explain its predictions. We present
results of experiments that demonstrate the effectiveness of our approach to
the interpretation of black box predictive models via causal attribution in the
case of deep neural network models trained on one synthetic data set (where the
input variables that impact the output variable are known by design) and two
real-world data sets: handwritten digit classification and Parkinson's disease
severity prediction. Because our approach does not require knowledge about the
predictive model algorithm and is free of assumptions regarding the black box
predictive model except that its input-output responses be observable, it can
be applied, in principle, to any black box predictive model.
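To make the recipe concrete, the following is a minimal sketch (not the authors' implementation) of causal attribution for a black box model: a chosen input is binarized into a "treatment", and its average causal effect on the model's output is estimated with a simple outcome-regression adjustment over the remaining inputs, one common way to operationalize the Rubin-Neyman potential outcomes framework from observed input-output pairs alone. The function name, the median-based treatment definition, the linear adjustment models, and the synthetic black box below are all illustrative assumptions, not details taken from the paper.
```python
import numpy as np
from sklearn.linear_model import LinearRegression

def causal_attribution(black_box, X, feature_idx, threshold=None):
    """Estimate the average causal effect of one input on the black box output.

    The "treatment" is defined, for illustration, as the feature exceeding its
    median; the remaining inputs serve as covariates in a regression adjustment.
    """
    y = np.asarray(black_box(X))                 # observed model outputs (the oracle)
    x_f = X[:, feature_idx]
    threshold = np.median(x_f) if threshold is None else threshold
    treated = x_f > threshold                    # binarized treatment assignment
    covariates = np.delete(X, feature_idx, axis=1)

    # Fit separate outcome models for treated and control units, then predict
    # both potential outcomes for every unit (a simple "T-learner" adjustment).
    model_t = LinearRegression().fit(covariates[treated], y[treated])
    model_c = LinearRegression().fit(covariates[~treated], y[~treated])
    y1_hat = model_t.predict(covariates)         # estimated outcome under treatment
    y0_hat = model_c.predict(covariates)         # estimated outcome under control
    return float(np.mean(y1_hat - y0_hat))       # average causal effect estimate

# Usage: rank the inputs of an arbitrary black-box predictor by estimated effect.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 4))
black_box = lambda X: 3.0 * X[:, 0] + 0.5 * X[:, 2]   # stands in for any opaque model
effects = {j: causal_attribution(black_box, X, j) for j in range(X.shape[1])}
print(effects)  # feature 0 should receive the largest attribution
```
Because the model is queried only through its input-output behavior, the same sketch applies unchanged to any predictor that exposes a predict-style interface.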
Related papers
- Influence Functions for Scalable Data Attribution in Diffusion Models [52.92223039302037]
Diffusion models have led to significant advancements in generative modelling.
Yet their widespread adoption poses challenges regarding data attribution and interpretability.
In this paper, we aim to help address such challenges by developing an influence functions framework.
arXiv Detail & Related papers (2024-10-17T17:59:02Z) - Explanatory Model Monitoring to Understand the Effects of Feature Shifts on Performance [61.06245197347139]
We propose a novel approach to explain the behavior of a black-box model under feature shifts.
We refer to our method, which combines concepts from Optimal Transport and Shapley Values, as Explanatory Performance Estimation.
arXiv Detail & Related papers (2024-08-24T18:28:19Z) - Predictive Churn with the Set of Good Models [64.05949860750235]
We study the effect of conflicting predictions over the set of near-optimal machine learning models.
We present theoretical results on the expected churn between models within the Rashomon set.
We show how our approach can be used to better anticipate, reduce, and avoid churn in consumer-facing applications.
arXiv Detail & Related papers (2024-02-12T16:15:25Z) - A performance characteristic curve for model evaluation: the application
in information diffusion prediction [3.8711489380602804]
We propose a metric based on information entropy to quantify the randomness in diffusion data, then identify a scaling pattern between the randomness and the prediction accuracy of the model.
Data points obtained for different sequence lengths, system sizes, and levels of randomness all collapse onto a single curve, capturing a model's inherent capability of making correct predictions.
The validity of the curve is tested by three prediction models in the same family, reaching conclusions in line with existing studies.
arXiv Detail & Related papers (2023-09-18T07:32:57Z) - A prediction and behavioural analysis of machine learning methods for
modelling travel mode choice [0.26249027950824505]
We conduct a systematic comparison of different modelling approaches, across multiple modelling problems, in terms of the key factors likely to affect model choice.
Results indicate that the models with the highest disaggregate predictive performance provide poorer estimates of behavioural indicators and aggregate mode shares.
It is also observed that the MNL model performs robustly in a variety of situations, though ML techniques can improve the estimates of behavioural indices such as Willingness to Pay.
arXiv Detail & Related papers (2023-01-11T11:10:32Z) - Stability of clinical prediction models developed using statistical or
machine learning methods [0.5482532589225552]
Clinical prediction models estimate an individual's risk of a particular health outcome, conditional on their values of multiple predictors.
Many models are developed using small datasets, which leads to instability in the model and its predictions (estimated risks).
We show that instability in a model's estimated risks is often considerable and manifests itself as miscalibration of predictions in new data.
arXiv Detail & Related papers (2022-11-02T11:55:28Z) - Measuring Causal Effects of Data Statistics on Language Model's
`Factual' Predictions [59.284907093349425]
Large amounts of training data are one of the major reasons for the high performance of state-of-the-art NLP models.
We provide a language for describing how training data influences predictions, through a causal framework.
Our framework bypasses the need to retrain expensive models and allows us to estimate causal effects based on observational data alone.
arXiv Detail & Related papers (2022-07-28T17:36:24Z) - Pathologies of Pre-trained Language Models in Few-shot Fine-tuning [50.3686606679048]
We show that pre-trained language models given only a few examples exhibit strong prediction bias across labels.
Although few-shot fine-tuning can mitigate this prediction bias, our analysis shows that models gain performance improvements by capturing non-task-related features.
These observations warn that pursuing model performance with fewer examples may incur pathological prediction behavior.
arXiv Detail & Related papers (2022-04-17T15:55:18Z) - Black-box Adversarial Attacks on Network-wide Multi-step Traffic State
Prediction Models [4.353029347463806]
We propose an adversarial attack framework by treating the prediction model as a black-box.
The adversary can query the prediction model as an oracle with any input and obtain the corresponding output.
To test the attack's effectiveness, two state-of-the-art graph neural network-based models (GCGRNN and DCRNN) are examined.
arXiv Detail & Related papers (2021-10-17T03:45:35Z) - Hessian-based toolbox for reliable and interpretable machine learning in
physics [58.720142291102135]
We present a toolbox for interpretability and reliability, agnostic of the model architecture.
It provides a notion of the influence of the input data on the prediction at a given test point, an estimation of the uncertainty of the model predictions, and an agnostic score for the model predictions.
Our work opens the road to the systematic use of interpretability and reliability methods in ML applied to physics and, more generally, science.
arXiv Detail & Related papers (2021-08-04T16:32:59Z) - A comprehensive study on the prediction reliability of graph neural
networks for virtual screening [0.0]
We investigate the effects of model architectures, regularization methods, and loss functions on the prediction performance and reliability of classification results.
Our results highlight that the correct choice of regularization and inference methods is clearly important for achieving a high success rate.
arXiv Detail & Related papers (2020-03-17T10:13:31Z)
This list is automatically generated from the titles and abstracts of the papers in this site.