Analyzing Explainer Robustness via Probabilistic Lipschitzness of Prediction Functions
- URL: http://arxiv.org/abs/2206.12481v3
- Date: Tue, 16 Apr 2024 16:27:15 GMT
- Title: Analyzing Explainer Robustness via Probabilistic Lipschitzness of Prediction Functions
- Authors: Zulqarnain Khan, Davin Hill, Aria Masoomi, Joshua Bone, Jennifer Dy
- Abstract summary: We focus on one particular aspect of robustness, namely that an explainer should give similar explanations for similar data inputs.
We formalize this notion by introducing and defining explainer astuteness, analogous to astuteness of prediction functions.
- Score: 3.5199856477763722
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Machine learning methods have significantly improved in their predictive capabilities, but at the same time they are becoming more complex and less transparent. As a result, explainers are often relied on to provide interpretability to these black-box prediction models. Because these explainers serve as crucial diagnostic tools, it is important that they themselves are robust. In this paper we focus on one particular aspect of robustness, namely that an explainer should give similar explanations for similar data inputs. We formalize this notion by introducing and defining explainer astuteness, analogous to astuteness of prediction functions. Our formalism allows us to connect explainer robustness to the predictor's probabilistic Lipschitzness, which captures the probability of local smoothness of a function. We provide lower bound guarantees on the astuteness of a variety of explainers (e.g., SHAP, RISE, CXPlain) given the Lipschitzness of the prediction function. These theoretical results imply that locally smooth prediction functions lend themselves to locally robust explanations. We evaluate these results empirically on simulated as well as real datasets.
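For readers who want the abstract's two central notions in symbols, here is a minimal sketch of how probabilistic Lipschitzness of a predictor f and astuteness of an explainer E are typically formalized; the exact notation, norms, and conditioning below are our paraphrase, not a quotation from the paper.

```latex
% Sketch (our paraphrase). A predictor f is probabilistically Lipschitz with
% constant L, radius r, and probability at least 1 - \alpha if
\Pr_{x, x' \sim \mathcal{D}} \Big[ \|f(x) - f(x')\| \le L \, \|x - x'\| \;\Big|\; \|x - x'\| \le r \Big] \ge 1 - \alpha .
% Explainer astuteness is the analogous property of the explanation map E,
% with its own constant \lambda:
\Pr_{x, x' \sim \mathcal{D}} \Big[ \|E(x) - E(x')\| \le \lambda \, \|x - x'\| \;\Big|\; \|x - x'\| \le r \Big] \ge 1 - \alpha .
```

The empirical evaluation mentioned at the end of the abstract amounts to estimating such probabilities from sampled pairs. A small, self-contained estimator (function name and defaults are our own, not the paper's reference code) might look like this:

```python
# Monte Carlo estimate of probabilistic Lipschitzness; swapping an explainer in
# for `func` gives an analogous empirical astuteness estimate. Illustrative
# sketch only.
import numpy as np

def empirical_prob_lipschitzness(func, X, L, r, n_pairs=20_000, seed=0):
    """Estimate P(||func(x) - func(x')|| <= L * ||x - x'||, given ||x - x'|| <= r)."""
    rng = np.random.default_rng(seed)
    i = rng.integers(0, len(X), size=n_pairs)
    j = rng.integers(0, len(X), size=n_pairs)
    dx = np.linalg.norm(X[i] - X[j], axis=1)
    close = (dx > 0) & (dx <= r)          # keep only pairs within radius r
    if not close.any():
        return float("nan")
    fi = np.array([np.atleast_1d(func(x)) for x in X[i][close]])
    fj = np.array([np.atleast_1d(func(x)) for x in X[j][close]])
    df = np.linalg.norm(fi - fj, axis=1)
    return float(np.mean(df <= L * dx[close]))

# Toy check on a smooth predictor: the estimate should be close to 1.
X = np.random.default_rng(1).normal(size=(500, 5))
predict = lambda x: np.tanh(x @ np.ones(5))
print(empirical_prob_lipschitzness(predict, X, L=5.0, r=2.0))
```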
Related papers
- Identifiable Latent Neural Causal Models [82.14087963690561]
Causal representation learning seeks to uncover latent, high-level causal representations from low-level observed data.
We determine the types of distribution shifts that do contribute to the identifiability of causal representations.
We translate our findings into a practical algorithm, allowing for the acquisition of reliable latent causal representations.
arXiv Detail & Related papers (2024-03-23T04:13:55Z)
- LaPLACE: Probabilistic Local Model-Agnostic Causal Explanations [1.0370398945228227]
We introduce LaPLACE-explainer, designed to provide probabilistic cause-and-effect explanations for machine learning models.
The LaPLACE-Explainer component leverages the concept of a Markov blanket to establish statistical boundaries between relevant and non-relevant features.
Our approach offers causal explanations and outperforms LIME and SHAP in terms of local accuracy and consistency of explained features.
arXiv Detail & Related papers (2023-10-01T04:09:59Z)
- Performative Prediction with Neural Networks [24.880495520422]
Performative prediction is a framework for learning models that influence the data they intend to predict.
Standard convergence results for finding a performatively stable classifier with the method of repeated risk minimization assume that the data distribution is Lipschitz continuous with respect to the model's parameters.
In this work, we instead assume that the data distribution is Lipschitz continuous with respect to the model's predictions, a more natural assumption for performative systems.
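As a concrete illustration of the repeated risk minimization procedure mentioned above, here is a toy sketch (our own example, not the paper's code): the deployed parameter shifts the data distribution, and each round refits on data drawn under the previously deployed parameter.

```python
# Toy repeated risk minimization (RRM) in a performative setting; illustrative
# sketch only. The deployed parameter `theta` shifts the data it will later be
# fit on, and each round re-minimizes the squared-loss risk on fresh data.
import numpy as np

rng = np.random.default_rng(0)

def sample_data(theta, n=5_000, shift=0.4):
    """Outcomes whose mean responds to the currently deployed parameter."""
    return rng.normal(loc=1.0 + shift * theta, scale=1.0, size=n)

theta = 0.0
for t in range(15):
    z = sample_data(theta)   # deploy theta, observe the distribution it induces
    theta = z.mean()         # squared loss: the risk minimizer is the sample mean
    print(f"round {t:2d}: theta = {theta:.4f}")

# Because the map theta -> 1 + 0.4 * theta is a contraction (the distribution
# responds to the deployed quantity with sensitivity < 1), RRM converges to the
# performatively stable point 1 / (1 - 0.4) ≈ 1.667.
```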
arXiv Detail & Related papers (2023-04-14T01:12:48Z)
- TsSHAP: Robust model agnostic feature-based explainability for time series forecasting [6.004928390125367]
We propose a feature-based explainability algorithm, TsSHAP, that can explain the forecast of any black-box forecasting model.
We formalize the notion of local, semi-local, and global explanations in the context of time series forecasting.
arXiv Detail & Related papers (2023-03-22T05:14:36Z)
- Prediction-Powered Inference [68.97619568620709]
Prediction-powered inference is a framework for performing valid statistical inference when an experimental dataset is supplemented with predictions from a machine-learning system.
The framework yields simple algorithms for computing provably valid confidence intervals for quantities such as means, quantiles, and linear and logistic regression coefficients.
Prediction-powered inference could enable researchers to draw valid and more data-efficient conclusions using machine learning.
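For the simplest case (estimating a population mean), a prediction-powered point estimate and a normal-approximation confidence interval can be sketched as below; this is our simplified illustration of the idea and omits the refinements and other estimands covered by the framework and its accompanying library.

```python
# Simplified sketch of prediction-powered inference for a mean. A small labeled
# set measures the model's error (the "rectifier") and corrects predictions made
# on a large unlabeled set. Illustrative only; variable names are ours.
import numpy as np

def ppi_mean_ci(f_unlab, f_lab, y_lab, z=1.96):
    """f_unlab: predictions on unlabeled data; f_lab, y_lab: predictions and labels on labeled data."""
    n, N = len(y_lab), len(f_unlab)
    rectifier = y_lab - f_lab                        # measured prediction error
    theta = f_unlab.mean() + rectifier.mean()        # prediction-powered estimate
    se = np.sqrt(f_unlab.var(ddof=1) / N + rectifier.var(ddof=1) / n)
    return theta, (theta - z * se, theta + z * se)   # ~95% CI for z = 1.96

# Toy usage: a biased "model"; the labeled rectifier removes the bias.
rng = np.random.default_rng(0)
y_lab = rng.normal(2.0, 1.0, size=200)               # small labeled sample
f_lab = y_lab + 0.5 + rng.normal(0, 0.2, size=200)   # biased, noisy predictions
f_unlab = rng.normal(2.5, 1.0, size=20_000)          # predictions on unlabeled data (same bias)
print(ppi_mean_ci(f_unlab, f_lab, y_lab))            # point estimate near the true mean 2.0
```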
arXiv Detail & Related papers (2023-01-23T18:59:28Z)
- SurvSHAP(t): Time-dependent explanations of machine learning survival models [6.950862982117125]
We introduce SurvSHAP(t), the first time-dependent explanation that allows for interpreting survival black-box models.
Experiments on synthetic and medical data confirm that SurvSHAP(t) can detect variables with a time-dependent effect.
We provide an accessible implementation of time-dependent explanations in Python.
arXiv Detail & Related papers (2022-08-23T17:01:14Z)
- Don't Explain Noise: Robust Counterfactuals for Randomized Ensembles [50.81061839052459]
We formalize the generation of robust counterfactual explanations as a probabilistic problem.
We show the link between the robustness of ensemble models and the robustness of base learners.
Our method achieves high robustness with only a small increase in the distance from counterfactual explanations to their initial observations.
arXiv Detail & Related papers (2022-05-27T17:28:54Z)
- Explainability in Process Outcome Prediction: Guidelines to Obtain Interpretable and Faithful Models [77.34726150561087]
In the field of process outcome prediction, we define explainability through the interpretability of the explanations and the faithfulness of the explainability model.
This paper contributes a set of guidelines named X-MOP which allows selecting the appropriate model based on the event log specifications.
arXiv Detail & Related papers (2022-03-30T05:59:50Z)
- Probabilistic Sufficient Explanations [31.08715352013011]
We introduce probabilistic sufficient explanations, which formulate explaining an instance of classification as choosing the "simplest" subset of features such that observing only those features is sufficient to explain the classification.
We design a scalable algorithm for finding the desired explanations while keeping the guarantees intact.
Our experiments demonstrate the effectiveness of our algorithm in finding sufficient explanations, and showcase its advantages compared to Anchors and logical explanations.
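To make the notion above slightly more concrete, here is our paraphrase of what a probabilistic sufficient explanation amounts to; the threshold and notation are assumptions of ours, not the paper's exact formulation.

```latex
% Our paraphrase (threshold \delta and notation assumed): a feature subset S is
% probabilistically sufficient for the classification c = f(x) if fixing only
% those features preserves the classifier's decision with high probability,
\Pr_{X \sim \mathcal{D}} \big[ f(X) = c \;\big|\; X_S = x_S \big] \ge \delta ,
% and the reported explanation is a "simplest" such subset S.
```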
arXiv Detail & Related papers (2021-05-21T04:03:10Z)
- The Struggles of Feature-Based Explanations: Shapley Values vs. Minimal Sufficient Subsets [61.66584140190247]
We show that feature-based explanations pose problems even for explaining trivial models.
We show that two popular classes of explainers, Shapley explainers and minimal sufficient subsets explainers, target fundamentally different types of ground-truth explanations.
arXiv Detail & Related papers (2020-09-23T09:45:23Z)
- Evaluations and Methods for Explanation through Robustness Analysis [117.7235152610957]
We establish a novel set of evaluation criteria for feature-based explanations via robustness analysis.
We obtain new explanations that are loosely necessary and sufficient for a prediction.
We extend the explanation to extract the set of features that would move the current prediction to a target class.
arXiv Detail & Related papers (2020-05-31T05:52:05Z)