Enforcing Interpretability and its Statistical Impacts: Trade-offs
between Accuracy and Interpretability
- URL: http://arxiv.org/abs/2010.13764v2
- Date: Wed, 28 Oct 2020 17:34:32 GMT
- Title: Enforcing Interpretability and its Statistical Impacts: Trade-offs
between Accuracy and Interpretability
- Authors: Gintare Karolina Dziugaite, Shai Ben-David, Daniel M. Roy
- Abstract summary: There has been no formal study of the statistical cost of interpretability in machine learning.
We model the act of enforcing interpretability as that of performing empirical risk minimization over the set of interpretable hypotheses.
We perform a case analysis, explaining why one may or may not observe a trade-off between accuracy and interpretability when the restriction to interpretable classifiers does or does not come at the cost of some excess statistical risk.
- Score: 30.501012698482423
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: To date, there has been no formal study of the statistical cost of
interpretability in machine learning. As such, the discourse around potential
trade-offs is often informal and misconceptions abound. In this work, we aim to
initiate a formal study of these trade-offs. A seemingly insurmountable
roadblock is the lack of any agreed upon definition of interpretability.
Instead, we propose a shift in perspective. Rather than attempt to define
interpretability, we propose to model the \emph{act} of \emph{enforcing}
interpretability. As a starting point, we focus on the setting of empirical
risk minimization for binary classification, and view interpretability as a
constraint placed on learning. That is, we assume we are given a subset of
hypotheses that are deemed to be interpretable, possibly depending on the data
distribution and other aspects of the context. We then model the act of
enforcing interpretability as that of performing empirical risk minimization
over the set of interpretable hypotheses. This model allows us to reason about
the statistical implications of enforcing interpretability, using known results
in statistical learning theory. Focusing on accuracy, we perform a case
analysis, explaining why one may or may not observe a trade-off between
accuracy and interpretability when the restriction to interpretable classifiers
does or does not come at the cost of some excess statistical risk. We close
with some worked examples and some open problems, which we hope will spur
further theoretical development around the trade-offs involved in
interpretability.
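As a rough sketch (not taken from the paper itself), the standard excess-risk decomposition from statistical learning theory makes the case analysis concrete. Writing $R(h)$ for the risk of a classifier, $\mathcal{H}$ for the full hypothesis class, $\mathcal{H}_{\mathrm{int}} \subseteq \mathcal{H}$ for the subset deemed interpretable, and $\hat{h}_{\mathrm{int}}$ for the empirical risk minimizer over $\mathcal{H}_{\mathrm{int}}$,
\[
R(\hat{h}_{\mathrm{int}}) - \inf_{h \in \mathcal{H}} R(h)
= \underbrace{\left( \inf_{h \in \mathcal{H}_{\mathrm{int}}} R(h) - \inf_{h \in \mathcal{H}} R(h) \right)}_{\text{approximation cost of the restriction}}
+ \underbrace{\left( R(\hat{h}_{\mathrm{int}}) - \inf_{h \in \mathcal{H}_{\mathrm{int}}} R(h) \right)}_{\text{estimation error over } \mathcal{H}_{\mathrm{int}}} .
\]
If the interpretable subset still contains a risk-minimizing classifier, the first term vanishes and no accuracy trade-off need appear (the smaller class may even shrink the estimation term); if the first term is strictly positive, enforcing interpretability incurs excess risk that additional data cannot remove.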
Related papers
- Hard to Explain: On the Computational Hardness of In-Distribution Model Interpretation [0.9558392439655016]
The ability to interpret Machine Learning (ML) models is becoming increasingly essential.
Recent work has demonstrated that it is possible to formally assess interpretability by studying the computational complexity of explaining the decisions of various models.
arXiv Detail & Related papers (2024-08-07T17:20:52Z)
- Advancing Counterfactual Inference through Nonlinear Quantile Regression [77.28323341329461]
We propose a framework for efficient and effective counterfactual inference implemented with neural networks.
The proposed approach enhances the capacity to generalize estimated counterfactual outcomes to unseen data.
Empirical results conducted on multiple datasets offer compelling support for our theoretical assertions.
arXiv Detail & Related papers (2023-06-09T08:30:51Z)
- Doubly Robust Counterfactual Classification [1.8907108368038217]
We study counterfactual classification as a new tool for decision-making under hypothetical (contrary to fact) scenarios.
We propose a doubly-robust nonparametric estimator for a general counterfactual classifier.
arXiv Detail & Related papers (2023-01-15T22:04:46Z)
- Uncertain Evidence in Probabilistic Models and Stochastic Simulators [80.40110074847527]
We consider the problem of performing Bayesian inference in probabilistic models where observations are accompanied by uncertainty, referred to as 'uncertain evidence'.
We explore how to interpret uncertain evidence, and by extension the importance of proper interpretation as it pertains to inference about latent variables.
We devise concrete guidelines on how to account for uncertain evidence and we provide new insights, particularly regarding consistency.
arXiv Detail & Related papers (2022-10-21T20:32:59Z)
- Logical Satisfiability of Counterfactuals for Faithful Explanations in NLI [60.142926537264714]
We introduce the methodology of Faithfulness-through-Counterfactuals.
It generates a counterfactual hypothesis based on the logical predicates expressed in the explanation.
It then evaluates if the model's prediction on the counterfactual is consistent with that expressed logic.
arXiv Detail & Related papers (2022-05-25T03:40:59Z)
- Counterfactual Invariance to Spurious Correlations: Why and How to Pass Stress Tests [87.60900567941428]
A 'spurious correlation' is the dependence of a model on some aspect of the input data that an analyst thinks shouldn't matter.
In machine learning, these have a know-it-when-you-see-it character.
We study stress testing using the tools of causal inference.
arXiv Detail & Related papers (2021-05-31T14:39:38Z)
- Measuring Model Fairness under Noisy Covariates: A Theoretical Perspective [26.704446184314506]
We study the problem of measuring the fairness of a machine learning model under noisy information.
We present a theoretical analysis that aims to characterize weaker conditions under which accurate fairness evaluation is possible.
arXiv Detail & Related papers (2021-05-20T18:36:28Z)
- Are Interpretations Fairly Evaluated? A Definition Driven Pipeline for Post-Hoc Interpretability [54.85658598523915]
We propose establishing a concrete definition of interpretation before evaluating the faithfulness of an interpretation.
We find that although interpretation methods perform differently under a certain evaluation metric, such a difference may not result from interpretation quality or faithfulness.
arXiv Detail & Related papers (2020-09-16T06:38:03Z)
- Getting a CLUE: A Method for Explaining Uncertainty Estimates [30.367995696223726]
We propose a novel method for interpreting uncertainty estimates from differentiable probabilistic models.
Our method, Counterfactual Latent Uncertainty Explanations (CLUE), indicates how to change an input, while keeping it on the data manifold.
arXiv Detail & Related papers (2020-06-11T21:53:15Z)
- The Curse of Performance Instability in Analysis Datasets: Consequences, Source, and Suggestions [93.62888099134028]
We find that the performance of state-of-the-art models on Natural Language Inference (NLI) and Reading Comprehension (RC) analysis/stress sets can be highly unstable.
This raises three questions: (1) How will the instability affect the reliability of the conclusions drawn based on these analysis sets?
We give both theoretical explanations and empirical evidence regarding the source of the instability.
arXiv Detail & Related papers (2020-04-28T15:41:12Z)