The Definitions of Interpretability and Learning of Interpretable Models
- URL: http://arxiv.org/abs/2105.14171v1
- Date: Sat, 29 May 2021 01:44:12 GMT
- Title: The Definitions of Interpretability and Learning of Interpretable Models
- Authors: Weishen Pan, Changshui Zhang
- Abstract summary: We propose a mathematical definition for the human-interpretable model.
If a prediction model is interpretable by a human recognition system, the prediction model is defined as a completely human-interpretable model.
- Score: 42.22982369082474
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: As machine learning algorithms are adopted in an ever-increasing number
of applications, interpretation has emerged as a crucial desideratum. In this
paper, we propose a mathematical definition for the human-interpretable model.
In particular, we define interpretability between two information processing
systems. If a prediction model is interpretable by a human recognition system
under this interpretability definition, the prediction model is defined as a
completely human-interpretable model. We further design a practical framework
to train a completely human-interpretable model through user interactions.
Experiments on image datasets show the advantages of our proposed model in two
aspects: 1) The completely human-interpretable model can provide an entire
decision-making process that is human-understandable; 2) The completely
human-interpretable model is more robust against adversarial attacks.
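The abstract states that interpretability is defined between two information processing systems, but it does not reproduce the formal statement. Purely as an illustrative sketch in our own notation (an assumption about the flavor of such a definition, not the paper's actual formulation), one could require every intermediate representation of the prediction model to be recognizable as a concept by the human system:

```latex
% Illustrative notation only -- not the paper's actual definition.
% Let f = f_L \circ \cdots \circ f_1 be a prediction model with intermediate
% representations z_l(x), and let h be a human recognition system with a
% concept vocabulary C. One way to cast ``f is interpretable by h'':
\forall\, l \in \{1, \dots, L\},\;
\exists\, c_l \in C :\;
\Pr_{x \sim \mathcal{X}}\!\bigl[\, h\bigl(z_l(x)\bigr) = c_l \,\bigr] \ge 1 - \epsilon
```

Under a reading of this kind, a completely human-interpretable model is one for which the condition holds at every stage, consistent with the abstract's claim that the model exposes its entire decision-making process in human-understandable terms.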
Related papers
- Hard to Explain: On the Computational Hardness of In-Distribution Model Interpretation [0.9558392439655016]
The ability to interpret Machine Learning (ML) models is becoming increasingly essential.
Recent work has demonstrated that it is possible to formally assess interpretability by studying the computational complexity of explaining the decisions of various models.
arXiv Detail & Related papers (2024-08-07T17:20:52Z)
- Selecting Interpretability Techniques for Healthcare Machine Learning models [69.65384453064829]
In healthcare, there is an ongoing effort to employ interpretable algorithms to assist healthcare professionals in several decision scenarios.
We overview a selection of eight algorithms, both post-hoc and model-based, that can be used for such purposes.
arXiv Detail & Related papers (2024-06-14T17:49:04Z)
- On the Lack of Robust Interpretability of Neural Text Classifiers [14.685352584216757]
We assess the robustness of interpretations of neural text classifiers based on pretrained Transformer encoders.
Both tests show surprising deviations from expected behavior, raising questions about the extent of insights that practitioners may draw from interpretations.
arXiv Detail & Related papers (2021-06-08T18:31:02Z)
- Model Learning with Personalized Interpretability Estimation (ML-PIE) [2.862606936691229]
High-stakes applications require AI-generated models to be interpretable.
Current algorithms for the synthesis of potentially interpretable models rely on objectives or regularization terms.
We propose an approach for the synthesis of models that are tailored to the user.
arXiv Detail & Related papers (2021-04-13T09:47:48Z)
- Interpretable Deep Learning: Interpretations, Interpretability, Trustworthiness, and Beyond [49.93153180169685]
We introduce and clarify two basic concepts, interpretations and interpretability, that are often confused with each other.
We elaborate on the design of several recent interpretation algorithms from different perspectives by proposing a new taxonomy.
We summarize existing work on evaluating models' interpretability using "trustworthy" interpretation algorithms.
arXiv Detail & Related papers (2021-03-19T08:40:30Z)
- Distilling Interpretable Models into Human-Readable Code [71.11328360614479]
Human-readability is an important and desirable standard for machine-learned model interpretability.
We propose to train interpretable models using conventional methods, and then distill them into concise, human-readable code.
We describe a piecewise-linear curve-fitting algorithm that produces high-quality results efficiently and reliably across a broad range of use cases (a generic sketch of piecewise-linear fitting appears after this list).
arXiv Detail & Related papers (2021-01-21T01:46:36Z)
- To what extent do human explanations of model behavior align with actual model behavior? [91.67905128825402]
We investigated the extent to which human-generated explanations of models' inference decisions align with how models actually make these decisions.
We defined two alignment metrics that quantify how well natural language human explanations align with model sensitivity to input words.
We find that a model's alignment with human explanations is not predicted by the model's accuracy on NLI.
arXiv Detail & Related papers (2020-12-24T17:40:06Z)
- Human-interpretable model explainability on high-dimensional data [8.574682463936007]
We introduce a framework for human-interpretable explainability on high-dimensional data, consisting of two modules.
First, we apply a semantically meaningful latent representation, both to reduce the raw dimensionality of the data, and to ensure its human interpretability.
Second, we adapt the Shapley paradigm for model-agnostic explainability to operate on these latent features. This leads to interpretable model explanations that are both theoretically controlled and computationally tractable (a minimal latent-Shapley sketch also appears after this list).
arXiv Detail & Related papers (2020-10-14T20:06:28Z)
- Are Visual Explanations Useful? A Case Study in Model-in-the-Loop Prediction [49.254162397086006]
We study explanations based on visual saliency in an image-based age prediction task.
We find that presenting model predictions improves human accuracy.
However, explanations of various kinds fail to significantly alter human accuracy or trust in the model.
arXiv Detail & Related papers (2020-07-23T20:39:40Z)
- Learning a Formula of Interpretability to Learn Interpretable Formulas [1.7616042687330642]
We show that an ML model of non-objective Proxies of Human Interpretability can be learned from human feedback.
We show this for evolutionary symbolic regression.
Our approach represents an important stepping stone for the design of next-generation interpretable (evolutionary) ML algorithms.
arXiv Detail & Related papers (2020-04-23T13:59:49Z)
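
The "Distilling Interpretable Models into Human-Readable Code" entry above mentions a piecewise-linear curve-fitting algorithm without further detail. The sketch below is a generic least-squares fit of a continuous piecewise-linear curve with fixed knots, written for illustration; it is not that paper's algorithm, and the function names are our own.

```python
import numpy as np

def fit_piecewise_linear(x, y, knots):
    """Least-squares fit of a continuous piecewise-linear curve.

    Hinge basis: f(x) = b0 + b1*x + sum_j c_j * max(0, x - k_j),
    so the curve is linear between consecutive knots and continuous
    everywhere.
    """
    # Design matrix: intercept, slope, and one hinge column per knot.
    X = np.column_stack(
        [np.ones_like(x), x] + [np.maximum(0.0, x - k) for k in knots]
    )
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

def predict_piecewise_linear(x, coef, knots):
    X = np.column_stack(
        [np.ones_like(x), x] + [np.maximum(0.0, x - k) for k in knots]
    )
    return X @ coef

# Toy usage: recover a curve with a kink at x = 4 from noisy samples.
rng = np.random.default_rng(0)
x = np.sort(rng.uniform(0.0, 10.0, 200))
y_true = np.where(x < 4.0, 2.0 * x, 8.0 + 0.5 * (x - 4.0))
y = y_true + rng.normal(0.0, 0.3, size=x.shape)

knots = np.linspace(1.0, 9.0, 8)  # fixed, evenly spaced knots
coef = fit_piecewise_linear(x, y, knots)
err = np.abs(predict_piecewise_linear(x, coef, knots) - y_true).mean()
print(f"mean absolute error vs. true curve: {err:.3f}")
```

Each fitted segment has an explicit slope and intercept, so the curve can be printed as a short if/elif chain, which is the sense in which such a distilled model is human-readable.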
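
Similarly, the "Human-interpretable model explainability on high-dimensional data" entry describes adapting the Shapley paradigm to a semantically meaningful latent representation. Below is a minimal sketch of that idea, assuming a latent dimensionality small enough for exact enumeration over coalitions; it is our illustration, and `predict`, `z`, and `baseline` are hypothetical stand-ins rather than that paper's API.

```python
import itertools
import math

import numpy as np

def shapley_latent(predict, z, baseline):
    """Exact Shapley values over the coordinates of a latent code.

    predict  : callable mapping a latent vector to a scalar model output
    z        : latent code of the instance being explained (1-D array)
    baseline : latent code representing "feature absent" (1-D array)
    Exact enumeration is exponential in len(z), so this is only
    feasible for low-dimensional latent representations.
    """
    d = len(z)
    phi = np.zeros(d)
    for i in range(d):
        others = [j for j in range(d) if j != i]
        for size in range(d):
            for coalition in itertools.combinations(others, size):
                # Standard Shapley weight: |S|! (d - |S| - 1)! / d!
                weight = (math.factorial(size)
                          * math.factorial(d - size - 1)
                          / math.factorial(d))
                idx = list(coalition)
                with_i = baseline.copy()
                without_i = baseline.copy()
                with_i[idx] = z[idx]
                without_i[idx] = z[idx]
                with_i[i] = z[i]
                phi[i] += weight * (predict(with_i) - predict(without_i))
    return phi

# Toy usage: a linear "model" on a 3-dimensional latent space, where the
# exact Shapley value of coordinate j is coef_j * (z_j - baseline_j).
predict = lambda v: 2.0 * v[0] - 1.0 * v[1] + 0.5 * v[2]
z = np.array([1.0, 2.0, 3.0])
baseline = np.zeros(3)
print(shapley_latent(predict, z, baseline))  # approx. [ 2.  -2.   1.5]
```

Because the attributions land on latent features rather than raw pixels, each Shapley value can be read against a semantically meaningful coordinate, which is what makes the resulting explanation human-interpretable on high-dimensional data.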