Comparing Explanation Methods for Traditional Machine Learning Models
Part 1: An Overview of Current Methods and Quantifying Their Disagreement
- URL: http://arxiv.org/abs/2211.08943v1
- Date: Wed, 16 Nov 2022 14:45:16 GMT
- Title: Comparing Explanation Methods for Traditional Machine Learning Models
Part 1: An Overview of Current Methods and Quantifying Their Disagreement
- Authors: Montgomery Flora, Corey Potvin, Amy McGovern, Shawn Handler
- Abstract summary: This study distinguishes explainability from interpretability, local from global explainability, and feature importance from feature relevance.
We demonstrate and visualize different explanation methods, show how to interpret them, and provide a complete Python package (scikit-explain) so that future researchers can explore these products.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: With increasing interest in explaining machine learning (ML) models, the
first part of this two-part study synthesizes recent research on methods for
explaining global and local aspects of ML models. This study distinguishes
explainability from interpretability, local from global explainability, and
feature importance from feature relevance. We demonstrate and visualize
different explanation methods, show how to interpret them, and provide a
complete Python package (scikit-explain) so that future researchers can explore these
products. We also highlight the frequent disagreement between explanation
methods for feature rankings and feature effects and provide practical advice
for dealing with these disagreements. We used ML models developed for severe
weather prediction and sub-freezing road surface temperature prediction to
generalize the behavior of the different explanation methods. For feature
rankings, there is substantially more agreement on the set of top features
(e.g., on average, two methods agree on 6 of the top 10 features) than on
specific rankings (on average, two methods only agree on the ranks of 2-3
features in the set of top 10 features). On the other hand, two feature effect
curves from different methods are in high agreement as long as the phase space
is well sampled. Finally, a lesser-known method, tree interpreter, was found
to be comparable to SHAP for feature effects; given the widespread use of
random forests in the geosciences and the computational ease of tree
interpreter, we recommend that it be explored in future research.
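The agreement statistics quoted in the abstract can be reproduced for any pair of global explanation methods. The following is a minimal sketch, not the scikit-explain implementation: it ranks features by scikit-learn permutation importance and by mean absolute SHAP value (assuming the shap package is installed), then counts how many of the top 10 features the two rankings share and how many sit at exactly the same rank. The dataset, model, and variable names are illustrative only.

```python
# Minimal sketch (not the scikit-explain implementation): compare the feature
# rankings produced by permutation importance and by mean |SHAP| values, then
# quantify their agreement on (a) the set of top-10 features and (b) exact ranks.
import numpy as np
import shap  # assumes the shap package is installed
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.inspection import permutation_importance

X, y = make_regression(n_samples=500, n_features=20, noise=0.1, random_state=0)
model = RandomForestRegressor(n_estimators=100, random_state=0).fit(X, y)

# Ranking 1: multi-pass permutation importance (higher = more important).
perm = permutation_importance(model, X, y, n_repeats=10, random_state=0)
perm_rank = np.argsort(perm.importances_mean)[::-1]

# Ranking 2: global importance as the mean absolute SHAP value per feature.
shap_values = shap.TreeExplainer(model).shap_values(X)
shap_rank = np.argsort(np.abs(shap_values).mean(axis=0))[::-1]

K = 10
top_perm, top_shap = perm_rank[:K], shap_rank[:K]

# (a) Set agreement: how many of the top-K features are shared, ignoring order.
set_agreement = len(set(top_perm) & set(top_shap))

# (b) Rank agreement: how many features occupy exactly the same position.
rank_agreement = int(np.sum(top_perm == top_shap))

print(f"Top-{K} set agreement:  {set_agreement} / {K}")
print(f"Top-{K} rank agreement: {rank_agreement} / {K}")
```

Run on a pair of methods like this, the set-agreement count is typically well above the exact rank-agreement count, which mirrors the pattern reported in the abstract (roughly 6 of 10 shared features versus only 2-3 identical ranks).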
Related papers
- GLEAMS: Bridging the Gap Between Local and Global Explanations [6.329021279685856]
We propose GLEAMS, a novel method that partitions the input space and learns an interpretable model within each sub-region.
We demonstrate GLEAMS' effectiveness on both synthetic and real-world data, highlighting its desirable properties and human-understandable insights.
arXiv Detail & Related papers (2024-08-09T13:30:37Z)
- MOUNTAINEER: Topology-Driven Visual Analytics for Comparing Local Explanations [6.835413642522898]
Topological Data Analysis (TDA) can be an effective method in this domain since it can be used to transform attributions into uniform graph representations.
We present a novel topology-driven visual analytics tool, Mountaineer, that allows ML practitioners to interactively analyze and compare these representations.
We show how Mountaineer enabled us to compare black-box ML explanations and to discern the regions of, and causes of, disagreement between different explanations.
arXiv Detail & Related papers (2024-06-21T19:28:50Z)
- Explainability for Machine Learning Models: From Data Adaptability to User Perception [0.8702432681310401]
This thesis explores the generation of local explanations for already deployed machine learning models.
It aims to identify optimal conditions for producing meaningful explanations considering both data and user requirements.
arXiv Detail & Related papers (2024-02-16T18:44:37Z)
- Comparing Explanation Methods for Traditional Machine Learning Models Part 2: Quantifying Model Explainability Faithfulness and Improvements with Dimensionality Reduction [0.0]
"faithfulness" or "fidelity" refer to the correspondence between the assigned feature importance and the contribution of the feature to model performance.
This study is one of the first to quantify the improvement in explainability from limiting correlated features and knowing the relative fidelity of different explainability methods.
arXiv Detail & Related papers (2022-11-18T17:15:59Z)
- An Additive Instance-Wise Approach to Multi-class Model Interpretation [53.87578024052922]
Interpretable machine learning offers insights into what factors drive a certain prediction of a black-box system.
Existing methods mainly focus on selecting explanatory input features, which follow either locally additive or instance-wise approaches.
This work exploits the strengths of both methods and proposes a global framework for learning local explanations simultaneously for multiple target classes.
arXiv Detail & Related papers (2022-07-07T06:50:27Z)
- Learning outside the Black-Box: The pursuit of interpretable models [78.32475359554395]
This paper proposes an algorithm that produces a continuous global interpretation of any given continuous black-box function.
Our interpretation represents a leap forward from the previous state of the art.
arXiv Detail & Related papers (2020-11-17T12:39:44Z)
- Interpretable Multi-dataset Evaluation for Named Entity Recognition [110.64368106131062]
We present a general methodology for interpretable evaluation for the named entity recognition (NER) task.
The proposed evaluation method enables us to interpret the differences in models and datasets, as well as the interplay between them.
By making our analysis tool available, we make it easy for future researchers to run similar analyses and drive progress in this area.
arXiv Detail & Related papers (2020-11-13T10:53:27Z)
- Towards Unifying Feature Attribution and Counterfactual Explanations: Different Means to the Same End [17.226134854746267]
We present a method to generate feature attribution explanations from a set of counterfactual examples.
We show how counterfactual examples can be used to evaluate the goodness of an attribution-based explanation in terms of its necessity and sufficiency.
arXiv Detail & Related papers (2020-11-10T05:41:43Z)
- Deducing neighborhoods of classes from a fitted model [68.8204255655161]
This article presents a new kind of interpretable machine learning method.
It can help to understand how a classification model partitions the feature space into predicted classes, using quantile shifts.
In essence, real data points (or specific points of interest) are perturbed by slightly raising or lowering specific features, and the resulting changes in the prediction are observed (a loose sketch of this idea appears after this list).
arXiv Detail & Related papers (2020-09-11T16:35:53Z)
- Evaluating Explainable AI: Which Algorithmic Explanations Help Users Predict Model Behavior? [97.77183117452235]
We carry out human subject tests to isolate the effect of algorithmic explanations on model interpretability.
Clear evidence of method effectiveness is found in very few cases.
Our results provide the first reliable and comprehensive estimates of how explanations influence simulatability.
arXiv Detail & Related papers (2020-05-04T20:35:17Z)
- There and Back Again: Revisiting Backpropagation Saliency Methods [87.40330595283969]
Saliency methods seek to explain the predictions of a model by producing an importance map across each input sample.
A popular class of such methods is based on backpropagating a signal and analyzing the resulting gradient.
We propose a single framework under which several such methods can be unified.
arXiv Detail & Related papers (2020-04-06T17:58:08Z)
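The Part 2 entry above defines fidelity as the correspondence between the importance an explanation assigns to a feature and that feature's actual contribution to model performance. Below is a minimal sketch of one common fidelity-style check, not the procedure used in that paper: permute one feature at a time and record how much the model's skill degrades; features that a faithful explanation ranks highly should produce larger drops. The dataset, metric, and model choices are assumptions made for illustration.

```python
# Minimal sketch of a permutation-based fidelity check (illustrative only; not
# the exact procedure of "Comparing Explanation Methods ... Part 2"): shuffle
# one feature at a time and record the resulting drop in model skill. Features
# that a faithful explanation method ranks highly should show larger drops.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=1000, n_features=15, n_informative=5, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
model = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_train, y_train)

base_auc = roc_auc_score(y_test, model.predict_proba(X_test)[:, 1])

rng = np.random.default_rng(0)
drops = []
for j in range(X_test.shape[1]):
    X_perm = X_test.copy()
    rng.shuffle(X_perm[:, j])        # destroy the feature's relationship to the target
    auc = roc_auc_score(y_test, model.predict_proba(X_perm)[:, 1])
    drops.append(base_auc - auc)     # larger drop = larger contribution to model skill

# Report the five features whose permutation hurts the model the most.
for j in np.argsort(drops)[::-1][:5]:
    print(f"feature {j:2d}: AUC drop = {drops[j]:.4f}")
```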
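The "Deducing neighborhoods of classes from a fitted model" entry above describes perturbing real data points by slightly raising or lowering individual features and watching how the prediction responds. The sketch below is a loose illustration of that idea, not the method from the paper: it moves one feature of a single point up or down by an assumed 5-percentile step within the feature's empirical distribution and prints the change in predicted probability.

```python
# Loose illustration of a quantile-shift perturbation (not the exact method of
# "Deducing neighborhoods of classes from a fitted model"): nudge each feature
# of a single real data point up/down by a small quantile step and record how
# the predicted probability responds.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=1000, n_features=8, n_informative=4, random_state=0)
model = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)

x0 = X[0].copy()                  # a real data point of interest
base_prob = model.predict_proba(x0.reshape(1, -1))[0, 1]

shift = 0.05                      # assumed quantile step of 5 percentage points
for j in range(X.shape[1]):
    # Empirical quantile of the point's current value for this feature.
    q = (X[:, j] <= x0[j]).mean()
    for direction in (-1, +1):
        x_new = x0.copy()
        q_new = np.clip(q + direction * shift, 0.0, 1.0)
        x_new[j] = np.quantile(X[:, j], q_new)   # shift the feature by +/- 5 quantile points
        prob = model.predict_proba(x_new.reshape(1, -1))[0, 1]
        print(f"feature {j}, shift {direction:+d}: "
              f"P(class 1) changes {base_prob:.3f} -> {prob:.3f}")
```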
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.