McXai: Local model-agnostic explanation as two games
- URL: http://arxiv.org/abs/2201.01044v1
- Date: Tue, 4 Jan 2022 09:02:48 GMT
- Title: McXai: Local model-agnostic explanation as two games
- Authors: Yiran Huang, Nicole Schaal, Michael Hefenbrock, Yexu Zhou, Till
Riedel, Likun Fang, Michael Beigl
- Abstract summary: This work introduces a reinforcement learning-based approach called Monte Carlo tree search for eXplainable Artificial Intelligent (McXai) to explain the decisions of any black-box classification model (classifier).
Our experiments show that the features found by our method are more informative with respect to classification than those found by classical approaches like LIME and SHAP.
- Score: 5.2229999775211216
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: To this day, a variety of approaches for providing local interpretability of
black-box machine learning models have been introduced. Unfortunately, all of
these methods suffer from one or more of the following deficiencies: they are
difficult to understand themselves, they work on a per-feature basis and ignore
the dependencies between features, and/or they only focus on those features
asserting the decision made by the model. To address these points,
this work introduces a reinforcement learning-based approach called Monte Carlo
tree search for eXplainable Artificial Intelligent (McXai) to explain the
decisions of any black-box classification model (classifier). Our method
leverages Monte Carlo tree search and models the process of generating
explanations as two games. In one game, the reward is maximized by finding
feature sets that support the decision of the classifier, while in the second
game, finding feature sets leading to alternative decisions maximizes the
reward. The result is a human-friendly representation in the form of a tree
structure, in which each node represents a set of features to be studied, with
smaller explanations at the top of the tree. Our experiments show that the
features found by our method are more informative with respect to
classification than those found by classical approaches like LIME and SHAP.
Furthermore, by also
identifying misleading features, our approach is able to guide towards improved
robustness of the black-box model in many situations.
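The abstract only describes the two reward definitions in words, so a rough illustration may help. The following is a minimal sketch, not the authors' implementation: it assumes a tabular scikit-learn classifier, simulates "removing" features by substituting background (mean) values, and replaces the Monte Carlo tree search with a brute-force search over small feature subsets. Only the two game rewards follow the description above.

```python
import itertools

import numpy as np
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier

# A toy "black box"; the explained model is arbitrary since the method is model-agnostic.
X, y = load_iris(return_X_y=True)
clf = RandomForestClassifier(random_state=0).fit(X, y)
background = X.mean(axis=0)  # reference values used to simulate feature removal

def masked_proba(x, subset):
    """Class probabilities after replacing the features in `subset` with background values."""
    x_masked = x.copy()
    x_masked[list(subset)] = background[list(subset)]
    return clf.predict_proba(x_masked.reshape(1, -1))[0]

def game_reward(x, subset, original_class, alternative_class=None):
    """Game 1 (alternative_class=None): reward is the probability drop of the original
    class when `subset` is removed, i.e. how strongly these features support the decision.
    Game 2: reward is the probability gain of a chosen alternative class, i.e. how strongly
    these features point towards an alternative decision."""
    p_full = clf.predict_proba(x.reshape(1, -1))[0]
    p_masked = masked_proba(x, subset)
    if alternative_class is None:
        return p_full[original_class] - p_masked[original_class]
    return p_masked[alternative_class] - p_full[alternative_class]

x = X[0]
c = int(clf.predict(x.reshape(1, -1))[0])
# Brute-force search over 1- and 2-feature subsets stands in for the paper's MCTS.
subsets = [s for r in (1, 2) for s in itertools.combinations(range(X.shape[1]), r)]
supporting = max(subsets, key=lambda s: game_reward(x, s, c))
misleading = max(subsets, key=lambda s: game_reward(x, s, c, alternative_class=(c + 1) % 3))
print("features supporting the decision:", supporting)
print("features pushing towards an alternative decision:", misleading)
```

In the paper, these rewards drive two separate tree searches whose resulting trees serve as the explanation; the sketch above only reproduces the reward side of that construction.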
Related papers
- Explaining the Model and Feature Dependencies by Decomposition of the
Shapley Value [3.0655581300025996]
Shapley values have become one of the go-to methods to explain complex models to end-users.
One downside is that they always require outputs of the model when some features are missing.
This however introduces a non-trivial choice: do we condition on the unknown features or not?
We propose a new algorithmic approach to combine both explanations, removing the burden of choice and enhancing the explanatory power of Shapley values.
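The "condition on the unknown features or not" choice can be made concrete with a small numeric sketch. This is a toy illustration of the two value functions behind that choice (marginal vs. conditional removal of a feature), not the paper's decomposition algorithm; the correlated Gaussian data and the additive model are assumptions made only for this example.

```python
import numpy as np

rng = np.random.default_rng(0)
# two strongly correlated features
X = rng.multivariate_normal([0.0, 0.0], [[1.0, 0.9], [0.9, 1.0]], size=5000)
f = lambda z: z[..., 0] + z[..., 1]   # the "model"
x = np.array([1.0, 1.0])              # instance to explain

# value of the coalition {feature 0}, i.e. feature 1 is "missing":
# marginal / unconditional: average feature 1 over the whole data distribution
marginal = f(np.column_stack([np.full(len(X), x[0]), X[:, 1]])).mean()
# conditional: average feature 1 only over points whose feature 0 is close to x[0]
near = np.abs(X[:, 0] - x[0]) < 0.1
conditional = f(np.column_stack([np.full(near.sum(), x[0]), X[near, 1]])).mean()
print(f"marginal v({{0}}) = {marginal:.2f}, conditional v({{0}}) = {conditional:.2f}")
# With correlated features the two value functions disagree, which is exactly the
# choice the paper proposes to sidestep by combining both explanations.
```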
arXiv Detail & Related papers (2023-06-19T12:20:23Z) - An Interpretable Loan Credit Evaluation Method Based on Rule
Representation Learner [8.08640000394814]
We design an intrinsically interpretable model based on RRL (Rule Representation Learner) for the Lending Club dataset.
During training, we adopt tricks from previous research to effectively train the binary weights.
Our model is used to test the correctness of the explanations generated by the post-hoc method.
arXiv Detail & Related papers (2023-04-03T05:55:04Z) - Symbolic Metamodels for Interpreting Black-boxes Using Primitive
Functions [15.727276506140878]
One approach for interpreting black-box machine learning models is to find a global approximation of the model using simple interpretable functions.
In this work, we propose a new method for finding interpretable metamodels.
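As a loose stand-in for the general idea (global approximation of a black box with simple, interpretable primitive functions), the following sketch fits a linear combination of primitives by ordinary least squares. The primitive dictionary and the fitting procedure are assumptions of this sketch, not the paper's metamodel search.

```python
import numpy as np

black_box = lambda x: np.exp(-x) * np.sin(3 * x)   # stand-in black-box function
xs = np.linspace(0.0, 3.0, 200)
primitives = {"1": np.ones_like(xs), "x": xs, "x^2": xs**2,
              "sin(3x)": np.sin(3 * xs), "exp(-x)": np.exp(-xs),
              "exp(-x)*sin(3x)": np.exp(-xs) * np.sin(3 * xs)}
A = np.column_stack(list(primitives.values()))
coef, *_ = np.linalg.lstsq(A, black_box(xs), rcond=None)
# The true form happens to be among the primitives, so the fit recovers it exactly
# and the result is a short, human-readable metamodel.
for name, c in zip(primitives, coef):
    if abs(c) > 1e-6:
        print(f"{c:+.3f} * {name}")
```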
arXiv Detail & Related papers (2023-02-09T17:30:43Z) - Learning to Scaffold: Optimizing Model Explanations for Teaching [74.25464914078826]
We train models on three natural language processing and computer vision tasks.
We find that students trained with explanations extracted with our framework are able to simulate the teacher significantly more effectively than ones produced with previous methods.
arXiv Detail & Related papers (2022-04-22T16:43:39Z) - Reinforcement Explanation Learning [4.852320309766702]
Black-box methods to generate saliency maps are particularly interesting due to the fact that they do not utilize the internals of the model to explain the decision.
We formulate saliency map generation as a sequential search problem and leverage upon Reinforcement Learning (RL) to accumulate evidence from input images.
Experiments on three benchmark datasets demonstrate the superiority of the proposed approach in inference time over state-of-the-art methods without hurting performance.
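A highly simplified sketch of the "saliency map generation as sequential search" framing follows: a greedy policy and a toy scoring function stand in for the paper's RL agent and for a real image classifier, so everything here except the sequential masking-and-accumulation loop is an assumption.

```python
import numpy as np

def score(image):
    """Stand-in for a black-box class score (this toy 'model' only looks at the centre)."""
    return image[8:16, 8:16].mean()

def sequential_saliency(image, cell=8, steps=4):
    """Greedily mask the grid cell whose removal drops the score most, accumulating evidence."""
    saliency = np.zeros_like(image)
    current = image.copy()
    for _ in range(steps):
        base = score(current)
        best, best_drop = None, -np.inf
        for i in range(0, image.shape[0], cell):
            for j in range(0, image.shape[1], cell):
                masked = current.copy()
                masked[i:i + cell, j:j + cell] = 0.0
                drop = base - score(masked)
                if drop > best_drop:
                    best, best_drop = (i, j), drop
        i, j = best
        saliency[i:i + cell, j:j + cell] += best_drop  # accumulated evidence for this region
        current[i:i + cell, j:j + cell] = 0.0
    return saliency

img = np.random.default_rng(0).random((32, 32))
sal = sequential_saliency(img)
print("most salient cell starts at:", np.unravel_index(sal.argmax(), sal.shape))
```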
arXiv Detail & Related papers (2021-11-26T10:20:01Z) - Search Methods for Sufficient, Socially-Aligned Feature Importance
Explanations with In-Distribution Counterfactuals [72.00815192668193]
Feature importance (FI) estimates are a popular form of explanation, and they are commonly created and evaluated by computing the change in model confidence caused by removing certain input features at test time.
We study several under-explored dimensions of FI-based explanations, providing conceptual and empirical improvements for this form of explanation.
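The removal-based FI estimate described in the first sentence can be written down in a few lines. This sketch uses mean substitution as the removal operator; the paper argues for in-distribution counterfactual removals instead, so the baseline choice here is an assumption of the sketch.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier

X, y = load_breast_cancer(return_X_y=True)
clf = RandomForestClassifier(random_state=0).fit(X, y)
baseline = X.mean(axis=0)  # "removal" = replace the feature with the training mean (assumption)

def removal_importance(x):
    """FI_i = confidence in the predicted class minus the confidence after removing feature i."""
    c = clf.predict([x])[0]
    p_full = clf.predict_proba([x])[0, c]
    scores = []
    for i in range(len(x)):
        x_removed = x.copy()
        x_removed[i] = baseline[i]
        scores.append(p_full - clf.predict_proba([x_removed])[0, c])
    return np.array(scores)

print("top features by removal-based importance:",
      np.argsort(removal_importance(X[0]))[::-1][:5])
```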
arXiv Detail & Related papers (2021-06-01T20:36:48Z) - Contrastive Explanations for Model Interpretability [77.92370750072831]
We propose a methodology to produce contrastive explanations for classification models.
Our method is based on projecting model representation to a latent space.
Our findings shed light on the ability of label-contrastive explanations to provide a more accurate and finer-grained interpretability of a model's decision.
arXiv Detail & Related papers (2021-03-02T00:36:45Z) - Learning outside the Black-Box: The pursuit of interpretable models [78.32475359554395]
This paper proposes an algorithm that produces a continuous global interpretation of any given continuous black-box function.
Our interpretation represents a leap forward from the previous state of the art.
arXiv Detail & Related papers (2020-11-17T12:39:44Z) - The Struggles of Feature-Based Explanations: Shapley Values vs. Minimal
Sufficient Subsets [61.66584140190247]
We show that feature-based explanations pose problems even for explaining trivial models.
We show that two popular classes of explainers, Shapley explainers and minimal sufficient subsets explainers, target fundamentally different types of ground-truth explanations.
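The contrast can be reproduced on a trivial model in a few lines. The OR model, the zero baseline, and the brute-force enumeration below are assumptions of this toy sketch, not taken from the paper.

```python
from itertools import chain, combinations

features = [0, 1]
f = lambda z: int(z[0] or z[1])  # trivial model: logical OR of two binary features
x = (1, 1)                       # instance to explain
baseline = 0                     # "missing" features are set to 0 (an assumption)

def value(subset):
    """Model output when only the features in `subset` are known."""
    return f([x[i] if i in subset else baseline for i in features])

def shapley(i):
    others = [j for j in features if j != i]
    coalitions = chain.from_iterable(combinations(others, r) for r in range(len(others) + 1))
    # with two players, every coalition carries weight 1/2
    return sum(0.5 * (value(set(S) | {i}) - value(set(S))) for S in coalitions)

sufficient = [set(S) for r in (1, 2) for S in combinations(features, r) if value(set(S)) == f(x)]
minimal = [S for S in sufficient if not any(T < S for T in sufficient)]
print("Shapley values:", [shapley(i) for i in features])  # credit is shared: [0.5, 0.5]
print("minimal sufficient subsets:", minimal)             # either feature alone suffices: [{0}, {1}]
```

Even on this trivial model, Shapley values spread credit across both features while the minimal-sufficient-subset view singles out each feature on its own, illustrating the two different ground-truth notions the paper contrasts.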
arXiv Detail & Related papers (2020-09-23T09:45:23Z) - Deducing neighborhoods of classes from a fitted model [68.8204255655161]
In this article, a new kind of interpretable machine learning method is presented.
It can help to understand the partitioning of the feature space into predicted classes in a classification model using quantile shifts.
Basically, real data points (or specific points of interest) are used, and the changes in the prediction after slightly raising or lowering specific features are observed.
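A minimal sketch of that perturb-and-observe idea follows. The dataset, classifier, and fixed quantile step are assumptions made for illustration; the paper's actual quantile-shift procedure for delineating class neighborhoods is not reproduced here.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier

X, y = load_iris(return_X_y=True)
clf = RandomForestClassifier(random_state=0).fit(X, y)

x = X[70]        # a real data point of interest
step = 0.10      # size of the quantile shift (an assumption of this sketch)
base_class = clf.predict([x])[0]

for j in range(X.shape[1]):
    q = (X[:, j] <= x[j]).mean()  # empirical quantile of x_j in the training data
    for direction in (-1, +1):
        shifted = x.copy()
        shifted[j] = np.quantile(X[:, j], np.clip(q + direction * step, 0.0, 1.0))
        print(f"feature {j}, quantile shift {direction:+d}: "
              f"class {base_class} -> {clf.predict([shifted])[0]}")
```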
arXiv Detail & Related papers (2020-09-11T16:35:53Z) - Solve Traveling Salesman Problem by Monte Carlo Tree Search and Deep
Neural Network [8.19063619210761]
We present a self-learning approach that combines deep reinforcement learning and Monte Carlo tree search to solve the traveling salesman problem.
Experimental results show that the proposed method performs favorably against other methods in small-to-medium problem settings.
It shows performance comparable to the state of the art in large problem settings.
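The following is only a flat Monte Carlo rollout search on a tiny random instance, illustrating the search component in isolation; the paper additionally guides the tree search with a learned deep neural network, which this sketch does not include.

```python
import numpy as np

rng = np.random.default_rng(0)
cities = rng.random((8, 2))  # 8 random cities in the unit square
dist = np.linalg.norm(cities[:, None] - cities[None, :], axis=-1)

def tour_length(tour):
    """Total length of a closed tour visiting every city once."""
    return sum(dist[tour[i], tour[(i + 1) % len(tour)]] for i in range(len(tour)))

def rollout(prefix):
    """Complete a partial tour with a random ordering of the remaining cities."""
    rest = [c for c in range(len(cities)) if c not in prefix]
    rng.shuffle(rest)
    return prefix + rest

def monte_carlo_tsp(n_rollouts=200):
    tour = [0]  # build the tour city by city
    while len(tour) < len(cities):
        candidates = [c for c in range(len(cities)) if c not in tour]
        # choose the next city whose random completions are shortest on average
        best = min(candidates, key=lambda c: np.mean(
            [tour_length(rollout(tour + [c])) for _ in range(n_rollouts)]))
        tour.append(best)
    return tour

tour = monte_carlo_tsp()
print("tour:", tour, "length:", round(tour_length(tour), 3))
```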
arXiv Detail & Related papers (2020-05-14T11:36:40Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the content (including all information) and is not responsible for any consequences of its use.