McXai: Local model-agnostic explanation as two games
- URL: http://arxiv.org/abs/2201.01044v1
- Date: Tue, 4 Jan 2022 09:02:48 GMT
- Title: McXai: Local model-agnostic explanation as two games
- Authors: Yiran Huang, Nicole Schaal, Michael Hefenbrock, Yexu Zhou, Till
Riedel, Likun Fang, Michael Beigl
- Abstract summary: This work introduces a reinforcement learning-based approach called Monte Carlo tree search for eXplainable Artificial Intelligent (McXai) to explain the decisions of any black-box classification model (classifier).
Our experiments show that the features found by our method are more informative with respect to classification than those found by classical approaches like LIME and SHAP.
- Score: 5.2229999775211216
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: To this day, a variety of approaches for providing local interpretability of
black-box machine learning models have been introduced. Unfortunately, all of
these methods suffer from one or more of the following deficiencies: they are
difficult to understand themselves, they work on a per-feature basis and ignore
the dependencies between features, and/or they only focus on those features
asserting the decision made by the model. To address these points,
this work introduces a reinforcement learning-based approach called Monte Carlo
tree search for eXplainable Artificial Intelligent (McXai) to explain the
decisions of any black-box classification model (classifier). Our method
leverages Monte Carlo tree search and models the process of generating
explanations as two games. In one game, the reward is maximized by finding
feature sets that support the decision of the classifier, while in the second
game, finding feature sets leading to alternative decisions maximizes the
reward. The result is a human-friendly representation in the form of a tree
structure, in which each node represents a set of features to be studied, with
smaller explanations at the top of the tree. Our experiments show that the
features found by our method are more informative with respect to
classification than those found by classical approaches like LIME and SHAP.
Furthermore, by also
identifying misleading features, our approach is able to guide towards improved
robustness of the black-box model in many situations.
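The abstract only describes the two reward definitions in words, so a rough illustration may help. The following is a minimal sketch, not the authors' implementation: it assumes a tabular scikit-learn classifier, simulates "removing" features by substituting background (mean) values, and replaces the Monte Carlo tree search with a brute-force search over small feature subsets. Only the two game rewards follow the description above.

```python
import itertools

import numpy as np
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier

# A toy "black box"; the explained model is arbitrary since the method is model-agnostic.
X, y = load_iris(return_X_y=True)
clf = RandomForestClassifier(random_state=0).fit(X, y)
background = X.mean(axis=0)  # reference values used to simulate feature removal

def masked_proba(x, subset):
    """Class probabilities after replacing the features in `subset` with background values."""
    x_masked = x.copy()
    x_masked[list(subset)] = background[list(subset)]
    return clf.predict_proba(x_masked.reshape(1, -1))[0]

def game_reward(x, subset, original_class, alternative_class=None):
    """Game 1 (alternative_class=None): reward is the probability drop of the original
    class when `subset` is removed, i.e. how strongly these features support the decision.
    Game 2: reward is the probability gain of a chosen alternative class, i.e. how strongly
    these features point towards an alternative decision."""
    p_full = clf.predict_proba(x.reshape(1, -1))[0]
    p_masked = masked_proba(x, subset)
    if alternative_class is None:
        return p_full[original_class] - p_masked[original_class]
    return p_masked[alternative_class] - p_full[alternative_class]

x = X[0]
c = int(clf.predict(x.reshape(1, -1))[0])
# Brute-force search over 1- and 2-feature subsets stands in for the paper's MCTS.
subsets = [s for r in (1, 2) for s in itertools.combinations(range(X.shape[1]), r)]
supporting = max(subsets, key=lambda s: game_reward(x, s, c))
misleading = max(subsets, key=lambda s: game_reward(x, s, c, alternative_class=(c + 1) % 3))
print("features supporting the decision:", supporting)
print("features pushing towards an alternative decision:", misleading)
```

In the paper, these rewards drive two separate tree searches whose resulting trees serve as the explanation; the sketch above only reproduces the reward side of that construction.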
Related papers
- Explaining the Model and Feature Dependencies by Decomposition of the
Shapley Value [3.0655581300025996]
Shapley values have become one of the go-to methods to explain complex models to end-users.
One downside is that they always require outputs of the model when some features are missing.
This however introduces a non-trivial choice: do we condition on the unknown features or not?
We propose a new algorithmic approach to combine both explanations, removing the burden of choice and enhancing the explanatory power of Shapley values.
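The "condition on the unknown features or not" choice can be made concrete with a small numeric sketch. This is a toy illustration of the two value functions behind that choice (marginal vs. conditional removal of a feature), not the paper's decomposition algorithm; the correlated Gaussian data and the additive model are assumptions made only for this example.

```python
import numpy as np

rng = np.random.default_rng(0)
# two strongly correlated features
X = rng.multivariate_normal([0.0, 0.0], [[1.0, 0.9], [0.9, 1.0]], size=5000)
f = lambda z: z[..., 0] + z[..., 1]   # the "model"
x = np.array([1.0, 1.0])              # instance to explain

# value of the coalition {feature 0}, i.e. feature 1 is "missing":
# marginal / unconditional: average feature 1 over the whole data distribution
marginal = f(np.column_stack([np.full(len(X), x[0]), X[:, 1]])).mean()
# conditional: average feature 1 only over points whose feature 0 is close to x[0]
near = np.abs(X[:, 0] - x[0]) < 0.1
conditional = f(np.column_stack([np.full(near.sum(), x[0]), X[near, 1]])).mean()
print(f"marginal v({{0}}) = {marginal:.2f}, conditional v({{0}}) = {conditional:.2f}")
# With correlated features the two value functions disagree, which is exactly the
# choice the paper proposes to sidestep by combining both explanations.
```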
arXiv Detail & Related papers (2023-06-19T12:20:23Z) - An Interpretable Loan Credit Evaluation Method Based on Rule
Representation Learner [8.08640000394814]
We design an intrinsically interpretable model based on RRL (Rule Representation Learner) for the Lending Club dataset.
During training, we adopt tricks from previous research to effectively train the binary weights.
Our model is used to test the correctness of the explanations generated by the post-hoc method.
arXiv Detail & Related papers (2023-04-03T05:55:04Z) - Symbolic Metamodels for Interpreting Black-boxes Using Primitive
Functions [15.727276506140878]
One approach for interpreting black-box machine learning models is to find a global approximation of the model using simple interpretable functions.
In this work, we propose a new method for finding interpretable metamodels.
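As a loose stand-in for the general idea (global approximation of a black box with simple, interpretable primitive functions), the following sketch fits a linear combination of primitives by ordinary least squares. The primitive dictionary and the fitting procedure are assumptions of this sketch, not the paper's metamodel search.

```python
import numpy as np

black_box = lambda x: np.exp(-x) * np.sin(3 * x)   # stand-in black-box function
xs = np.linspace(0.0, 3.0, 200)
primitives = {"1": np.ones_like(xs), "x": xs, "x^2": xs**2,
              "sin(3x)": np.sin(3 * xs), "exp(-x)": np.exp(-xs),
              "exp(-x)*sin(3x)": np.exp(-xs) * np.sin(3 * xs)}
A = np.column_stack(list(primitives.values()))
coef, *_ = np.linalg.lstsq(A, black_box(xs), rcond=None)
# The true form happens to be among the primitives, so the fit recovers it exactly
# and the result is a short, human-readable metamodel.
for name, c in zip(primitives, coef):
    if abs(c) > 1e-6:
        print(f"{c:+.3f} * {name}")
```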
arXiv Detail & Related papers (2023-02-09T17:30:43Z) - Learning to Scaffold: Optimizing Model Explanations for Teaching [74.25464914078826]
We train models on three natural language processing and computer vision tasks.
We find that students trained with explanations extracted with our framework are able to simulate the teacher significantly more effectively than ones produced with previous methods.
arXiv Detail & Related papers (2022-04-22T16:43:39Z) - Reinforcement Explanation Learning [4.852320309766702]
Black-box methods to generate saliency maps are particularly interesting due to the fact that they do not utilize the internals of the model to explain the decision.
We formulate saliency map generation as a sequential search problem and leverage upon Reinforcement Learning (RL) to accumulate evidence from input images.
Experiments on three benchmark datasets demonstrate the superiority of the proposed approach in inference time over state-of-the-art methods without hurting performance.
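A highly simplified sketch of the "saliency map generation as sequential search" framing follows: a greedy policy and a toy scoring function stand in for the paper's RL agent and for a real image classifier, so everything here except the sequential masking-and-accumulation loop is an assumption.

```python
import numpy as np

def score(image):
    """Stand-in for a black-box class score (this toy 'model' only looks at the centre)."""
    return image[8:16, 8:16].mean()

def sequential_saliency(image, cell=8, steps=4):
    """Greedily mask the grid cell whose removal drops the score most, accumulating evidence."""
    saliency = np.zeros_like(image)
    current = image.copy()
    for _ in range(steps):
        base = score(current)
        best, best_drop = None, -np.inf
        for i in range(0, image.shape[0], cell):
            for j in range(0, image.shape[1], cell):
                masked = current.copy()
                masked[i:i + cell, j:j + cell] = 0.0
                drop = base - score(masked)
                if drop > best_drop:
                    best, best_drop = (i, j), drop
        i, j = best
        saliency[i:i + cell, j:j + cell] += best_drop  # accumulated evidence for this region
        current[i:i + cell, j:j + cell] = 0.0
    return saliency

img = np.random.default_rng(0).random((32, 32))
sal = sequential_saliency(img)
print("most salient cell starts at:", np.unravel_index(sal.argmax(), sal.shape))
```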
arXiv Detail & Related papers (2021-11-26T10:20:01Z) - Search Methods for Sufficient, Socially-Aligned Feature Importance
Explanations with In-Distribution Counterfactuals [72.00815192668193]
Feature importance (FI) estimates are a popular form of explanation, and they are commonly created and evaluated by computing the change in model confidence caused by removing certain input features at test time.
We study several under-explored dimensions of FI-based explanations, providing conceptual and empirical improvements for this form of explanation.
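The removal-based FI estimate described in the first sentence can be written down in a few lines. This sketch uses mean substitution as the removal operator; the paper argues for in-distribution counterfactual removals instead, so the baseline choice here is an assumption of the sketch.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier

X, y = load_breast_cancer(return_X_y=True)
clf = RandomForestClassifier(random_state=0).fit(X, y)
baseline = X.mean(axis=0)  # "removal" = replace the feature with the training mean (assumption)

def removal_importance(x):
    """FI_i = confidence in the predicted class minus the confidence after removing feature i."""
    c = clf.predict([x])[0]
    p_full = clf.predict_proba([x])[0, c]
    scores = []
    for i in range(len(x)):
        x_removed = x.copy()
        x_removed[i] = baseline[i]
        scores.append(p_full - clf.predict_proba([x_removed])[0, c])
    return np.array(scores)

print("top features by removal-based importance:",
      np.argsort(removal_importance(X[0]))[::-1][:5])
```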
arXiv Detail & Related papers (2021-06-01T20:36:48Z) - Contrastive Explanations for Model Interpretability [77.92370750072831]
We propose a methodology to produce contrastive explanations for classification models.
Our method is based on projecting model representation to a latent space.
Our findings shed light on the ability of label-contrastive explanations to provide a more accurate and finer-grained interpretability of a model's decision.
arXiv Detail & Related papers (2021-03-02T00:36:45Z) - Learning outside the Black-Box: The pursuit of interpretable models [78.32475359554395]
This paper proposes an algorithm that produces a continuous global interpretation of any given continuous black-box function.
Our interpretation represents a leap forward from the previous state of the art.
arXiv Detail & Related papers (2020-11-17T12:39:44Z) - The Struggles of Feature-Based Explanations: Shapley Values vs. Minimal
Sufficient Subsets [61.66584140190247]
We show that feature-based explanations pose problems even for explaining trivial models.
We show that two popular classes of explainers, Shapley explainers and minimal sufficient subsets explainers, target fundamentally different types of ground-truth explanations.
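The contrast can be reproduced on a trivial model in a few lines. The OR model, the zero baseline, and the brute-force enumeration below are assumptions of this toy sketch, not taken from the paper.

```python
from itertools import chain, combinations

features = [0, 1]
f = lambda z: int(z[0] or z[1])  # trivial model: logical OR of two binary features
x = (1, 1)                       # instance to explain
baseline = 0                     # "missing" features are set to 0 (an assumption)

def value(subset):
    """Model output when only the features in `subset` are known."""
    return f([x[i] if i in subset else baseline for i in features])

def shapley(i):
    others = [j for j in features if j != i]
    coalitions = chain.from_iterable(combinations(others, r) for r in range(len(others) + 1))
    # with two players, every coalition carries weight 1/2
    return sum(0.5 * (value(set(S) | {i}) - value(set(S))) for S in coalitions)

sufficient = [set(S) for r in (1, 2) for S in combinations(features, r) if value(set(S)) == f(x)]
minimal = [S for S in sufficient if not any(T < S for T in sufficient)]
print("Shapley values:", [shapley(i) for i in features])  # credit is shared: [0.5, 0.5]
print("minimal sufficient subsets:", minimal)             # either feature alone suffices: [{0}, {1}]
```

Even on this trivial model, Shapley values spread credit across both features while the minimal-sufficient-subset view singles out each feature on its own, illustrating the two different ground-truth notions the paper contrasts.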
arXiv Detail & Related papers (2020-09-23T09:45:23Z) - Deducing neighborhoods of classes from a fitted model [68.8204255655161]
In this article, a new kind of interpretable machine learning method is presented.
It can help to understand the partitioning of the feature space into predicted classes in a classification model using quantile shifts.
Basically, real data points (or specific points of interest) are used, and the changes in the prediction after slightly raising or lowering specific features are observed.
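A minimal sketch of that perturb-and-observe idea follows. The dataset, classifier, and fixed quantile step are assumptions made for illustration; the paper's actual quantile-shift procedure for delineating class neighborhoods is not reproduced here.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier

X, y = load_iris(return_X_y=True)
clf = RandomForestClassifier(random_state=0).fit(X, y)

x = X[70]        # a real data point of interest
step = 0.10      # size of the quantile shift (an assumption of this sketch)
base_class = clf.predict([x])[0]

for j in range(X.shape[1]):
    q = (X[:, j] <= x[j]).mean()  # empirical quantile of x_j in the training data
    for direction in (-1, +1):
        shifted = x.copy()
        shifted[j] = np.quantile(X[:, j], np.clip(q + direction * step, 0.0, 1.0))
        print(f"feature {j}, quantile shift {direction:+d}: "
              f"class {base_class} -> {clf.predict([shifted])[0]}")
```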
arXiv Detail & Related papers (2020-09-11T16:35:53Z) - Solve Traveling Salesman Problem by Monte Carlo Tree Search and Deep
Neural Network [8.19063619210761]
We present a self-learning approach that combines deep reinforcement learning and Monte Carlo tree search to solve the traveling salesman problem.
Experimental results show that the proposed method performs favorably against other methods in small-to-medium problem settings.
It shows performance comparable to the state of the art in large problem settings.
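The following is only a flat Monte Carlo rollout search on a tiny random instance, illustrating the search component in isolation; the paper additionally guides the tree search with a learned deep neural network, which this sketch does not include.

```python
import numpy as np

rng = np.random.default_rng(0)
cities = rng.random((8, 2))  # 8 random cities in the unit square
dist = np.linalg.norm(cities[:, None] - cities[None, :], axis=-1)

def tour_length(tour):
    """Total length of a closed tour visiting every city once."""
    return sum(dist[tour[i], tour[(i + 1) % len(tour)]] for i in range(len(tour)))

def rollout(prefix):
    """Complete a partial tour with a random ordering of the remaining cities."""
    rest = [c for c in range(len(cities)) if c not in prefix]
    rng.shuffle(rest)
    return prefix + rest

def monte_carlo_tsp(n_rollouts=200):
    tour = [0]  # build the tour city by city
    while len(tour) < len(cities):
        candidates = [c for c in range(len(cities)) if c not in tour]
        # choose the next city whose random completions are shortest on average
        best = min(candidates, key=lambda c: np.mean(
            [tour_length(rollout(tour + [c])) for _ in range(n_rollouts)]))
        tour.append(best)
    return tour

tour = monte_carlo_tsp()
print("tour:", tour, "length:", round(tour_length(tour), 3))
```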
arXiv Detail & Related papers (2020-05-14T11:36:40Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the content (including all information) and is not responsible for any consequences of its use.