High Dimensional Model Explanations: an Axiomatic Approach
- URL: http://arxiv.org/abs/2006.08969v2
- Date: Mon, 29 Mar 2021 07:16:52 GMT
- Title: High Dimensional Model Explanations: an Axiomatic Approach
- Authors: Neel Patel, Martin Strobel, Yair Zick
- Abstract summary: Complex black-box machine learning models are regularly used in critical decision-making domains.
We propose a novel high-dimensional model explanation method that captures the joint effect of feature subsets.
- Score: 14.908684655206494
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Complex black-box machine learning models are regularly used in critical
decision-making domains. This has given rise to several calls for algorithmic
explainability. Many explanation algorithms proposed in the literature assign
importance to each feature individually. However, such explanations fail to
capture the joint effects of sets of features. Indeed, few works so far
formally analyze high-dimensional model explanations. In this paper, we propose
a novel high-dimensional model explanation method that captures the joint
effect of feature subsets.
We propose a new axiomatization for a generalization of the Banzhaf index;
our method can also be thought of as an approximation of a black-box model by a
higher-order polynomial. In other words, this work justifies the use of the
generalized Banzhaf index as a model explanation by showing that it uniquely
satisfies a set of natural desiderata and that it is the optimal local
approximation of a black-box model.
Our empirical evaluation highlights how the measure captures desirable
behavior, whereas measures that do not satisfy our axioms behave
unpredictably.
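The measure generalizes the Banzhaf index from single features to feature subsets: the value of a subset S is the expected discrete derivative of the model with respect to S, averaged over random coalitions of the remaining features. As a minimal sketch of this quantity (not the authors' implementation; the baseline-masking value function and the helper name `banzhaf_interaction` are illustrative assumptions):

```python
import itertools
import numpy as np

def banzhaf_interaction(f, x, baseline, S, n_samples=2000, seed=None):
    """Monte Carlo estimate of a Banzhaf-style interaction index for
    feature subset S at instance x.

    Assumed value function: v(T) = f(x with features outside T replaced
    by the baseline), a common masking choice in model explanation."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    baseline = np.asarray(baseline, dtype=float)
    S = list(S)
    rest = [i for i in range(len(x)) if i not in S]

    def v(T):
        z = baseline.copy()
        z[T] = x[T]  # coalition T keeps the instance's feature values
        return f(z)

    total = 0.0
    for _ in range(n_samples):
        # T ~ uniform over subsets of N \ S: include each feature w.p. 1/2.
        T = [i for i in rest if rng.random() < 0.5]
        # Discrete derivative of v w.r.t. S at T:
        #   sum over W subset of S of (-1)^{|S|-|W|} * v(T union W)
        for k in range(len(S) + 1):
            for W in itertools.combinations(S, k):
                total += (-1) ** (len(S) - k) * v(T + list(W))
    return total / n_samples

# Toy check: a purely multiplicative model has a genuine joint effect of
# features 0 and 1, while a purely additive model has none.
x, b = [1.0, 1.0, 1.0], [0.0, 0.0, 0.0]
print(banzhaf_interaction(lambda z: z[0] * z[1], x, b, S=[0, 1], seed=0))  # ~1.0
print(banzhaf_interaction(lambda z: z[0] + z[1], x, b, S=[0, 1], seed=0))  # ~0.0
```

The toy check illustrates the abstract's point: only the multiplicative model has a joint effect of the two features, and the subset measure captures it directly, whereas any additive model yields an interaction of exactly zero.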
Related papers
- An Axiomatic Approach to Model-Agnostic Concept Explanations [67.84000759813435]
We propose an approach to concept explanations that satisfy three natural axioms: linearity, recursivity, and similarity.
We then establish connections with previous concept explanation methods, offering insight into their varying semantic meanings.
arXiv Detail & Related papers (2024-01-12T20:53:35Z)
- Explanations of Black-Box Models based on Directional Feature Interactions [8.25114410474287]
We show how to explain black-box models by capturing feature interactions in a directed graph.
We show the superiority of our method over state-of-the-art baselines on IMDB10, Census, Divorce, Drug, and gene data.
arXiv Detail & Related papers (2023-04-16T02:00:25Z)
- Learning with Explanation Constraints [91.23736536228485]
We provide a learning-theoretic framework for analyzing how explanations can improve the learning of our models.
We demonstrate the benefits of our approach over a large array of synthetic and real-world experiments.
arXiv Detail & Related papers (2023-03-25T15:06:47Z)
- Explanations Based on Item Response Theory (eXirt): A Model-Specific Method to Explain Tree-Ensemble Model in Trust Perspective [0.4749981032986242]
Methods such as Ciu, Dalex, Eli5, Lofo, Shap, and Skater have emerged to explain black-box models.
eXirt can generate global explanations of tree-ensemble models, as well as local explanations of individual instances, through IRT.
arXiv Detail & Related papers (2022-10-18T15:30:14Z)
- MACE: An Efficient Model-Agnostic Framework for Counterfactual Explanation [132.77005365032468]
We propose a novel framework for Model-Agnostic Counterfactual Explanation (MACE).
MACE combines an RL-based method for finding good counterfactual examples with a gradient-less descent method for improving proximity.
Experiments on public datasets validate its effectiveness, yielding better validity, sparsity, and proximity.
arXiv Detail & Related papers (2022-05-31T04:57:06Z)
- Beyond Trivial Counterfactual Explanations with Diverse Valuable Explanations [64.85696493596821]
In computer vision applications, generative counterfactual methods indicate how to perturb a model's input to change its prediction.
We propose a counterfactual method that learns a perturbation in a disentangled latent space that is constrained using a diversity-enforcing loss.
Our model improves the success rate of producing high-quality valuable explanations when compared to previous state-of-the-art methods.
arXiv Detail & Related papers (2021-03-18T12:57:34Z)
- Benchmarking and Survey of Explanation Methods for Black Box Models [9.747543620322956]
We provide a categorization of explanation methods based on the type of explanation returned.
We present the most recent and widely used explainers, and we show a visual comparison among explanations and a quantitative benchmarking.
arXiv Detail & Related papers (2021-02-25T18:50:29Z)
- Evaluating the Disentanglement of Deep Generative Models through Manifold Topology [66.06153115971732]
We present a method for quantifying disentanglement that only uses the generative model.
We empirically evaluate several state-of-the-art models across multiple datasets.
arXiv Detail & Related papers (2020-06-05T20:54:11Z)
- Evaluations and Methods for Explanation through Robustness Analysis [117.7235152610957]
We establish a novel set of evaluation criteria for feature-based explanations through robustness analysis.
We obtain new explanations that are loosely necessary and sufficient for a prediction.
We extend the explanation to extract the set of features that would move the current prediction to a target class.
arXiv Detail & Related papers (2020-05-31T05:52:05Z)
- Metafeatures-based Rule-Extraction for Classifiers on Behavioral and Textual Data [0.0]
Rule-extraction techniques have been proposed to combine the desired predictive accuracy of complex "black-box" models with global explainability.
We develop and test a rule-extraction methodology based on higher-level, less-sparse metafeatures.
A key finding of our analysis is that metafeatures-based explanations are better at mimicking the behavior of the black-box prediction model.
arXiv Detail & Related papers (2020-03-10T15:08:41Z)