Information-theoretic Evolution of Model Agnostic Global Explanations
- URL: http://arxiv.org/abs/2105.06956v1
- Date: Fri, 14 May 2021 16:52:16 GMT
- Title: Information-theoretic Evolution of Model Agnostic Global Explanations
- Authors: Sukriti Verma, Nikaash Puri, Piyush Gupta, Balaji Krishnamurthy
- Abstract summary: We present a novel model-agnostic approach that derives rules to globally explain the behavior of classification models trained on numerical and/or categorical data.
Our approach has been deployed in a leading digital marketing suite of products.
- Score: 10.921146104622972
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Explaining the behavior of black box machine learning models through human
interpretable rules is an important research area. Recent work has focused on
explaining model behavior locally, i.e. for specific predictions, as well as
globally, across the fields of vision, natural language, reinforcement learning
and data science. We present a novel model-agnostic approach that derives rules
to globally explain the behavior of classification models trained on numerical
and/or categorical data. Our approach builds on top of existing local model
explanation methods to extract conditions important for explaining model
behavior for specific instances, followed by an evolutionary algorithm that
optimizes an information-theory-based fitness function to construct rules that
explain global model behavior. We show how our approach outperforms existing
approaches on a variety of datasets. Further, we introduce a parameter to
evaluate the quality of interpretation under the scenario of distributional
shift. This parameter evaluates how well the interpretation can predict model
behavior for previously unseen data distributions. We show how existing
approaches for interpreting models globally lack distributional robustness.
Finally, we show how the quality of the interpretation can be improved under
the scenario of distributional shift by adding out-of-distribution samples to
the dataset used to learn the interpretation, thereby increasing robustness.
All of the datasets used in our paper are open and publicly available. Our
approach has been deployed in a leading digital marketing suite of products.
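The abstract describes the pipeline only at a high level: local explanation methods surface conditions (feature/threshold pairs) that matter for individual predictions, and an evolutionary search then assembles those conditions into global rules scored by an information-theoretic fitness. A minimal sketch of that kind of loop follows; the rule encoding, the information-gain fitness, and the mutation operator are illustrative assumptions, not the authors' published algorithm, and `candidate_conditions` stands in for whatever conditions a local explainer (e.g. LIME- or SHAP-derived thresholds) would produce.

```python
# Illustrative sketch (not the authors' algorithm): evolve conjunctive rules that
# mimic a black-box classifier, scored by information gain over its predictions.
import random
from dataclasses import dataclass

import numpy as np


@dataclass(frozen=True)
class Condition:
    feature: int       # column index in X
    op: str            # "<=" or ">"
    threshold: float


def rule_fires(rule, X):
    """Boolean mask of rows satisfying every condition in the rule (a conjunction)."""
    mask = np.ones(len(X), dtype=bool)
    for c in rule:
        col = X[:, c.feature]
        mask &= (col <= c.threshold) if c.op == "<=" else (col > c.threshold)
    return mask


def entropy(labels):
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return float(-np.sum(p * np.log2(p)))


def info_gain_fitness(rule, X, y_model):
    """Information gain of the model's predicted labels given the rule's firing mask
    (one plausible 'information theory based' fitness, assumed for illustration)."""
    mask = rule_fires(rule, X)
    frac = mask.mean()
    if frac in (0.0, 1.0):
        return 0.0
    return entropy(y_model) - (frac * entropy(y_model[mask])
                               + (1.0 - frac) * entropy(y_model[~mask]))


def mutate(rule, candidate_conditions):
    """Randomly drop one condition or add a new one drawn from the local explanations."""
    rule = list(rule)
    if rule and random.random() < 0.5:
        rule.pop(random.randrange(len(rule)))
    else:
        rule.append(random.choice(candidate_conditions))
    return tuple(rule)


def evolve_rules(X, y_model, candidate_conditions,
                 pop_size=50, generations=100, keep=10):
    """Keep the fittest rules each generation and refill the population by mutation."""
    population = [(random.choice(candidate_conditions),) for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=lambda r: info_gain_fitness(r, X, y_model), reverse=True)
        elite = population[:keep]
        population = elite + [mutate(random.choice(elite), candidate_conditions)
                              for _ in range(pop_size - keep)]
    return population[:keep]
```

In practice the fitness would also trade off fidelity to the model's predictions against rule coverage and length; the sketch keeps only the information-gain term for brevity.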
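For the robustness evaluation, the abstract introduces a parameter measuring how well the interpretation predicts model behavior on previously unseen data distributions. A plausible reading is a fidelity score computed on an out-of-distribution split, sketched below under the assumption that the learned interpretation is a list of (rule, label) pairs; `rule` here is any callable returning a boolean mask, so the sketch is independent of the exact rule encoding.

```python
# Illustrative sketch (an assumption, not the paper's exact definition): fidelity of
# the learned rules to the model, evaluated on an out-of-distribution split.
import numpy as np


def rule_set_predict(rules, X, default_label):
    """`rules` is a list of (mask_fn, label) pairs; predict with the first rule that
    fires for each row, falling back to a default label (e.g. the majority class)."""
    preds = np.full(len(X), default_label)
    assigned = np.zeros(len(X), dtype=bool)
    for mask_fn, label in rules:
        mask = mask_fn(X) & ~assigned
        preds[mask] = label
        assigned |= mask
    return preds


def interpretation_fidelity(rules, X_ood, model, default_label):
    """Fraction of out-of-distribution samples on which the rules agree with the model;
    this plays the role of the robustness parameter discussed in the abstract."""
    y_model = model.predict(X_ood)
    y_rules = rule_set_predict(rules, X_ood, default_label)
    return float(np.mean(y_rules == y_model))
```

The mitigation described in the abstract, adding out-of-distribution samples to the data used to learn the rules, would then show up as an increase in this fidelity when measured before and after augmentation.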
Related papers
- The Distributional Hypothesis Does Not Fully Explain the Benefits of Masked Language Model Pretraining [27.144616560712493]
We investigate whether better sample efficiency and the better generalization capability of models pretrained with masked language modeling can be attributed to the semantic similarity encoded in the pretraining data's distributional property.
Our results illustrate our limited understanding of model pretraining and provide future research directions.
arXiv Detail & Related papers (2023-10-25T00:31:29Z)
- Explainability for Large Language Models: A Survey [59.67574757137078]
Large language models (LLMs) have demonstrated impressive capabilities in natural language processing.
This paper introduces a taxonomy of explainability techniques and provides a structured overview of methods for explaining Transformer-based language models.
arXiv Detail & Related papers (2023-09-02T22:14:26Z)
- Globally Interpretable Graph Learning via Distribution Matching [12.885580925389352]
We aim to answer an important question that is not yet well studied: how to provide a global interpretation for the graph learning procedure?
We formulate this problem as globally interpretable graph learning, which aims to distill the high-level, human-intelligible patterns that dominate the learning procedure.
We propose a novel model fidelity metric, tailored for evaluating the fidelity of the resulting model trained on interpretations.
arXiv Detail & Related papers (2023-06-18T00:50:36Z)
- Are Data-driven Explanations Robust against Out-of-distribution Data? [18.760475318852375]
We propose an end-to-end, model-agnostic learning framework called Distributionally Robust Explanations (DRE).
The key idea is to fully utilize inter-distribution information to provide supervisory signals for learning explanations without human annotation.
Our results demonstrate that the proposed method significantly improves the model's performance in terms of explanation and prediction robustness against distributional shifts.
arXiv Detail & Related papers (2023-03-29T02:02:08Z)
- An Additive Instance-Wise Approach to Multi-class Model Interpretation [53.87578024052922]
Interpretable machine learning offers insights into what factors drive a certain prediction of a black-box system.
Existing methods mainly focus on selecting explanatory input features, which follow either locally additive or instance-wise approaches.
This work exploits the strengths of both methods and proposes a global framework for learning local explanations simultaneously for multiple target classes.
arXiv Detail & Related papers (2022-07-07T06:50:27Z)
- Partial Order in Chaos: Consensus on Feature Attributions in the Rashomon Set [50.67431815647126]
Post-hoc global/local feature attribution methods are being progressively employed to understand machine learning models.
We show that partial orders of local/global feature importance arise from this methodology.
We show that every relation among features present in these partial orders also holds in the rankings provided by existing approaches.
arXiv Detail & Related papers (2021-10-26T02:53:14Z)
- Towards Open-World Feature Extrapolation: An Inductive Graph Learning Approach [80.8446673089281]
We propose a new learning paradigm with graph representation and learning.
Our framework contains two modules: 1) a backbone network (e.g., feedforward neural nets) as a lower model takes features as input and outputs predicted labels; 2) a graph neural network as an upper model learns to extrapolate embeddings for new features via message passing over a feature-data graph built from observed data.
arXiv Detail & Related papers (2021-10-09T09:02:45Z)
- A Topological-Framework to Improve Analysis of Machine Learning Model Performance [5.3893373617126565]
We propose a framework for evaluating machine learning models in which a dataset is treated as a "space" on which a model operates.
We describe a topological data structure, the presheaf, which offers a convenient way to store and analyze model performance across different subpopulations.
arXiv Detail & Related papers (2021-07-09T23:11:13Z)
- An Information-theoretic Approach to Distribution Shifts [9.475039534437332]
Safely deploying machine learning models to the real world is often a challenging process.
Models trained with data obtained from a specific geographic location tend to fail when queried with data obtained elsewhere.
Similarly, neural networks that are fit to a subset of the population might carry selection bias into their decision process.
arXiv Detail & Related papers (2021-06-07T16:44:21Z)
- Beyond Trivial Counterfactual Explanations with Diverse Valuable Explanations [64.85696493596821]
In computer vision applications, generative counterfactual methods indicate how to perturb a model's input to change its prediction.
We propose a counterfactual method that learns a perturbation in a disentangled latent space that is constrained using a diversity-enforcing loss.
Our model improves the success rate of producing high-quality valuable explanations when compared to previous state-of-the-art methods.
arXiv Detail & Related papers (2021-03-18T12:57:34Z)
- Deducing neighborhoods of classes from a fitted model [68.8204255655161]
In this article, a new kind of interpretable machine learning method is presented.
It can help to understand the partitioning of the feature space into predicted classes in a classification model using quantile shifts.
Basically, real data points (or specific points of interest) are used, and the changes in the prediction after slightly raising or decreasing specific features are observed.
arXiv Detail & Related papers (2020-09-11T16:35:53Z)
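The last entry above ("Deducing neighborhoods of classes from a fitted model") probes a fitted classifier by slightly raising or lowering individual features of real data points and observing whether the prediction changes. That paper frames this via quantile shifts; the sketch below is only a generic perturbation probe under that reading, with the per-feature step sizes `deltas` as an assumed parameter.

```python
# Illustrative sketch (an assumption, not the cited paper's exact quantile-shift
# procedure): nudge single features of a real data point and watch the prediction.
import numpy as np


def probe_feature(model, x, feature, delta):
    """Predictions at x, at x with the feature raised by delta, and lowered by delta."""
    x = np.asarray(x, dtype=float)
    up, down = x.copy(), x.copy()
    up[feature] += delta
    down[feature] -= delta
    base, raised, lowered = model.predict(np.stack([x, up, down]))
    return base, raised, lowered


def class_boundary_features(model, x, deltas):
    """Indices of features whose small perturbation flips the predicted class,
    i.e. directions in which x lies close to a decision boundary."""
    flips = []
    for feature, delta in enumerate(deltas):
        base, raised, lowered = probe_feature(model, x, feature, delta)
        if raised != base or lowered != base:
            flips.append(feature)
    return flips
```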