Best of both worlds: local and global explanations with
human-understandable concepts
- URL: http://arxiv.org/abs/2106.08641v1
- Date: Wed, 16 Jun 2021 09:05:25 GMT
- Title: Best of both worlds: local and global explanations with
human-understandable concepts
- Authors: Jessica Schrouff, Sebastien Baur, Shaobo Hou, Diana Mincu, Eric
Loreaux, Ralph Blanes, James Wexler, Alan Karthikesalingam, Been Kim
- Abstract summary: Interpretability techniques aim to provide the rationale behind a model's decision, typically by explaining either an individual prediction or a class of predictions.
We show that our method improves global explanations over TCAV when compared to ground truth, and provides useful insights.
- Score: 10.155485106226754
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Interpretability techniques aim to provide the rationale behind a model's
decision, typically by explaining either an individual prediction (local
explanation, e.g. "why is this patient diagnosed with this condition") or a
class of predictions (global explanation, e.g. "why are patients diagnosed with
this condition in general"). While there are many methods focused on either
one, few frameworks can provide both local and global explanations in a
consistent manner. In this work, we combine two powerful existing techniques,
one local (Integrated Gradients, IG) and one global (Testing with Concept
Activation Vectors, TCAV), to provide local and global concept-based explanations.
We first validate our idea using two synthetic datasets with a known ground
truth, and further demonstrate with a benchmark natural image dataset. We test
our method with various concepts, target classes, model architectures and IG
baselines. We show that our method improves global explanations over TCAV when
compared to ground truth, and provides useful insights. We hope our work
provides a step towards building bridges between many existing local and global
methods to get the best of both worlds.
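As a rough illustration of the combination described in the abstract, the sketch below fits a concept activation vector (CAV) on the activations of an internal layer, computes Integrated Gradients over those activations, and projects the attributions onto the concept direction to obtain a per-example concept score. This is a minimal, hypothetical reading under assumed names (`bottom`, `head`, the zero baseline), not the authors' implementation:

```python
# Minimal sketch (not the authors' code): combining Integrated Gradients with a
# Concept Activation Vector to get a local, concept-based attribution score.
# The model, the layer split, and all names below are illustrative assumptions.
import torch
import torch.nn as nn
import numpy as np
from sklearn.linear_model import LogisticRegression

torch.manual_seed(0)

# Stand-in network, split into a "bottom" (up to the chosen layer) and a "head".
bottom = nn.Sequential(nn.Linear(16, 32), nn.ReLU())
head = nn.Sequential(nn.Linear(32, 2))          # 2-class logits

# 1) CAV: linear classifier separating concept vs. random examples in activation space.
concept_x, random_x = torch.randn(64, 16), torch.randn(64, 16)
acts = torch.cat([bottom(concept_x), bottom(random_x)]).detach().numpy()
labels = np.array([1] * 64 + [0] * 64)
cav = LogisticRegression(max_iter=1000).fit(acts, labels).coef_[0]
cav = torch.tensor(cav, dtype=torch.float32)
cav = cav / cav.norm()                          # unit-norm concept direction

# 2) Integrated Gradients of the target logit w.r.t. the layer activations.
def integrated_gradients(act, baseline, target, steps=50):
    total = torch.zeros_like(act)
    for alpha in torch.linspace(0.0, 1.0, steps):
        point = (baseline + alpha * (act - baseline)).requires_grad_(True)
        head(point)[target].backward()
        total += point.grad
    return (act - baseline) * total / steps     # Riemann approximation of the path integral

x = torch.randn(16)
act = bottom(x).detach()
baseline = torch.zeros_like(act)                # the baseline is one of the knobs the paper varies
ig = integrated_gradients(act, baseline, target=1)

# 3) Local concept importance: project the IG attributions onto the CAV direction.
local_score = torch.dot(ig, cav).item()
print(f"concept attribution for this example: {local_score:+.4f}")
# Aggregating such scores over many examples of a class yields a global, TCAV-style summary.
```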
Related papers
- GLEAMS: Bridging the Gap Between Local and Global Explanations [6.329021279685856]
We propose GLEAMS, a novel method that partitions the input space and learns an interpretable model within each sub-region.
We demonstrate GLEAMS' effectiveness on both synthetic and real-world data, highlighting its desirable properties and human-understandable insights.
arXiv Detail & Related papers (2024-08-09T13:30:37Z)
- Global Counterfactual Directions [0.0]
We show that the latent space of Diffusion Autoencoders encodes the inference process of a given classifier in the form of global directions.
We propose a novel proxy-based approach that discovers two types of these directions using only a single image, in an entirely black-box manner.
We show that GCDs can be naturally combined with Latent Integrated Gradients resulting in a new black-box attribution method.
arXiv Detail & Related papers (2024-04-18T20:03:56Z)
- GLOBE-CE: A Translation-Based Approach for Global Counterfactual Explanations [10.276136171459731]
Global & Efficient Counterfactual Explanations (GLOBE-CE) is a flexible framework that tackles the reliability and scalability issues associated with current state-of-the-art approaches.
We provide a unique mathematical analysis of categorical feature translations, utilising it in our method.
Experimental evaluation with publicly available datasets and user studies demonstrate that GLOBE-CE performs significantly better than the current state-of-the-art.
arXiv Detail & Related papers (2023-05-26T15:26:59Z)
- Coalescing Global and Local Information for Procedural Text Understanding [70.10291759879887]
A complete procedural understanding solution should combine three core aspects: local and global views of the inputs, and a global view of the outputs.
In this paper, we propose Coalescing Global and Local Information (CGLI), a new model that builds entity and time representations.
Experiments on a popular procedural text understanding dataset show that our model achieves state-of-the-art results.
arXiv Detail & Related papers (2022-08-26T19:16:32Z)
- Distillation with Contrast is All You Need for Self-Supervised Point Cloud Representation Learning [53.90317574898643]
We propose a simple and general framework for self-supervised point cloud representation learning.
Inspired by how human beings understand the world, we utilize knowledge distillation to learn both global shape information and the relationship between global shape and local structures.
Our method achieves the state-of-the-art performance on linear classification and multiple other downstream tasks.
arXiv Detail & Related papers (2022-02-09T02:51:59Z)
- Towards Interpretable Natural Language Understanding with Explanations as Latent Variables [146.83882632854485]
We develop a framework for interpretable natural language understanding that requires only a small set of human annotated explanations for training.
Our framework treats natural language explanations as latent variables that model the underlying reasoning process of a neural model.
arXiv Detail & Related papers (2020-10-24T02:05:56Z)
- Leakage-Adjusted Simulatability: Can Models Generate Non-Trivial Explanations of Their Behavior in Natural Language? [86.60613602337246]
We introduce a leakage-adjusted simulatability (LAS) metric for evaluating NL explanations.
LAS measures how well explanations help an observer predict a model's output, while controlling for how explanations can directly leak the output.
We frame explanation generation as a multi-agent game and optimize explanations for simulatability while penalizing label leakage.
arXiv Detail & Related papers (2020-10-08T16:59:07Z)
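The LAS entry above can be made concrete with a small, schematic score: compare how much explanations improve a simulator's accuracy at predicting the model's output, macro-averaged over examples whose explanation does or does not leak that output on its own. The grouping and estimator below are a plausible reading of the summary, not the paper's exact definition:

```python
# Schematic of a leakage-adjusted-simulatability-style score, based only on the summary
# above; the grouping and estimator are assumptions, not the paper's exact definition.
import numpy as np

def las_score(sim_xe, sim_x, sim_e, model_out):
    """sim_xe, sim_x, sim_e: a simulator's predictions given (input, explanation),
    input only, and explanation only; model_out: the task model's own outputs."""
    sim_xe, sim_x, sim_e, model_out = map(np.asarray, (sim_xe, sim_x, sim_e, model_out))
    leaked = sim_e == model_out              # explanation alone gives the output away
    gains = []
    for group in (leaked, ~leaked):          # macro-average over leaking / non-leaking examples
        if group.any():
            with_e = np.mean(sim_xe[group] == model_out[group])
            without_e = np.mean(sim_x[group] == model_out[group])
            gains.append(with_e - without_e)
    return float(np.mean(gains))             # > 0: explanations help beyond what they leak

# Toy usage with made-up predictions:
print(las_score(sim_xe=[1, 0, 1, 1], sim_x=[1, 0, 0, 0], sim_e=[1, 0, 1, 0], model_out=[1, 0, 1, 1]))
```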
- Global-Local Bidirectional Reasoning for Unsupervised Representation Learning of 3D Point Clouds [109.0016923028653]
We learn point cloud representation by bidirectional reasoning between the local structures and the global shape without human supervision.
We show that our unsupervised model surpasses the state-of-the-art supervised methods on both synthetic and real-world 3D object classification datasets.
arXiv Detail & Related papers (2020-03-29T08:26:08Z)
- Optimal Local Explainer Aggregation for Interpretable Prediction [12.934180951771596]
A key challenge for decision makers incorporating black-box machine-learned models is understanding the predictions these models provide.
One proposed method is training surrogate explainer models which approximate the more complex model.
We propose a novel local explainer algorithm based on information parameters.
arXiv Detail & Related papers (2020-03-20T19:02:11Z)
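The surrogate-explainer idea mentioned in the entry above (approximate a complex model with a simpler, interpretable one) can be illustrated generically; the paper's own algorithm, which aggregates local explainers based on information parameters, is not sketched here. All model choices below are illustrative assumptions:

```python
# Generic illustration of a surrogate explainer: fit a simple, interpretable model to a
# black box's predictions and check fidelity. Not the paper's algorithm.
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.tree import DecisionTreeClassifier, export_text

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 4))
y = (X[:, 0] + 0.5 * X[:, 1] ** 2 > 0).astype(int)

black_box = GradientBoostingClassifier().fit(X, y)                 # stand-in complex model
surrogate = DecisionTreeClassifier(max_depth=3).fit(X, black_box.predict(X))

fidelity = (surrogate.predict(X) == black_box.predict(X)).mean()   # agreement with the black box
print(f"surrogate fidelity to the black box: {fidelity:.2%}")
print(export_text(surrogate, feature_names=[f"x{i}" for i in range(4)]))
```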
- Explainable Deep Classification Models for Domain Generalization [94.43131722655617]
Explanations are defined as regions of visual evidence upon which a deep classification network makes a decision.
Our training strategy enforces a periodic saliency-based feedback to encourage the model to focus on the image regions that directly correspond to the ground-truth object.
arXiv Detail & Related papers (2020-03-13T22:22:15Z)
- Model Agnostic Multilevel Explanations [31.831973884850147]
We propose a meta-method that, given a typical local explainability method, can build a multilevel explanation tree.
The leaves of this tree correspond to the local explanations, the root corresponds to the global explanation, and intermediate levels correspond to explanations for groups of data points.
We argue that such a multilevel structure can also be an effective form of communication, where one could obtain a few explanations that characterize the entire dataset.
arXiv Detail & Related papers (2020-03-12T20:18:00Z)
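The multilevel structure described in the entry above can be pictured with a short sketch: cluster the data points hierarchically and attach to each node an aggregate of its members' local explanations, so that leaves hold local explanations and the root a global one. The clustering and the mean aggregation below are illustrative assumptions, not the paper's meta-method:

```python
# Illustration only: hierarchical clustering with mean-aggregated local explanations.
# The paper's meta-method is more involved; names and choices here are assumptions.
import numpy as np
from scipy.cluster.hierarchy import linkage, to_tree

rng = np.random.default_rng(0)
local_expl = rng.normal(size=(20, 5))     # stand-in: one local attribution vector per data point

def aggregate(node):
    """Attach to every node the mean local explanation of the data points beneath it."""
    if node.is_leaf():
        node.aggregated = local_expl[node.id]
    else:
        node.aggregated = (aggregate(node.left) * node.left.get_count()
                           + aggregate(node.right) * node.right.get_count()) / node.get_count()
    return node.aggregated

root = to_tree(linkage(local_expl, method="ward"))
global_expl = aggregate(root)             # root = global explanation; leaves = local explanations
print(global_expl.round(3))               # intermediate nodes explain groups of data points
```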
This list is automatically generated from the titles and abstracts of the papers on this site.