Detection Accuracy for Evaluating Compositional Explanations of Units
- URL: http://arxiv.org/abs/2109.07804v1
- Date: Thu, 16 Sep 2021 08:47:34 GMT
- Title: Detection Accuracy for Evaluating Compositional Explanations of Units
- Authors: Sayo M. Makinwa, Biagio La Rosa and Roberto Capobianco
- Abstract summary: Two examples of methods that explain models through human-understandable concepts are Network Dissection and Compositional explanations.
While logical forms are intuitively more informative than atomic concepts, it is not clear how to quantify this improvement.
We propose Detection Accuracy as an evaluation metric; it measures how consistently units detect their assigned explanations.
- Score: 5.220940151628734
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The recent success of deep learning models in solving complex problems and in
different domains has increased interest in understanding what they learn.
Therefore, different approaches have been employed to explain these models, one
of which uses human-understandable concepts as explanations. Two examples of
methods that use this approach are Network Dissection and Compositional
explanations. The former explains units using atomic concepts, while the latter
makes explanations more expressive, replacing atomic concepts with logical
forms. While logical forms are intuitively more informative than atomic
concepts, it is not clear how to quantify this improvement, and their
evaluation is often based on the same metric that is optimized during the
search process and on hyper-parameters that must be tuned. In this paper,
we propose Detection Accuracy as an evaluation metric; it measures how
consistently units detect their assigned explanations. We show that
this metric (1) evaluates explanations of different lengths effectively, (2)
can be used as a stopping criterion for the compositional explanation search,
eliminating the explanation length hyper-parameter, and (3) exposes new
specialized units whose length 1 explanations are the perceptual abstractions
of their longer explanations.
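Since the abstract only names the metric, the sketch below illustrates one plausible reading of it: a per-unit detection-consistency score computed from binarized activation masks and concept masks, together with the stopping rule over explanation lengths mentioned in point (2). The mask format, the `hit_iou` threshold, and the helpers `candidate_formulas` and `masks_for` are assumptions made for illustration, not the paper's exact formulation.

```python
import numpy as np

def iou(a: np.ndarray, b: np.ndarray) -> float:
    """Intersection-over-union of two boolean masks."""
    union = np.logical_or(a, b).sum()
    return float(np.logical_and(a, b).sum() / union) if union > 0 else 0.0

def detection_accuracy(unit_masks, concept_masks, hit_iou=0.04):
    """Hypothetical detection-consistency score: the fraction of samples that
    contain the explanation's concept and on which the unit's binarized
    activation mask overlaps that concept by at least `hit_iou`.
    Illustrative only; the paper gives the exact definition."""
    hits = total = 0
    for unit_mask, concept_mask in zip(unit_masks, concept_masks):
        if not concept_mask.any():      # concept absent from this sample
            continue
        total += 1
        if iou(unit_mask, concept_mask) >= hit_iou:
            hits += 1
    return hits / total if total else 0.0

def explain_with_stopping(candidate_formulas, unit_masks, masks_for):
    """Grow the logical-form explanation one length at a time and stop as soon
    as detection accuracy no longer improves, so no explanation-length
    hyper-parameter has to be fixed in advance. `candidate_formulas[k]` is
    assumed to be the best formula of length k+1 found by an external search,
    and `masks_for(formula)` its per-sample concept masks (both hypothetical)."""
    best_formula, best_acc = None, -1.0
    for formula in candidate_formulas:
        acc = detection_accuracy(unit_masks, masks_for(formula))
        if acc <= best_acc:             # no improvement: stop growing
            break
        best_formula, best_acc = formula, acc
    return best_formula, best_acc
```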
Related papers
- Selective Explanations [14.312717332216073]
Amortized explainers train a machine learning model to predict feature attribution scores in a single inference pass.
Despite their efficiency, amortized explainers can produce inaccurate predictions and misleading explanations.
We propose selective explanations, a novel feature attribution method that detects when amortized explainers generate low-quality explanations.
arXiv Detail & Related papers (2024-05-29T23:08:31Z)
- Explainability for Machine Learning Models: From Data Adaptability to User Perception [0.8702432681310401]
This thesis explores the generation of local explanations for already deployed machine learning models.
It aims to identify optimal conditions for producing meaningful explanations considering both data and user requirements.
arXiv Detail & Related papers (2024-02-16T18:44:37Z)
- Evaluating the Utility of Model Explanations for Model Development [54.23538543168767]
We evaluate whether explanations can improve human decision-making in practical scenarios of machine learning model development.
To our surprise, we did not find evidence of significant improvement on tasks when users were provided with any of the saliency maps.
These findings suggest caution about the usefulness of saliency-based explanations and their potential to be misunderstood.
arXiv Detail & Related papers (2023-12-10T23:13:23Z)
- Evaluating the Robustness of Interpretability Methods through Explanation Invariance and Equivariance [72.50214227616728]
Interpretability methods are valuable only if their explanations faithfully describe the explained model.
We consider neural networks whose predictions are invariant under a specific symmetry group.
arXiv Detail & Related papers (2023-04-13T17:59:03Z)
- Explanation Selection Using Unlabeled Data for Chain-of-Thought Prompting [80.9896041501715]
Explanations that have not been "tuned" for a task, such as off-the-shelf explanations written by nonexperts, may lead to mediocre performance.
This paper tackles the problem of how to optimize explanation-infused prompts in a blackbox fashion.
arXiv Detail & Related papers (2023-02-09T18:02:34Z)
- The Solvability of Interpretability Evaluation Metrics [7.3709604810699085]
Feature attribution methods are often evaluated on metrics such as comprehensiveness and sufficiency.
In this paper, we highlight an intriguing property of these metrics: their solvability, i.e., the explanation that scores best on them can be found directly, for example with beam search.
We present a series of investigations showing that this beam search explainer is generally comparable or favorable to current choices.
arXiv Detail & Related papers (2022-05-18T02:52:03Z)
- ExSum: From Local Explanations to Model Understanding [6.23934576145261]
Interpretability methods are developed to understand the working mechanisms of black-box models.
Fulfilling this goal requires both that the explanations generated by these methods are correct and that people can easily and reliably understand them.
We introduce explanation summary (ExSum), a mathematical framework for quantifying model understanding.
arXiv Detail & Related papers (2022-04-30T02:07:20Z)
- Contrastive Explanations for Model Interpretability [77.92370750072831]
We propose a methodology to produce contrastive explanations for classification models.
Our method is based on projecting model representation to a latent space.
Our findings shed light on the ability of label-contrastive explanations to provide a more accurate and finer-grained interpretability of a model's decision.
arXiv Detail & Related papers (2021-03-02T00:36:45Z)
- Evaluating Explanations: How much do explanations from the teacher aid students? [103.05037537415811]
We formalize the value of explanations using a student-teacher paradigm that measures the extent to which explanations improve student models in learning.
Unlike many prior proposals to evaluate explanations, our approach cannot be easily gamed, enabling principled, scalable, and automatic evaluation of attributions.
arXiv Detail & Related papers (2020-12-01T23:40:21Z)
- The Struggles of Feature-Based Explanations: Shapley Values vs. Minimal Sufficient Subsets [61.66584140190247]
We show that feature-based explanations pose problems even for explaining trivial models.
We show that two popular classes of explainers, Shapley explainers and minimal sufficient subsets explainers, target fundamentally different types of ground-truth explanations.
arXiv Detail & Related papers (2020-09-23T09:45:23Z)
This list is automatically generated from the titles and abstracts of the papers on this site.