REVEL Framework to measure Local Linear Explanations for black-box
models: Deep Learning Image Classification case of study
- URL: http://arxiv.org/abs/2211.06154v1
- Date: Fri, 11 Nov 2022 12:15:36 GMT
- Title: REVEL Framework to measure Local Linear Explanations for black-box
models: Deep Learning Image Classification case of study
- Authors: Iván Sevillano-García, Julián Luengo-Martín and Francisco Herrera
- Abstract summary: We propose a procedure called REVEL to evaluate different aspects concerning the quality of explanations with a theoretically coherent development.
The experiments have been carried out on four image datasets as benchmarks, where we show REVEL's descriptive and analytical power.
- Score: 12.49538398746092
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Explainable artificial intelligence is proposed to provide explanations for
reasoning performed by an Artificial Intelligence. There is no consensus on how
to evaluate the quality of these explanations, since even the definition of
explanation itself is not clear in the literature. In particular, for the
widely known Local Linear Explanations, there are qualitative proposals for the
evaluation of explanations, although they suffer from theoretical
inconsistencies. The case of images is even more problematic, where a visual
explanation may seem to explain a decision while what it really does is detect
edges. The literature contains a large number of metrics specialized in
quantitatively measuring different qualitative aspects, so we should be able to
develop metrics that measure the desirable aspects of explanations in a robust
and correct way. In this paper, we propose a procedure called REVEL
to evaluate different aspects concerning the quality of explanations with a
theoretically coherent development. This procedure advances the state of the
art in several ways: it standardizes the concept of explanation and develops a
series of metrics that not only allow explanations to be compared with one
another but also provide absolute information about the explanation itself. The
experiments have been carried out on four image datasets as benchmarks, where
we show REVEL's descriptive and analytical power.
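For concreteness, the sketch below illustrates the kind of object such metrics operate on: a LIME-style local linear surrogate fitted around one instance of a black-box model, together with a simple local-fidelity score. The black_box function, the Gaussian perturbation scheme, and the kernel width are illustrative assumptions; this is not REVEL's implementation nor its actual metric suite.

```python
# Illustrative sketch (not REVEL's implementation): build a local linear
# explanation for one instance of a black-box classifier and measure how
# faithfully the linear surrogate reproduces the black box locally.
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(0)

def black_box(X):
    """Stand-in for any black-box model: returns class-1 probabilities."""
    w = np.array([2.0, -1.0, 0.5, 0.0])  # hidden 'true' weights
    return 1.0 / (1.0 + np.exp(-X @ w))

def local_linear_explanation(x, n_samples=500, sigma=0.3):
    """Fit a proximity-weighted linear surrogate around instance x (LIME-style)."""
    Z = x + rng.normal(scale=sigma, size=(n_samples, x.shape[0]))
    y = black_box(Z)
    weights = np.exp(-np.linalg.norm(Z - x, axis=1) ** 2 / (2 * sigma ** 2))
    return Ridge(alpha=1.0).fit(Z, y, sample_weight=weights)

def local_fidelity(x, surrogate, n_samples=500, sigma=0.3):
    """Simple fidelity score: 1 - weighted MSE between surrogate and black box."""
    Z = x + rng.normal(scale=sigma, size=(n_samples, x.shape[0]))
    weights = np.exp(-np.linalg.norm(Z - x, axis=1) ** 2 / (2 * sigma ** 2))
    err = np.average((surrogate.predict(Z) - black_box(Z)) ** 2, weights=weights)
    return 1.0 - err

x = np.array([0.5, -0.2, 1.0, 0.3])
expl = local_linear_explanation(x)
print("feature attributions:", expl.coef_)
print("local fidelity:", local_fidelity(x, expl))
```

Here the attribution vector is simply the surrogate's coefficient vector; frameworks such as REVEL evaluate explanations of this form along several complementary axes rather than through a single fidelity score.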
Related papers
- Evaluating the Utility of Model Explanations for Model Development [54.23538543168767]
We evaluate whether explanations can improve human decision-making in practical scenarios of machine learning model development.
To our surprise, we did not find evidence of significant improvement on tasks when users were provided with any of the saliency maps.
These findings suggest caution about the usefulness of saliency-based explanations and their potential for misunderstanding.
arXiv Detail & Related papers (2023-12-10T23:13:23Z) - XAI Benchmark for Visual Explanation [15.687509357300847]
We develop a benchmark for visual explanation, consisting of eight datasets with human explanation annotations.
We devise a visual explanation pipeline that includes data loading, explanation generation, and method evaluation.
Our proposed benchmarks facilitate a fair evaluation and comparison of visual explanation methods.
arXiv Detail & Related papers (2023-10-12T17:26:16Z) - Measuring Information in Text Explanations [23.929076318334047]
We argue that placing the explanations on an information-theoretic framework could unify the evaluations of two popular text explanation methods.
We quantify the information flow through these channels, thereby facilitating the assessment of explanation characteristics.
Our work contributes to the ongoing efforts in establishing rigorous and standardized evaluation criteria in the rapidly evolving field of explainable AI.
arXiv Detail & Related papers (2023-10-06T19:46:51Z) - Explanation Selection Using Unlabeled Data for Chain-of-Thought
Prompting [80.9896041501715]
Explanations that have not been "tuned" for a task, such as off-the-shelf explanations written by nonexperts, may lead to mediocre performance.
This paper tackles the problem of how to optimize explanation-infused prompts in a blackbox fashion.
arXiv Detail & Related papers (2023-02-09T18:02:34Z) - Human Interpretation of Saliency-based Explanation Over Text [65.29015910991261]
We study saliency-based explanations over textual data.
We find that people often misinterpret the explanations.
We propose a method to adjust saliencies based on model estimates of over- and under-perception.
arXiv Detail & Related papers (2022-01-27T15:20:32Z) - Detection Accuracy for Evaluating Compositional Explanations of Units [5.220940151628734]
Two examples of methods that use this approach are Network Dissection and Compositional explanations.
While intuitively, logical forms are more informative than atomic concepts, it is not clear how to quantify this improvement.
We propose the Detection Accuracy as an evaluation metric, which measures how consistently units detect their assigned explanations.
arXiv Detail & Related papers (2021-09-16T08:47:34Z) - To trust or not to trust an explanation: using LEAF to evaluate local
linear XAI methods [0.0]
There is no consensus on how to quantitatively evaluate explanations in practice.
Explanations are typically used only to inspect black-box models, and the proactive use of explanations as decision support is generally overlooked.
Among the many approaches to XAI, a widely adopted paradigm is Local Linear Explanations - with LIME and SHAP emerging as state-of-the-art methods.
We show that these methods are plagued by many defects including unstable explanations, divergence of actual implementations from the promised theoretical properties, and explanations for the wrong label.
This highlights the need for standard and unbiased evaluation procedures for Local Linear Explanations.
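As a minimal illustration of the instability defect mentioned above (a toy sketch, not LEAF's actual procedure), one can fit the same local linear surrogate twice with independent perturbation samples and compare the attribution vectors; the stand-in black_box model and kernel choices below are assumptions made only to keep the example runnable.

```python
# Illustrative stability probe (not LEAF itself): fit the same local linear
# surrogate twice with independent perturbation samples and compare the
# resulting attribution vectors; low similarity signals an unstable explanation.
import numpy as np
from sklearn.linear_model import Ridge

def black_box(X):
    # Stand-in for any classifier: class-1 probabilities of a fixed logistic model.
    return 1.0 / (1.0 + np.exp(-X @ np.array([2.0, -1.0, 0.5, 0.0])))

def explain(x, seed, n=500, sigma=0.3):
    # LIME-style surrogate: perturb, query the black box, fit a weighted Ridge model.
    rng = np.random.default_rng(seed)
    Z = x + rng.normal(scale=sigma, size=(n, x.size))
    w = np.exp(-np.linalg.norm(Z - x, axis=1) ** 2 / (2 * sigma ** 2))
    return Ridge(alpha=1.0).fit(Z, black_box(Z), sample_weight=w).coef_

x = np.array([0.5, -0.2, 1.0, 0.3])
a, b = explain(x, seed=1), explain(x, seed=2)
cosine = a @ b / (np.linalg.norm(a) * np.linalg.norm(b))
print("attribution cosine similarity across reruns:", cosine)
```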
arXiv Detail & Related papers (2021-06-01T13:14:12Z) - Contrastive Explanations for Model Interpretability [77.92370750072831]
We propose a methodology to produce contrastive explanations for classification models.
Our method is based on projecting model representation to a latent space.
Our findings shed light on the ability of label-contrastive explanations to provide a more accurate and finer-grained interpretability of a model's decision.
arXiv Detail & Related papers (2021-03-02T00:36:45Z) - Benchmarking and Survey of Explanation Methods for Black Box Models [9.747543620322956]
We provide a categorization of explanation methods based on the type of explanation returned.
We present the most recent and widely used explainers, and we show a visual comparison among explanations and a quantitative benchmarking.
arXiv Detail & Related papers (2021-02-25T18:50:29Z) - Evaluating Explanations: How much do explanations from the teacher aid
students? [103.05037537415811]
We formalize the value of explanations using a student-teacher paradigm that measures the extent to which explanations improve student models in learning.
Unlike many prior proposals to evaluate explanations, our approach cannot be easily gamed, enabling principled, scalable, and automatic evaluation of attributions.
arXiv Detail & Related papers (2020-12-01T23:40:21Z) - The Struggles of Feature-Based Explanations: Shapley Values vs. Minimal
Sufficient Subsets [61.66584140190247]
We show that feature-based explanations pose problems even for explaining trivial models.
We show that two popular classes of explainers, Shapley explainers and minimal sufficient subsets explainers, target fundamentally different types of ground-truth explanations.
arXiv Detail & Related papers (2020-09-23T09:45:23Z)
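To make this distinction concrete, the toy sketch below (an illustrative assumption, not the paper's experimental setup) computes exact Shapley values by enumerating coalitions for a three-feature AND-style model and contrasts them with a brute-force minimal sufficient subset.

```python
# Toy contrast between Shapley values and minimal sufficient subsets.
from itertools import combinations
from math import factorial

FEATURES = [0, 1, 2]
BASELINE = [0.0, 0.0, 0.0]   # feature values used when a feature is "absent"
X = [1.0, 1.0, 1.0]          # the instance being explained

def model(z):
    # Toy model: predicts 1 iff features 0 AND 1 are present; feature 2 is irrelevant.
    return 1.0 if z[0] > 0.5 and z[1] > 0.5 else 0.0

def value(subset):
    # Coalition value: features in `subset` keep their real value, the rest the baseline.
    z = [X[i] if i in subset else BASELINE[i] for i in FEATURES]
    return model(z)

def shapley(i):
    # Exact Shapley value of feature i via enumeration of all coalitions.
    n, total = len(FEATURES), 0.0
    others = [j for j in FEATURES if j != i]
    for k in range(n):
        for S in combinations(others, k):
            weight = factorial(len(S)) * factorial(n - len(S) - 1) / factorial(n)
            total += weight * (value(set(S) | {i}) - value(set(S)))
    return total

def minimal_sufficient_subsets():
    # Smallest feature sets that alone keep the prediction at 1.
    for k in range(1, len(FEATURES) + 1):
        hits = [S for S in combinations(FEATURES, k) if value(set(S)) == 1.0]
        if hits:
            return hits
    return []

print("Shapley values:", [shapley(i) for i in FEATURES])            # [0.5, 0.5, 0.0]
print("Minimal sufficient subsets:", minimal_sufficient_subsets())  # [(0, 1)]
```

The Shapley attribution splits credit between the two interacting features, while the minimal-sufficient-subset view returns the single coalition (0, 1): two different notions of ground-truth explanation, which is exactly the tension the paper highlights.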
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences.