Classification Metrics for Image Explanations: Towards Building Reliable XAI-Evaluations
- URL: http://arxiv.org/abs/2406.05068v1
- Date: Fri, 7 Jun 2024 16:37:50 GMT
- Title: Classification Metrics for Image Explanations: Towards Building Reliable XAI-Evaluations
- Authors: Benjamin Fresz, Lena Lörcher, Marco Huber
- Abstract summary: Saliency methods provide (super-)pixelwise feature attribution scores for input images.
New evaluation metrics for saliency methods are developed and common saliency methods are benchmarked on ImageNet.
A scheme for reliability evaluation of such metrics is proposed that is based on concepts from psychometric testing.
- Score: 0.24578723416255752
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Decision processes of computer vision models, especially deep neural networks, are opaque in nature, meaning that these decisions cannot be understood by humans. Thus, over recent years, many methods to provide human-understandable explanations have been proposed. For image classification, the most common group are saliency methods, which provide (super-)pixelwise feature attribution scores for input images. But their evaluation still poses a problem, as their results cannot simply be compared to the unknown ground truth. To overcome this, a slew of different proxy metrics have been defined, which are, like the explainability methods themselves, often built on intuition and are thus possibly unreliable. In this paper, new evaluation metrics for saliency methods are developed and common saliency methods are benchmarked on ImageNet. In addition, a scheme for reliability evaluation of such metrics is proposed that is based on concepts from psychometric testing. The code used can be found at https://github.com/lelo204/ClassificationMetricsForImageExplanations .
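To make the two ideas in the abstract concrete, here is a minimal sketch of what such a proxy metric and such a reliability check can look like. It does not reproduce the metrics developed in the paper: a deletion-style faithfulness score and a split-half consistency check are shown purely for illustration, and names such as `deletion_score`, `model` (assumed to be a callable returning the target-class confidence for a NumPy image), `saliency` (assumed to be a 2-D attribution map), and `baseline` are hypothetical.

```python
# Minimal sketch, assuming a classifier exposed as a plain callable.
# NOT the paper's metrics; names and signatures are illustrative assumptions.
import numpy as np
from scipy.stats import spearmanr


def deletion_score(model, image, saliency, steps=20, baseline=0.0):
    """Deletion-style faithfulness proxy: blank out the most salient pixels
    first and track how quickly the model's target-class confidence drops.
    A lower (normalized) area under that curve suggests a more faithful map."""
    h, w = saliency.shape                                # 2-D attribution map assumed
    order = np.argsort(saliency.ravel())[::-1]           # most salient pixels first
    confidences = [model(image)]                         # confidence on the intact image
    perturbed = image.copy()
    chunk = max(1, (h * w) // steps)
    for i in range(steps):
        idx = order[i * chunk:(i + 1) * chunk]
        ys, xs = np.unravel_index(idx, (h, w))
        perturbed[ys, xs] = baseline                     # "delete" this batch of pixels
        confidences.append(model(perturbed))
    # Trapezoidal area under the confidence curve, normalized to [0, 1].
    return float(np.trapz(confidences) / (len(confidences) - 1))


def split_half_reliability(scores_run_a, scores_run_b):
    """Psychometrics-inspired consistency check: rank-correlate the scores a
    metric assigns to the same explanations in two independent evaluation runs."""
    rho, _ = spearmanr(scores_run_a, scores_run_b)
    return rho
```

In this sketch, a low deletion score means the explanation marked pixels whose removal quickly destroys the prediction, while a high rank correlation between repeated runs indicates that the metric itself grades explanations consistently, which is the kind of reliability question the paper approaches with concepts from psychometric testing.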
Related papers
- Introspective Deep Metric Learning [91.47907685364036]
We propose an introspective deep metric learning framework for uncertainty-aware comparisons of images.
The proposed IDML framework improves the performance of deep metric learning through uncertainty modeling.
arXiv Detail & Related papers (2023-09-11T16:21:13Z)
- Better Understanding Differences in Attribution Methods via Systematic Evaluations [57.35035463793008]
Post-hoc attribution methods have been proposed to identify image regions most influential to the models' decisions.
We propose three novel evaluation schemes to more reliably measure the faithfulness of those methods.
We use these evaluation schemes to study strengths and shortcomings of some widely used attribution methods over a wide range of models.
arXiv Detail & Related papers (2023-03-21T14:24:58Z)
- On The Coherence of Quantitative Evaluation of Visual Explanations [0.7212939068975619]
Evaluation methods have been proposed to assess the "goodness" of visual explanations.
We evaluate a number of commonly used explanation methods on a subset of the ImageNet-1k validation set.
Results of our study suggest that there is a lack of coherence in the grading provided by some of the considered evaluation methods.
arXiv Detail & Related papers (2023-02-14T13:41:57Z)
- Evaluation of FEM and MLFEM AI-explainers in Image Classification tasks with reference-based and no-reference metrics [0.0]
We revisit the recently proposed post-hoc explainers FEM and MLFEM, which were designed to explain CNNs in image and video classification tasks.
We propose their evaluation with reference-based and no-reference metrics.
As a no-reference metric, we use the "stability" metric proposed by Alvarez-Melis and Jaakkola.
arXiv Detail & Related papers (2022-12-02T14:55:31Z)
- Towards Better Understanding Attribution Methods [77.1487219861185]
Post-hoc attribution methods have been proposed to identify image regions most influential to the models' decisions.
We propose three novel evaluation schemes to more reliably measure the faithfulness of those methods.
We also propose a post-processing smoothing step that significantly improves the performance of some attribution methods.
arXiv Detail & Related papers (2022-05-20T20:50:17Z)
- Introspective Deep Metric Learning for Image Retrieval [80.29866561553483]
We argue that a good similarity model should consider the semantic discrepancies with caution to better deal with ambiguous images for more robust training.
We propose to represent an image using not only a semantic embedding but also an accompanying uncertainty embedding, which describe the semantic characteristics and the ambiguity of an image, respectively.
The proposed IDML framework improves the performance of deep metric learning through uncertainty modeling and attains state-of-the-art results on the widely used CUB-200-2011, Cars196, and Stanford Online Products datasets.
arXiv Detail & Related papers (2022-05-09T17:51:44Z)
- NUQ: Nonparametric Uncertainty Quantification for Deterministic Neural Networks [151.03112356092575]
We show a principled way to measure the uncertainty of predictions for a classifier based on the Nadaraya-Watson nonparametric estimate of the conditional label distribution (a generic form of this estimator is sketched after this list).
We demonstrate the strong performance of the method in uncertainty estimation tasks on a variety of real-world image datasets.
arXiv Detail & Related papers (2022-02-07T12:30:45Z)
- Towards Interpretable Deep Metric Learning with Structural Matching [86.16700459215383]
We present a deep interpretable metric learning (DIML) method for more transparent embedding learning.
Our method is model-agnostic, which can be applied to off-the-shelf backbone networks and metric learning methods.
We evaluate our method on three major benchmarks of deep metric learning including CUB200-2011, Cars196, and Stanford Online Products.
arXiv Detail & Related papers (2021-08-12T17:59:09Z)
- Evaluation of Saliency-based Explainability Method [2.733700237741334]
A class of Explainable AI (XAI) methods provides saliency maps that highlight the parts of an image a CNN model looks at when classifying it, as a way to explain how the model works.
These methods provide an intuitive way for users to understand predictions made by CNNs.
Apart from quantitative computational tests, the vast majority of evidence that these methods are valuable is anecdotal.
arXiv Detail & Related papers (2021-06-24T05:40:50Z)
- Combining Similarity and Adversarial Learning to Generate Visual Explanation: Application to Medical Image Classification [0.0]
We leverage a learning framework to produce our visual explanation method.
Using metrics from the literature, our method outperforms state-of-the-art approaches.
We validate our approach on a large chest X-ray database.
arXiv Detail & Related papers (2020-12-14T08:34:12Z)
- Assessing the Reliability of Visual Explanations of Deep Models with Adversarial Perturbations [15.067369314723958]
We propose an objective measure to evaluate the reliability of explanations of deep models.
Our approach is based on changes in the network's outcome resulting from the perturbation of input images in an adversarial way.
We also propose a straightforward application of our approach to clean relevance maps, creating more interpretable maps without any loss of essential explanatory content.
arXiv Detail & Related papers (2020-04-22T19:57:34Z)
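As a side note on the NUQ entry above: the Nadaraya-Watson estimate of a conditional label distribution has, in its generic textbook form, the shape below. This is only the standard estimator for illustration, not necessarily the exact formulation used in that paper; $K_h$ denotes a kernel with bandwidth $h$ evaluated on some feature or embedding representation of the inputs $x_i$ with labels $y_i$.

\[
\hat{p}(y \mid x) \;=\; \frac{\sum_{i=1}^{N} K_h(x, x_i)\, \mathbb{1}[y_i = y]}{\sum_{i=1}^{N} K_h(x, x_i)}
\]

Predictive uncertainty can then be read off this estimated distribution, for example via its entropy.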