This looks more like that: Enhancing Self-Explaining Models by
Prototypical Relevance Propagation
- URL: http://arxiv.org/abs/2108.12204v1
- Date: Fri, 27 Aug 2021 09:55:53 GMT
- Title: This looks more like that: Enhancing Self-Explaining Models by
Prototypical Relevance Propagation
- Authors: Srishti Gautam, Marina M.-C. H\"ohne, Stine Hansen, Robert Jenssen and
Michael Kampffmeyer
- Abstract summary: We present a case study of the self-explaining network, ProtoPNet, in the presence of a spectrum of artifacts.
We introduce a novel method for generating more precise model-aware explanations.
In order to obtain a clean dataset, we propose to use multi-view clustering strategies for segregating the artifact images.
- Score: 17.485732906337507
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Current machine learning models have shown high efficiency in solving a wide
variety of real-world problems. However, their black box character poses a
major challenge for the understanding and traceability of the underlying
decision-making strategies. As a remedy, many post-hoc explanation and
self-explanatory methods have been developed to interpret the models' behavior.
These methods, in addition, enable the identification of artifacts that can be
learned by the model as class-relevant features. In this work, we provide a
detailed case study of the self-explaining network, ProtoPNet, in the presence
of a spectrum of artifacts. Accordingly, we identify the main drawbacks of
ProtoPNet, especially, its coarse and spatially imprecise explanations. We
address these limitations by introducing Prototypical Relevance Propagation
(PRP), a novel method for generating more precise model-aware explanations.
Furthermore, in order to obtain a clean dataset, we propose to use multi-view
clustering strategies for segregating the artifact images using the PRP
explanations, thereby suppressing the potential artifact learning in the
models.
Related papers
- Automatic Discovery of Visual Circuits [66.99553804855931]
We explore scalable methods for extracting the subgraph of a vision model's computational graph that underlies recognition of a specific visual concept.
We find that our approach extracts circuits that causally affect model output, and that editing these circuits can defend large pretrained models from adversarial attacks.
arXiv Detail & Related papers (2024-04-22T17:00:57Z) - CNN-based explanation ensembling for dataset, representation and explanations evaluation [1.1060425537315088]
We explore the potential of ensembling explanations generated by deep classification models using convolutional model.
Through experimentation and analysis, we aim to investigate the implications of combining explanations to uncover a more coherent and reliable patterns of the model's behavior.
arXiv Detail & Related papers (2024-04-16T08:39:29Z) - Exploring the Trade-off Between Model Performance and Explanation Plausibility of Text Classifiers Using Human Rationales [3.242050660144211]
Saliency post-hoc explainability methods are important tools for understanding increasingly complex NLP models.
We present a methodology for incorporating rationales, which are text annotations explaining human decisions, into text classification models.
arXiv Detail & Related papers (2024-04-03T22:39:33Z) - Learning with Explanation Constraints [91.23736536228485]
We provide a learning theoretic framework to analyze how explanations can improve the learning of our models.
We demonstrate the benefits of our approach over a large array of synthetic and real-world experiments.
arXiv Detail & Related papers (2023-03-25T15:06:47Z) - A Detailed Study of Interpretability of Deep Neural Network based Top
Taggers [3.8541104292281805]
Recent developments in explainable AI (XAI) allow researchers to explore the inner workings of deep neural networks (DNNs)
We explore interpretability of models designed to identify jets coming from top quark decay in high energy proton-proton collisions at the Large Hadron Collider (LHC)
Our studies uncover some major pitfalls of existing XAI methods and illustrate how they can be overcome to obtain consistent and meaningful interpretation of these models.
arXiv Detail & Related papers (2022-10-09T23:02:42Z) - Beyond Explaining: Opportunities and Challenges of XAI-Based Model
Improvement [75.00655434905417]
Explainable Artificial Intelligence (XAI) is an emerging research field bringing transparency to highly complex machine learning (ML) models.
This paper offers a comprehensive overview over techniques that apply XAI practically for improving various properties of ML models.
We show empirically through experiments on toy and realistic settings how explanations can help improve properties such as model generalization ability or reasoning.
arXiv Detail & Related papers (2022-03-15T15:44:28Z) - Beyond Trivial Counterfactual Explanations with Diverse Valuable
Explanations [64.85696493596821]
In computer vision applications, generative counterfactual methods indicate how to perturb a model's input to change its prediction.
We propose a counterfactual method that learns a perturbation in a disentangled latent space that is constrained using a diversity-enforcing loss.
Our model improves the success rate of producing high-quality valuable explanations when compared to previous state-of-the-art methods.
arXiv Detail & Related papers (2021-03-18T12:57:34Z) - MeLIME: Meaningful Local Explanation for Machine Learning Models [2.819725769698229]
We show that our approach, MeLIME, produces more meaningful explanations compared to other techniques over different ML models.
MeLIME generalizes the LIME method, allowing more flexible perturbation sampling and the use of different local interpretable models.
arXiv Detail & Related papers (2020-09-12T16:06:58Z) - Explainable Recommender Systems via Resolving Learning Representations [57.24565012731325]
Explanations could help improve user experience and discover system defects.
We propose a novel explainable recommendation model through improving the transparency of the representation learning process.
arXiv Detail & Related papers (2020-08-21T05:30:48Z) - Prototypical Contrastive Learning of Unsupervised Representations [171.3046900127166]
Prototypical Contrastive Learning (PCL) is an unsupervised representation learning method.
PCL implicitly encodes semantic structures of the data into the learned embedding space.
PCL outperforms state-of-the-art instance-wise contrastive learning methods on multiple benchmarks.
arXiv Detail & Related papers (2020-05-11T09:53:36Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.