CRAFT: Concept Recursive Activation FacTorization for Explainability
- URL: http://arxiv.org/abs/2211.10154v2
- Date: Tue, 28 Mar 2023 19:44:38 GMT
- Title: CRAFT: Concept Recursive Activation FacTorization for Explainability
- Authors: Thomas Fel, Agustin Picard, Louis Bethune, Thibaut Boissin, David
Vigouroux, Julien Colin, Rémi Cadène, Thomas Serre
- Abstract summary: CRAFT is a novel approach to identify both "what" and "where" by generating concept-based explanations.
We conduct both human and computer vision experiments to demonstrate the benefits of the proposed approach.
- Score: 5.306341151551106
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Attribution methods, which employ heatmaps to identify the most influential
regions of an image that impact model decisions, have gained widespread
popularity as a type of explainability method. However, recent research has
exposed the limited practical value of these methods, attributed in part to
their narrow focus on the most prominent regions of an image -- revealing
"where" the model looks, but failing to elucidate "what" the model sees in
those areas. In this work, we try to fill in this gap with CRAFT -- a novel
approach to identify both "what" and "where" by generating concept-based
explanations. We introduce 3 new ingredients to the automatic concept
extraction literature: (i) a recursive strategy to detect and decompose
concepts across layers, (ii) a novel method for a more faithful estimation of
concept importance using Sobol indices, and (iii) the use of implicit
differentiation to unlock Concept Attribution Maps.
We conduct both human and computer vision experiments to demonstrate the
benefits of the proposed approach. We show that the proposed concept importance
estimation technique is more faithful to the model than previous methods. When
evaluating the usefulness of the method for human experimenters on a
human-centered utility benchmark, we find that our approach significantly
improves on two of the three test scenarios. Our code is freely available at
github.com/deel-ai/Craft.
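The abstract names two ingredients that can be illustrated concretely: factorizing activations into a set of concepts, and estimating each concept's importance with Sobol indices. The sketch below is a minimal, hypothetical illustration of that pipeline, not the CRAFT library API: it assumes non-negative intermediate activations (e.g. post-ReLU), extracts concepts with off-the-shelf NMF, and estimates total-order Sobol indices with a simple Jansen-style Monte Carlo estimator over random masks applied to the concept coefficients. Names such as `extract_concepts`, `total_sobol_indices`, and `score_fn` are illustrative placeholders.

```python
# Hedged sketch of concept extraction + Sobol-based importance estimation.
# Assumptions: `activations` is a non-negative (n_samples, n_features) array of
# intermediate features, and `score_fn` maps a batch of reconstructed
# activations to per-item class scores. Not the official CRAFT implementation.
import numpy as np
from sklearn.decomposition import NMF

def extract_concepts(activations, n_concepts=10):
    """Factorize activations A ~= U @ W: U holds per-sample concept
    coefficients, rows of W are concept directions in feature space."""
    nmf = NMF(n_components=n_concepts, init="nndsvda", max_iter=500)
    U = nmf.fit_transform(activations)   # (n_samples, n_concepts)
    W = nmf.components_                  # (n_concepts, n_features)
    return U, W

def total_sobol_indices(U, W, score_fn, n_designs=64, rng=None):
    """Jansen-style Monte Carlo estimate of total-order Sobol indices:
    perturb concept coefficients with random masks and measure how much
    resampling each concept alone changes the model's output."""
    rng = np.random.default_rng(rng)
    n_samples, n_concepts = U.shape
    importances = np.zeros(n_concepts)
    for s in range(n_samples):
        u = U[s]
        masks = rng.uniform(0.0, 1.0, size=(n_designs, n_concepts))
        y_a = np.asarray(score_fn((masks * u) @ W))   # scores on masked reconstructions
        var_y = y_a.var() + 1e-12
        for k in range(n_concepts):
            masks_k = masks.copy()
            # Resample only concept k, keeping the other coefficients fixed.
            masks_k[:, k] = rng.uniform(0.0, 1.0, size=n_designs)
            y_b = np.asarray(score_fn((masks_k * u) @ W))
            importances[k] += 0.5 * np.mean((y_a - y_b) ** 2) / var_y
    return importances / n_samples
```

The recursive decomposition across layers and the implicit-differentiation step that yields Concept Attribution Maps are not shown here; for the authors' actual implementation see the repository linked above (github.com/deel-ai/Craft).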
Related papers
- Automatic Discovery of Visual Circuits [66.99553804855931]
We explore scalable methods for extracting the subgraph of a vision model's computational graph that underlies recognition of a specific visual concept.
We find that our approach extracts circuits that causally affect model output, and that editing these circuits can defend large pretrained models from adversarial attacks.
arXiv Detail & Related papers (2024-04-22T17:00:57Z)
- A survey on Concept-based Approaches For Model Improvement [2.1516043775965565]
Concepts are widely regarded as the building blocks of human reasoning.
We provide a systematic review and taxonomy of various concept representations and their discovery algorithms in Deep Neural Networks (DNNs).
We also provide details on concept-based model improvement literature marking the first comprehensive survey of these methods.
arXiv Detail & Related papers (2024-03-21T17:09:20Z)
- An Axiomatic Approach to Model-Agnostic Concept Explanations [67.84000759813435]
We propose an approach to concept explanations that satisfy three natural axioms: linearity, recursivity, and similarity.
We then establish connections with previous concept explanation methods, offering insight into their varying semantic meanings.
arXiv Detail & Related papers (2024-01-12T20:53:35Z)
- Concept backpropagation: An Explainable AI approach for visualising learned concepts in neural network models [0.0]
We present an extension to the method of concept detection, named "concept backpropagation", which provides a way of analysing how the information representing a given concept is internalised in a given neural network model.
arXiv Detail & Related papers (2023-07-24T08:21:13Z)
- A Holistic Approach to Unifying Automatic Concept Extraction and Concept Importance Estimation [18.600321051705482]
Concept-based approaches have emerged as some of the most promising explainability methods.
We introduce a unifying theoretical framework that comprehensively defines and clarifies these two steps.
We show how to efficiently identify clusters of data points that are classified based on a similar shared strategy.
arXiv Detail & Related papers (2023-06-11T23:28:02Z)
- Better Understanding Differences in Attribution Methods via Systematic Evaluations [57.35035463793008]
Post-hoc attribution methods have been proposed to identify image regions most influential to the models' decisions.
We propose three novel evaluation schemes to more reliably measure the faithfulness of those methods.
We use these evaluation schemes to study strengths and shortcomings of some widely used attribution methods over a wide range of models.
arXiv Detail & Related papers (2023-03-21T14:24:58Z)
- Towards Better Understanding Attribution Methods [77.1487219861185]
Post-hoc attribution methods have been proposed to identify image regions most influential to the models' decisions.
We propose three novel evaluation schemes to more reliably measure the faithfulness of those methods.
We also propose a post-processing smoothing step that significantly improves the performance of some attribution methods.
arXiv Detail & Related papers (2022-05-20T20:50:17Z)
- Sparse Subspace Clustering for Concept Discovery (SSCCD) [1.7319807100654885]
Concepts are key building blocks of higher level human understanding.
Local attribution methods cannot identify coherent model behavior across samples.
We put forward a new definition of concepts as low-dimensional subspaces of hidden feature layers.
arXiv Detail & Related papers (2022-03-11T16:15:48Z)
- Discovering Concepts in Learned Representations using Statistical Inference and Interactive Visualization [0.76146285961466]
Concept discovery is important for bridging the gap between non-deep learning experts and model end-users.
Current approaches include hand-crafting concept datasets and then converting them to latent space directions.
In this study, we offer another two approaches to guide user discovery of meaningful concepts, one based on multiple hypothesis testing, and another on interactive visualization.
arXiv Detail & Related papers (2022-02-09T22:29:48Z)
- Discriminative Attribution from Counterfactuals [64.94009515033984]
We present a method for neural network interpretability by combining feature attribution with counterfactual explanations.
We show that this method can be used to quantitatively evaluate the performance of feature attribution methods in an objective manner.
arXiv Detail & Related papers (2021-09-28T00:53:34Z)
- There and Back Again: Revisiting Backpropagation Saliency Methods [87.40330595283969]
Saliency methods seek to explain the predictions of a model by producing an importance map across each input sample.
A popular class of such methods is based on backpropagating a signal and analyzing the resulting gradient.
We propose a single framework under which several such methods can be unified.
arXiv Detail & Related papers (2020-04-06T17:58:08Z)