Evaluating the Stability of Semantic Concept Representations in CNNs for
Robust Explainability
- URL: http://arxiv.org/abs/2304.14864v1
- Date: Fri, 28 Apr 2023 14:14:00 GMT
- Title: Evaluating the Stability of Semantic Concept Representations in CNNs for
Robust Explainability
- Authors: Georgii Mikriukov, Gesina Schwalbe, Christian Hellert and Korinna Bade
- Abstract summary: This paper focuses on two stability goals when working with concept representations in computer vision CNNs.
The guiding use-case is a post-hoc explainability framework for object detection CNNs.
We propose a novel metric that considers both concept separation and consistency, and is agnostic to layer and concept representation dimensionality.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Analysis of how semantic concepts are represented within Convolutional Neural
Networks (CNNs) is a widely used approach in Explainable Artificial
Intelligence (XAI) for interpreting CNNs. A motivation is the need for
transparency in safety-critical AI-based systems, as mandated in various
domains like automated driving. However, to use the concept representations for
safety-relevant purposes, like inspection or error retrieval, these must be of
high quality and, in particular, stable. This paper focuses on two stability
goals when working with concept representations in computer vision CNNs:
stability of concept retrieval and of concept attribution. The guiding use-case
is a post-hoc explainability framework for object detection (OD) CNNs, towards
which existing concept analysis (CA) methods are successfully adapted. To
address concept retrieval stability, we propose a novel metric that considers
both concept separation and consistency, and is agnostic to layer and concept
representation dimensionality. We then investigate impacts of concept
abstraction level, number of concept training samples, CNN size, and concept
representation dimensionality on stability. For concept attribution stability
we explore the effect of gradient instability on gradient-based explainability
methods. The results on various CNNs for classification and object detection
yield the main findings that (1) the stability of concept retrieval can be
enhanced through dimensionality reduction via data aggregation, and (2) in
shallow layers where gradient instability is more pronounced, gradient
smoothing techniques are advised. Finally, our approach provides valuable
insights into selecting the appropriate layer and concept representation
dimensionality, paving the way towards CA in safety-critical XAI applications.
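To make the notion of concept retrieval stability concrete, the following is a minimal, illustrative Python sketch and not the metric proposed in the paper: it fits linear concept probes (CAVs) on random half-splits of a layer's activations and uses their mean pairwise cosine similarity as a toy stability proxy. The `acts` and `labels` arrays, the logistic-regression probe, and the half-split resampling scheme are all assumptions made for illustration.

```python
# Illustrative sketch only (assumed setup, not the paper's stability metric).
# `acts` holds activations of one CNN layer, shape (num_samples, num_features);
# `labels` marks presence (1) or absence (0) of the concept in each sample.
import numpy as np
from sklearn.linear_model import LogisticRegression

def fit_cav(acts, labels, seed):
    """Fit a linear concept probe and return its unit-norm weight vector (CAV)."""
    clf = LogisticRegression(max_iter=1000, random_state=seed)
    clf.fit(acts, labels)
    w = clf.coef_.ravel()
    return w / np.linalg.norm(w)

def retrieval_stability(acts, labels, n_runs=5):
    """Mean pairwise cosine similarity of CAVs fit on random half-splits."""
    rng = np.random.default_rng(0)
    cavs = []
    for run in range(n_runs):
        idx = rng.choice(len(acts), size=len(acts) // 2, replace=False)
        cavs.append(fit_cav(acts[idx], labels[idx], seed=run))
    cavs = np.stack(cavs)                        # (n_runs, num_features)
    sims = cavs @ cavs.T                         # pairwise cosine similarities
    return float(sims[np.triu_indices(n_runs, k=1)].mean())
```

For concept attribution stability, the analogous illustration would average gradients over noise-perturbed inputs (SmoothGrad-style smoothing), which the abstract recommends for shallow layers; that part is omitted here.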
Related papers
- Rigorous Probabilistic Guarantees for Robust Counterfactual Explanations [80.86128012438834]
We show for the first time that computing the robustness of counterfactuals with respect to plausible model shifts is NP-complete.
We propose a novel probabilistic approach which is able to provide tight estimates of robustness with strong guarantees.
arXiv Detail & Related papers (2024-07-10T09:13:11Z)
- Improving Intervention Efficacy via Concept Realignment in Concept Bottleneck Models [57.86303579812877]
Concept Bottleneck Models (CBMs) ground image classification on human-understandable concepts to allow for interpretable model decisions.
Existing approaches often require numerous human interventions per image to achieve strong performance.
We introduce a trainable concept realignment intervention module, which leverages concept relations to realign concept assignments post-intervention.
arXiv Detail & Related papers (2024-05-02T17:59:01Z)
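To illustrate the intervention mechanism that CBM work like the entry above builds on, here is a minimal, generic PyTorch sketch with assumed layer sizes; it is not the realignment module of the cited paper. Concepts are predicted from features, and an intervention simply overwrites selected concept scores with human-provided values before the label head.

```python
# Generic CBM sketch (assumption for illustration, not the cited realignment module).
import torch
import torch.nn as nn

class TinyCBM(nn.Module):
    def __init__(self, in_dim, n_concepts, n_classes):
        super().__init__()
        self.concept_head = nn.Linear(in_dim, n_concepts)   # features -> concept logits
        self.label_head = nn.Linear(n_concepts, n_classes)  # concept scores -> class logits

    def forward(self, x, intervene_idx=None, intervene_vals=None):
        c = torch.sigmoid(self.concept_head(x))             # predicted concept scores in [0, 1]
        if intervene_idx is not None:
            c = c.clone()
            c[:, intervene_idx] = intervene_vals             # overwrite with human-provided values
        return self.label_head(c), c

# Example: correct concept 2 to "present" (1.0) for a batch of feature vectors.
model = TinyCBM(in_dim=512, n_concepts=10, n_classes=5)
logits, concepts = model(torch.randn(4, 512),
                         intervene_idx=[2],
                         intervene_vals=torch.tensor(1.0))
```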
- Implicit Concept Removal of Diffusion Models [92.55152501707995]
Text-to-image (T2I) diffusion models often inadvertently generate unwanted concepts such as watermarks and unsafe images.
We present Geom-Erasing, a novel concept removal method based on geometric-driven control.
arXiv Detail & Related papers (2023-10-09T17:13:10Z)
- Scale-Preserving Automatic Concept Extraction (SPACE) [5.270054840298395]
We introduce the Scale-Preserving Automatic Concept Extraction (SPACE) algorithm, as a state-of-the-art alternative concept extraction technique for CNNs.
Our method provides explanations of the models' decision-making process in the form of human-understandable concepts.
arXiv Detail & Related papers (2023-08-11T08:54:45Z)
- Understanding and Enhancing Robustness of Concept-based Models [41.20004311158688]
We study robustness of concept-based models to adversarial perturbations.
In this paper, we first propose and analyze different malicious attacks to evaluate the security vulnerability of concept-based models.
We then propose a potential general adversarial training-based defense mechanism to increase robustness of these systems to the proposed malicious attacks.
arXiv Detail & Related papers (2022-11-29T10:43:51Z)
- I saw, I conceived, I concluded: Progressive Concepts as Bottlenecks [2.9398911304923447]
Concept bottleneck models (CBMs) provide explainability and intervention during inference by correcting predicted, intermediate concepts.
This makes CBMs attractive for high-stakes decision-making.
We take the quality assessment of fetal ultrasound scans as a real-life use case for CBM decision support in healthcare.
arXiv Detail & Related papers (2022-11-19T09:31:19Z)
- Interpretable Self-Aware Neural Networks for Robust Trajectory Prediction [50.79827516897913]
We introduce an interpretable paradigm for trajectory prediction that distributes the uncertainty among semantic concepts.
We validate our approach on real-world autonomous driving data, demonstrating superior performance over state-of-the-art baselines.
arXiv Detail & Related papers (2022-11-16T06:28:20Z)
- GlanceNets: Interpretabile, Leak-proof Concept-based Models [23.7625973884849]
Concept-based models (CBMs) combine high-performance and interpretability by acquiring and reasoning with a vocabulary of high-level concepts.
We provide a clear definition of interpretability in terms of alignment between the model's representation and an underlying data generation process.
We introduce GlanceNets, a new CBM that exploits techniques from disentangled representation learning and open-set recognition to achieve alignment.
arXiv Detail & Related papers (2022-05-31T08:53:53Z)
- Modeling Temporal Concept Receptive Field Dynamically for Untrimmed Video Analysis [105.06166692486674]
We study the temporal concept receptive field of concept-based event representation.
We introduce temporal dynamic convolution (TDC) to give stronger flexibility to concept-based event analytics.
Different coefficients can generate appropriate and accurate temporal concept receptive field sizes according to the input videos.
arXiv Detail & Related papers (2021-11-23T04:59:48Z)
- Invertible Concept-based Explanations for CNN Models with Non-negative Concept Activation Vectors [24.581839689833572]
Convolutional neural network (CNN) models for computer vision are powerful but lack explainability in their most basic form.
Recent work on explanations through feature importance of approximate linear models has moved from input-level features to features from mid-layer feature maps in the form of concept activation vectors (CAVs).
In this work, we rethink the ACE algorithm of Ghorbani et al., proposing an alternative invertible concept-based explanation (ICE) framework to overcome its shortcomings.
arXiv Detail & Related papers (2020-06-27T17:57:26Z)
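As a rough illustration of the non-negative concept activation idea in the entry above, the sketch below uses scikit-learn's NMF as an assumed stand-in for the authors' released ICE implementation: mid-layer feature maps are flattened into per-position feature vectors and factorized into a small number of non-negative components, which can be read as concept directions.

```python
# Assumed sketch of NMF-based concept directions from mid-layer feature maps
# (illustrative only; not the authors' released ICE implementation).
import numpy as np
from sklearn.decomposition import NMF

def concept_directions(feature_maps, n_concepts=8):
    """feature_maps: post-ReLU activations, shape (N, C, H, W), non-negative."""
    n, c, h, w = feature_maps.shape
    # Treat every spatial position as one C-dimensional sample.
    x = feature_maps.transpose(0, 2, 3, 1).reshape(-1, c)
    nmf = NMF(n_components=n_concepts, init="nndsvda", max_iter=500)
    scores = nmf.fit_transform(x)           # per-position concept scores
    directions = nmf.components_            # (n_concepts, C) non-negative concept vectors
    return directions, scores.reshape(n, h, w, n_concepts)

# Example with random non-negative activations standing in for a real layer.
dirs, score_maps = concept_directions(np.random.rand(16, 64, 7, 7).astype(np.float32))
```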
- A general framework for defining and optimizing robustness [74.67016173858497]
We propose a rigorous and flexible framework for defining different types of robustness properties for classifiers.
Our concept is based on the postulate that the robustness of a classifier should be considered a property independent of accuracy.
We develop a very general robustness framework that is applicable to any type of classification model.
arXiv Detail & Related papers (2020-06-19T13:24:20Z)