One Wave to Explain Them All: A Unifying Perspective on Post-hoc Explainability
- URL: http://arxiv.org/abs/2410.01482v1
- Date: Wed, 2 Oct 2024 12:34:04 GMT
- Title: One Wave to Explain Them All: A Unifying Perspective on Post-hoc Explainability
- Authors: Gabriel Kasmi, Amandine Brunetto, Thomas Fel, Jayneel Parekh
- Abstract summary: We propose leveraging the wavelet domain as a robust mathematical foundation for attribution.
Our approach extends the existing gradient-based feature attributions into the wavelet domain.
We show how our method explains not only the where -- the important parts of the input -- but also the what.
- Score: 6.151633954305939
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Despite the growing use of deep neural networks in safety-critical decision-making, their inherent black-box nature hinders transparency and interpretability. Explainable AI (XAI) methods have thus emerged to understand a model's internal workings, most notably attribution methods, also called saliency maps. Conventional attribution methods typically identify the locations -- the where -- of significant regions within an input. However, because they overlook the inherent structure of the input data, these methods often fail to interpret what these regions represent in terms of structural components (e.g., textures in images or transients in sounds). Furthermore, existing methods are usually tailored to a single data modality, limiting their generalizability. In this paper, we propose leveraging the wavelet domain as a robust mathematical foundation for attribution. Our approach, the Wavelet Attribution Method (WAM), extends existing gradient-based feature attributions into the wavelet domain, providing a unified framework for explaining classifiers across images, audio, and 3D shapes. Empirical evaluations demonstrate that WAM matches or surpasses state-of-the-art methods across faithfulness metrics and models in image, audio, and 3D explainability. Finally, we show how our method explains not only the where -- the important parts of the input -- but also the what -- the relevant patterns in terms of structural components.
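The mechanism admits a compact illustration: compute the wavelet transform of the input, reconstruct the input from the coefficients with a differentiable inverse transform, and backpropagate the class score to the coefficients themselves. The snippet below is a minimal, hypothetical sketch using a single-level 2D Haar transform in PyTorch; it is not the authors' implementation, which handles multiple decomposition levels and modalities (images, audio, 3D shapes).

```python
import torch

def haar_dwt2(x):
    """Single-level orthonormal 2D Haar transform; x: (B, C, H, W), H and W even."""
    a, b = x[..., 0::2, 0::2], x[..., 0::2, 1::2]
    c, d = x[..., 1::2, 0::2], x[..., 1::2, 1::2]
    return [(a + b + c + d) / 2,   # approximation (low-low)
            (a + b - c - d) / 2,   # horizontal detail
            (a - b + c - d) / 2,   # vertical detail
            (a - b - c + d) / 2]   # diagonal detail

def haar_idwt2(ll, lh, hl, hh):
    """Inverse of haar_dwt2, differentiable end to end."""
    a, b = (ll + lh + hl + hh) / 2, (ll + lh - hl - hh) / 2
    c, d = (ll - lh + hl - hh) / 2, (ll - lh - hl + hh) / 2
    x = ll.new_zeros(*ll.shape[:-2], 2 * ll.shape[-2], 2 * ll.shape[-1])
    x[..., 0::2, 0::2], x[..., 0::2, 1::2] = a, b
    x[..., 1::2, 0::2], x[..., 1::2, 1::2] = c, d
    return x

def wavelet_saliency(model, image, target):
    """Gradient of the target class score w.r.t. the wavelet coefficients."""
    coeffs = [c.detach().requires_grad_(True) for c in haar_dwt2(image)]
    score = model(haar_idwt2(*coeffs))[0, target]  # assumes batch size 1
    score.backward()
    return [c.grad.abs() for c in coeffs]          # one map per subband
```

Per-subband gradient magnitudes indicate which scales and orientations matter (the what), while their spatial layout within each subband localizes the evidence (the where).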
Related papers
- Transforming gradient-based techniques into interpretable methods [3.6763102409647526]
We introduce GAD (Gradient Artificial Distancing) as a supportive framework for gradient-based techniques.
Its primary objective is to accentuate influential regions by establishing distinctions between classes.
Empirical investigations involving occluded images demonstrate that the regions identified by this methodology indeed play a pivotal role in class differentiation.
arXiv Detail & Related papers (2024-01-25T09:24:19Z)
- EmerDiff: Emerging Pixel-level Semantic Knowledge in Diffusion Models [52.3015009878545]
We develop an image segmentor capable of generating fine-grained segmentation maps without any additional training.
Our framework identifies semantic correspondences between image pixels and spatial locations of low-dimensional feature maps.
In extensive experiments, the produced segmentation maps are demonstrated to be well delineated and capture detailed parts of the images.
arXiv Detail & Related papers (2024-01-22T07:34:06Z)
- DiffCloth: Diffusion Based Garment Synthesis and Manipulation via Structural Cross-modal Semantic Alignment [124.57488600605822]
Cross-modal garment synthesis and manipulation will significantly benefit the way fashion designers generate garments.
We introduce DiffCloth, a diffusion-based pipeline for cross-modal garment synthesis and manipulation.
Experiments on the CM-Fashion benchmark demonstrate that DiffCloth yields state-of-the-art garment synthesis results.
arXiv Detail & Related papers (2023-08-22T05:43:33Z)
- DARE: Towards Robust Text Explanations in Biomedical and Healthcare Applications [54.93807822347193]
We show how to adapt attribution robustness estimation methods to a given domain, so as to take into account domain-specific plausibility.
Next, we provide two methods, adversarial training and FAR training, to mitigate the brittleness characterized by DARE.
Finally, we empirically validate our methods with extensive experiments on three established biomedical benchmarks.
arXiv Detail & Related papers (2023-07-05T08:11:40Z)
- Assessment of the Reliability of a Model's Decision by Generalizing Attribution to the Wavelet Domain [0.8192907805418583]
We introduce the Wavelet sCale Attribution Method (WCAM), a generalization of attribution from the pixel domain to the space-scale domain using wavelet transforms.
The code is publicly available.
arXiv Detail & Related papers (2023-05-24T10:13:32Z)
- Visualization of Supervised and Self-Supervised Neural Networks via Attribution Guided Factorization [87.96102461221415]
We develop an algorithm that provides per-class explainability.
In an extensive battery of experiments, we demonstrate the ability of our methods to produce class-specific visualizations.
arXiv Detail & Related papers (2020-12-03T18:48:39Z)
- Explaining Convolutional Neural Networks through Attribution-Based Input Sampling and Block-Wise Feature Aggregation [22.688772441351308]
Methods based on class activation mapping and randomized input sampling have gained great popularity.
However, these attribution methods produce low-resolution, blurry explanation maps that limit their explanatory power.
In this work, we collect visualization maps from multiple layers of the model based on an attribution-based input sampling technique.
We also propose a layer selection strategy that applies to the whole family of CNN-based models.
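As background, the randomized input sampling this line of work builds on can be sketched in a few lines. The snippet below is a RISE-style, single-layer sketch, not the paper's multi-layer, attribution-based aggregation; model, image, and target are placeholder names:

```python
import torch
import torch.nn.functional as F

def sampling_saliency(model, image, target, n_masks=500, cell=7, p=0.5):
    """RISE-style saliency: average random masks weighted by class score.

    image: (1, C, H, W); target: class index. Placeholders throughout;
    in practice the forward pass over the masks would be batched.
    """
    _, _, h, w = image.shape
    grid = (torch.rand(n_masks, 1, cell, cell) < p).float()
    masks = F.interpolate(grid, size=(h, w), mode="bilinear",
                          align_corners=False)
    with torch.no_grad():
        scores = torch.softmax(model(image * masks), dim=1)[:, target]
    return (scores[:, None, None, None] * masks).sum(0) / (masks.sum(0) + 1e-8)
```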
arXiv Detail & Related papers (2020-10-01T20:27:30Z)
- Closed-Form Factorization of Latent Semantics in GANs [65.42778970898534]
A rich set of interpretable dimensions has been shown to emerge in the latent space of Generative Adversarial Networks (GANs) trained to synthesize images.
In this work, we examine the internal representation learned by GANs to reveal the underlying variation factors in an unsupervised manner.
We propose a closed-form factorization algorithm for latent semantic discovery by directly decomposing the pre-trained weights.
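The closed-form step admits a short sketch: under a common reading of this approach, the semantic directions are the top eigenvectors of W^T W, where W is the weight of the generator's first latent projection (equivalently, the top right-singular vectors of W). The snippet below is a hypothetical illustration, not the paper's exact code; generator.fc is a placeholder name:

```python
import torch

def closed_form_directions(weight: torch.Tensor, k: int = 5) -> torch.Tensor:
    """Top-k latent directions from a generator's first projection weight.

    weight: (out_dim, latent_dim). The directions are the eigenvectors of
    W^T W with the largest eigenvalues. Returned shape: (k, latent_dim),
    strongest direction first.
    """
    eigvals, eigvecs = torch.linalg.eigh(weight.T @ weight)  # ascending order
    return eigvecs[:, -k:].flip(-1).T

# hypothetical usage: move a latent code z along the strongest direction
# z_edited = z + 3.0 * closed_form_directions(generator.fc.weight)[0]
```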
arXiv Detail & Related papers (2020-07-13T18:05:36Z)
- Explainable Deep Classification Models for Domain Generalization [94.43131722655617]
Explanations are defined as regions of visual evidence upon which a deep classification network makes a decision.
Our training strategy enforces a periodic saliency-based feedback to encourage the model to focus on the image regions that directly correspond to the ground-truth object.
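One hedged reading of this feedback as a loss term: periodically compute an input-gradient saliency map and penalize the saliency mass that falls outside the ground-truth object mask. The sketch below illustrates that reading only; it is not the paper's exact procedure, and model, images, labels, and masks are placeholders:

```python
import torch

def saliency_feedback_loss(model, images, labels, masks):
    """Penalize input-gradient saliency outside the ground-truth mask.

    masks: (B, 1, H, W) binary maps of the ground-truth object.
    """
    images = images.clone().requires_grad_(True)
    logits = model(images)
    class_scores = logits.gather(1, labels[:, None]).sum()
    grads, = torch.autograd.grad(class_scores, images, create_graph=True)
    saliency = grads.abs().sum(dim=1, keepdim=True)          # (B, 1, H, W)
    saliency = saliency / (saliency.sum(dim=(2, 3), keepdim=True) + 1e-8)
    return (saliency * (1 - masks)).sum(dim=(2, 3)).mean()   # off-object mass

# hypothetical usage, applied periodically during training:
# loss = task_loss + 0.1 * saliency_feedback_loss(model, x, y, m)
```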
arXiv Detail & Related papers (2020-03-13T22:22:15Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.