One Wave to Explain Them All: A Unifying Perspective on Post-hoc Explainability
- URL: http://arxiv.org/abs/2410.01482v1
- Date: Wed, 2 Oct 2024 12:34:04 GMT
- Title: One Wave to Explain Them All: A Unifying Perspective on Post-hoc Explainability
- Authors: Gabriel Kasmi, Amandine Brunetto, Thomas Fel, Jayneel Parekh,
- Abstract summary: We propose leveraging the wavelet domain as a robust mathematical foundation for attribution.
Our approach extends the existing gradient-based feature attributions into the wavelet domain.
We show how our method explains not only the where -- the important parts of the input -- but also the what.
- Score: 6.151633954305939
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Despite the growing use of deep neural networks in safety-critical decision-making, their inherent black-box nature hinders transparency and interpretability. Explainable AI (XAI) methods have thus emerged to understand a model's internal workings, notably attribution methods, also called saliency maps. Conventional attribution methods typically identify the locations -- the where -- of significant regions within an input. However, because they overlook the inherent structure of the input data, these methods often fail to interpret what these regions represent in terms of structural components (e.g., textures in images or transients in sounds). Furthermore, existing methods are usually tailored to a single data modality, limiting their generalizability. In this paper, we propose leveraging the wavelet domain as a robust mathematical foundation for attribution. Our approach, the Wavelet Attribution Method (WAM), extends the existing gradient-based feature attributions into the wavelet domain, providing a unified framework for explaining classifiers across images, audio, and 3D shapes. Empirical evaluations demonstrate that WAM matches or surpasses state-of-the-art methods across faithfulness metrics and models in image, audio, and 3D explainability. Finally, we show how our method explains not only the where -- the important parts of the input -- but also the what -- the relevant patterns in terms of structural components.
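Because the wavelet transform is linear and invertible, gradient-based attributions can be moved into the wavelet domain by treating the coefficients as the leaf variables of the computation graph. The snippet below is a minimal sketch of that idea, assuming a single-level orthonormal Haar transform, a torchvision ResNet-18, and a random stand-in image; it is illustrative only and is not the authors' WAM implementation.

```python
# Minimal sketch: gradient-based attribution in the wavelet domain.
# Assumptions (not from the paper's code): single-level orthonormal Haar
# transform, a torchvision ResNet-18, and a random stand-in image.
import torch
from torchvision.models import resnet18, ResNet18_Weights

def haar_dwt2(x):
    """Single-level 2D Haar DWT of a (B, C, H, W) tensor with even H, W."""
    a, b = x[..., 0::2, 0::2], x[..., 0::2, 1::2]
    c, d = x[..., 1::2, 0::2], x[..., 1::2, 1::2]
    ll = (a + b + c + d) / 2  # low-frequency approximation
    lh = (a + b - c - d) / 2  # detail sub-bands (three orientations)
    hl = (a - b + c - d) / 2
    hh = (a - b - c + d) / 2
    return ll, lh, hl, hh

def haar_idwt2(ll, lh, hl, hh):
    """Inverse of haar_dwt2; reconstructs the (B, C, H, W) tensor."""
    x = ll.new_zeros(*ll.shape[:2], 2 * ll.shape[2], 2 * ll.shape[3])
    x[..., 0::2, 0::2] = (ll + lh + hl + hh) / 2
    x[..., 0::2, 1::2] = (ll + lh - hl - hh) / 2
    x[..., 1::2, 0::2] = (ll - lh + hl - hh) / 2
    x[..., 1::2, 1::2] = (ll - lh - hl + hh) / 2
    return x

model = resnet18(weights=ResNet18_Weights.DEFAULT).eval()
image = torch.rand(1, 3, 224, 224)  # stand-in for a preprocessed input image

# Make the wavelet coefficients the leaf variables of the graph ...
coeffs = [c.detach().requires_grad_(True) for c in haar_dwt2(image)]
logits = model(haar_idwt2(*coeffs))
logits[0, logits.argmax()].backward()

# ... so |d(score)/d(coefficient)| serves as each coefficient's attribution:
# its spatial index gives the "where", its sub-band the "what" (the scale
# and orientation of the structure the model relies on).
attributions = [c.grad.abs().sum(dim=1) for c in coeffs]
```

Because each coefficient carries both a position and a scale/orientation, such gradient maps localize the important regions (the where) while indicating which structural components they involve (the what).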
Related papers
- Diffusion Features to Bridge Domain Gap for Semantic Segmentation [2.8616666231199424]
This paper investigates an approach that leverages sampling and fusion techniques to harness the features of diffusion models efficiently.
By leveraging the strength of text-to-image generation capability, we introduce a new training framework designed to implicitly learn posterior knowledge from it.
arXiv Detail & Related papers (2024-06-02T15:33:46Z)
- EmerDiff: Emerging Pixel-level Semantic Knowledge in Diffusion Models [52.3015009878545]
We develop an image segmentor capable of generating fine-grained segmentation maps without any additional training.
Our framework identifies semantic correspondences between image pixels and spatial locations of low-dimensional feature maps.
In extensive experiments, the produced segmentation maps are demonstrated to be well delineated and capture detailed parts of the images.
arXiv Detail & Related papers (2024-01-22T07:34:06Z)
- DiffCloth: Diffusion Based Garment Synthesis and Manipulation via Structural Cross-modal Semantic Alignment [124.57488600605822]
Cross-modal garment synthesis and manipulation will significantly benefit the way fashion designers generate garments.
We introduce DiffCloth, a diffusion-based pipeline for cross-modal garment synthesis and manipulation.
Experiments on the CM-Fashion benchmark demonstrate that DiffCloth yields state-of-the-art garment synthesis results.
arXiv Detail & Related papers (2023-08-22T05:43:33Z)
- DARE: Towards Robust Text Explanations in Biomedical and Healthcare Applications [54.93807822347193]
We show how to adapt attribution robustness estimation methods to a given domain, so as to take into account domain-specific plausibility.
Next, we provide two methods, adversarial training and FAR training, to mitigate the brittleness characterized by DARE.
Finally, we empirically validate our methods with extensive experiments on three established biomedical benchmarks.
arXiv Detail & Related papers (2023-07-05T08:11:40Z)
- Assessment of the Reliability of a Model's Decision by Generalizing Attribution to the Wavelet Domain [0.8192907805418583]
We introduce the Wavelet sCale Attribution Method (WCAM), a generalization of attribution from the pixel domain to the space-scale domain using wavelet transforms.
Our code is publicly accessible.
arXiv Detail & Related papers (2023-05-24T10:13:32Z)
- Visualization of Supervised and Self-Supervised Neural Networks via Attribution Guided Factorization [87.96102461221415]
We develop an algorithm that provides per-class explainability.
In an extensive battery of experiments, we demonstrate the ability of our methods to produce class-specific visualizations.
arXiv Detail & Related papers (2020-12-03T18:48:39Z)
- Explaining Convolutional Neural Networks through Attribution-Based Input Sampling and Block-Wise Feature Aggregation [22.688772441351308]
Methods based on class activation mapping and randomized input sampling have gained great popularity.
However, these attribution methods produce low-resolution, blurry explanation maps that limit their explanatory power.
In this work, we collect visualization maps from multiple layers of the model based on an attribution-based input sampling technique.
We also propose a layer selection strategy that applies to the whole family of CNN-based models.
arXiv Detail & Related papers (2020-10-01T20:27:30Z)
- Closed-Form Factorization of Latent Semantics in GANs [65.42778970898534]
A rich set of interpretable dimensions has been shown to emerge in the latent space of Generative Adversarial Networks (GANs) trained to synthesize images.
In this work, we examine the internal representation learned by GANs to reveal the underlying variation factors in an unsupervised manner.
We propose a closed-form factorization algorithm for latent semantic discovery by directly decomposing the pre-trained weights (a short sketch of this idea follows the list).
arXiv Detail & Related papers (2020-07-13T18:05:36Z)
- Explainable Deep Classification Models for Domain Generalization [94.43131722655617]
Explanations are defined as regions of visual evidence upon which a deep classification network makes a decision.
Our training strategy enforces a periodic saliency-based feedback to encourage the model to focus on the image regions that directly correspond to the ground-truth object.
arXiv Detail & Related papers (2020-03-13T22:22:15Z)
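The closed-form factorization entry above can be written down in a few lines: under the common formulation, the semantically meaningful latent directions are the top eigenvectors of A^T A, where A is the weight matrix of the generator layer that first projects the latent code. The sketch below illustrates that idea under this assumption; the helper name, the hypothetical generator G, and the latent code z are placeholders, not the paper's actual interface.

```python
# Minimal sketch of closed-form latent factorization: the unit directions
# that a projection weight A amplifies the most. A, G, and z are
# illustrative placeholders, not the paper's actual interface.
import torch

def closed_form_directions(A: torch.Tensor, k: int = 5) -> torch.Tensor:
    """Top-k unit directions n maximizing ||A n||, i.e. the eigenvectors
    of A^T A with the largest eigenvalues (no training or sampling needed)."""
    _, eigvecs = torch.linalg.eigh(A.T @ A)  # eigenvalues in ascending order
    return eigvecs[:, -k:].flip(-1).T        # (k, latent_dim), largest first

# Hypothetical usage with a generator G whose first fully connected layer
# projects a latent code z of shape (1, latent_dim):
# directions = closed_form_directions(G.fc.weight)
# z_edited = z + 3.0 * directions[0]  # move z along the dominant direction
```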
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.