Derivative-Free Diffusion Manifold-Constrained Gradient for Unified XAI
- URL: http://arxiv.org/abs/2411.15265v1
- Date: Fri, 22 Nov 2024 11:15:14 GMT
- Title: Derivative-Free Diffusion Manifold-Constrained Gradient for Unified XAI
- Authors: Won Jun Kim, Hyungjin Chung, Jaemin Kim, Sangmin Lee, Byeongsu Sim, Jong Chul Ye,
- Abstract summary: We introduce Derivative-Free Diffusion Manifold-Constrainted Gradients (FreeMCG)
FreeMCG serves as an improved basis for explainability of a given neural network.
We show that our method yields state-of-the-art results while preserving the essential properties expected of XAI tools.
- Score: 59.96044730204345
- License:
- Abstract: Gradient-based methods are a prototypical family of explainability techniques, especially for image-based models. Nonetheless, they have several shortcomings in that they (1) require white-box access to models, (2) are vulnerable to adversarial attacks, and (3) produce attributions that lie off the image manifold, leading to explanations that are not actually faithful to the model and do not align well with human perception. To overcome these challenges, we introduce Derivative-Free Diffusion Manifold-Constrainted Gradients (FreeMCG), a novel method that serves as an improved basis for explainability of a given neural network than the traditional gradient. Specifically, by leveraging ensemble Kalman filters and diffusion models, we derive a derivative-free approximation of the model's gradient projected onto the data manifold, requiring access only to the model's outputs. We demonstrate the effectiveness of FreeMCG by applying it to both counterfactual generation and feature attribution, which have traditionally been treated as distinct tasks. Through comprehensive evaluation on both tasks, counterfactual explanation and feature attribution, we show that our method yields state-of-the-art results while preserving the essential properties expected of XAI tools.
Related papers
- Sub-graph Based Diffusion Model for Link Prediction [43.15741675617231]
Denoising Diffusion Probabilistic Models (DDPMs) represent a contemporary class of generative models with exceptional qualities.
We build a novel generative model for link prediction using a dedicated design to decompose the likelihood estimation process via the Bayesian formula.
Our proposed method presents numerous advantages: (1) transferability across datasets without retraining, (2) promising generalization on limited training data, and (3) robustness against graph adversarial attacks.
arXiv Detail & Related papers (2024-09-13T02:23:55Z) - Derivative-Free Guidance in Continuous and Discrete Diffusion Models with Soft Value-Based Decoding [84.3224556294803]
Diffusion models excel at capturing the natural design spaces of images, molecules, DNA, RNA, and protein sequences.
We aim to optimize downstream reward functions while preserving the naturalness of these design spaces.
Our algorithm integrates soft value functions, which looks ahead to how intermediate noisy states lead to high rewards in the future.
arXiv Detail & Related papers (2024-08-15T16:47:59Z) - Latent Diffusion Counterfactual Explanations [28.574246724214962]
We introduce Latent Diffusion Counterfactual Explanations (LDCE)
LDCE harnesses the capabilities of recent class- or text-conditional foundation latent diffusion models to expedite counterfactual generation.
We show how LDCE can provide insights into model errors, enhancing our understanding of black-box model behavior.
arXiv Detail & Related papers (2023-10-10T14:42:34Z) - Eliminating Lipschitz Singularities in Diffusion Models [51.806899946775076]
We show that diffusion models frequently exhibit the infinite Lipschitz near the zero point of timesteps.
This poses a threat to the stability and accuracy of the diffusion process, which relies on integral operations.
We propose a novel approach, dubbed E-TSDM, which eliminates the Lipschitz of the diffusion model near zero.
arXiv Detail & Related papers (2023-06-20T03:05:28Z) - Reflected Diffusion Models [93.26107023470979]
We present Reflected Diffusion Models, which reverse a reflected differential equation evolving on the support of the data.
Our approach learns the score function through a generalized score matching loss and extends key components of standard diffusion models.
arXiv Detail & Related papers (2023-04-10T17:54:38Z) - Studying How to Efficiently and Effectively Guide Models with Explanations [52.498055901649025]
'Model guidance' is the idea of regularizing the models' explanations to ensure that they are "right for the right reasons"
We conduct an in-depth evaluation across various loss functions, attribution methods, models, and 'guidance depths' on the PASCAL VOC 2007 and MS COCO 2014 datasets.
Specifically, we guide the models via bounding box annotations, which are much cheaper to obtain than the commonly used segmentation masks.
arXiv Detail & Related papers (2023-03-21T15:34:50Z) - ContraFeat: Contrasting Deep Features for Semantic Discovery [102.4163768995288]
StyleGAN has shown strong potential for disentangled semantic control.
Existing semantic discovery methods on StyleGAN rely on manual selection of modified latent layers to obtain satisfactory manipulation results.
We propose a model that automates this process and achieves state-of-the-art semantic discovery performance.
arXiv Detail & Related papers (2022-12-14T15:22:13Z) - Formalising the Robustness of Counterfactual Explanations for Neural
Networks [16.39168719476438]
We introduce an abstraction framework based on interval neural networks to verify the robustness of CFXs.
We show how embedding Delta-robustness within existing methods can provide CFXs which are provably robust.
arXiv Detail & Related papers (2022-08-31T14:11:23Z) - An Invertible Graph Diffusion Neural Network for Source Localization [8.811725212252544]
This paper aims to establish a generic framework of invertible graph diffusion models for source localization on graphs.
Specifically, we propose a graph residual scenario to make existing graph diffusion models invertible with theoretical guarantees.
We also develop a novel error compensation mechanism that learns to offset the errors of the inferred sources.
arXiv Detail & Related papers (2022-06-18T14:35:27Z) - GLIME: A new graphical methodology for interpretable model-agnostic
explanations [0.0]
This paper contributes to the development of a novel graphical explainability tool for black box models.
The proposed XAI methodology, termed as gLIME, provides graphical model-agnostic explanations either at the global (for the entire dataset) or the local scale (for specific data points)
arXiv Detail & Related papers (2021-07-21T08:06:40Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.