Related papers: IG2: Integrated Gradient on Iterative Gradient Path for Feature Attribution

URL: http://arxiv.org/abs/2406.10852v1
Date: Sun, 16 Jun 2024 08:48:03 GMT
Title: IG2: Integrated Gradient on Iterative Gradient Path for Feature Attribution
Authors: Yue Zhuo, Zhiqiang Ge
Abstract summary: Integrated Gradients (IG) is a prominent path attribution method for deep neural networks. Iterative Gradient path Integrated Gradients (IG2) incorporates the counterfactual gradient iteratively into the integration path, generating a novel path (GradPath) and a novel baseline (GradCF). Experimental results on XAI benchmark, ImageNet, MNIST, TREC question answering, wafer-map failure patterns, and CelebA face attributes validate that IG2 delivers superior feature attributions.
Score: 6.278326325782819
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Feature attribution explains Artificial Intelligence (AI) at the instance level by providing importance scores of input features' contributions to model prediction. Integrated Gradients (IG) is a prominent path attribution method for deep neural networks, involving the integration of gradients along a path from the explained input (explicand) to a counterfactual instance (baseline). Current IG variants primarily focus on the gradient of explicand's output. However, our research indicates that the gradient of the counterfactual output significantly affects feature attribution as well. To achieve this, we propose Iterative Gradient path Integrated Gradients (IG2), considering both gradients. IG2 incorporates the counterfactual gradient iteratively into the integration path, generating a novel path (GradPath) and a novel baseline (GradCF). These two novel IG components effectively address the issues of attribution noise and arbitrary baseline choice in earlier IG methods. IG2, as a path method, satisfies many desirable axioms, which are theoretically justified in the paper. Experimental results on XAI benchmark, ImageNet, MNIST, TREC questions answering, wafer-map failure patterns, and CelebA face attributes validate that IG2 delivers superior feature attributions compared to the state-of-the-art techniques. The code is released at: https://github.com/JoeZhuo-ZY/IG2.
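The path integration described in the abstract can be illustrated with a minimal sketch of standard Integrated Gradients on a straight-line path. This is not the IG2 method (it builds neither GradPath nor GradCF); the toy logistic model, its weights W and B, the all-zeros baseline, and the function names are assumptions made purely for illustration.

```python
import numpy as np

# Hypothetical toy model: a single logistic unit with fixed weights.
W = np.array([2.0, -1.0, 0.5])
B = 0.1

def model(x):
    """Scalar output of the toy model for one input vector."""
    return 1.0 / (1.0 + np.exp(-(x @ W + B)))

def model_grad(x):
    """Analytic gradient of the toy model output with respect to x."""
    p = model(x)
    return p * (1.0 - p) * W

def integrated_gradients(x, baseline, steps=200):
    """Straight-line IG: average the gradients along the path from the
    baseline to the explicand, then scale by (x - baseline)."""
    alphas = (np.arange(steps) + 0.5) / steps          # midpoint rule
    path = baseline + alphas[:, None] * (x - baseline)  # (steps, n_features)
    grads = np.stack([model_grad(p) for p in path])
    return (x - baseline) * grads.mean(axis=0)

x = np.array([1.0, 0.5, -0.3])        # explicand
baseline = np.zeros_like(x)           # assumed baseline (all zeros)
attr = integrated_gradients(x, baseline)

# Completeness axiom: attributions sum to the change in model output.
gap = abs(attr.sum() - (model(x) - model(baseline)))
print(attr, gap)
```

The final check reflects the completeness axiom that path methods satisfy: the attributions sum, up to numerical integration error, to the difference between the model output at the explicand and at the baseline.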

Related papers

GradMetaNet: An Equivariant Architecture for Learning on Gradients [18.350495600116712]
We introduce GradMetaNet, a novel architecture for learning on gradients. We prove results for GradMetaNet, and show that previous approaches cannot approximate natural gradient-based functions. We then demonstrate GradMetaNet's effectiveness on a diverse set of gradient-based tasks.
arXiv Detail & Related papers (2025-07-02T12:22:39Z)
Pave Your Own Path: Graph Gradual Domain Adaptation on Fused Gromov-Wasserstein Geodesics [59.07903030446756]
Graph neural networks are highly vulnerable to distribution shifts on graphs. We present Gadget, the first framework for non-IID graph data. Gadget can be seamlessly integrated with existing graph DA methods to handle large shifts on graphs.
arXiv Detail & Related papers (2025-05-19T05:03:58Z)
Using the Path of Least Resistance to Explain Deep Networks [5.614094161229764]
Integrated Gradients (IG) is a widely used axiomatic path-based attribution method. We show that straight paths can lead to flawed attributions. We propose Geodesic Integrated Gradients (GIG) as an alternative.
arXiv Detail & Related papers (2025-02-17T18:29:24Z)
GSINA: Improving Subgraph Extraction for Graph Invariant Learning via Graph Sinkhorn Attention [52.67633391931959]
Graph invariant learning (GIL) has been an effective approach to discovering the invariant relationships between graph data and its labels. We propose a novel graph attention mechanism called Graph Sinkhorn Attention (GSINA). GSINA is able to obtain meaningful, differentiable invariant subgraphs with controllable sparsity and softness.
arXiv Detail & Related papers (2024-02-11T12:57:16Z)
Neural Gradient Learning and Optimization for Oriented Point Normal Estimation [53.611206368815125]
We propose a deep learning approach to learn gradient vectors with consistent orientation from 3D point clouds for normal estimation. We learn an angular distance field based on local plane geometry to refine the coarse gradient vectors. Our method efficiently conducts global gradient approximation while achieving better accuracy and generalization ability of local feature description.
arXiv Detail & Related papers (2023-09-17T08:35:11Z)
Integrated Decision Gradients: Compute Your Attributions Where the Model Makes Its Decision [9.385886214196479]
We propose an attribution algorithm called integrated decision gradients (IDG). IDG focuses on integrating gradients from the region of the path where the model makes its decision, i.e., the portion of the path where the output logit rapidly transitions from zero to its final value. We minimize the errors within the sum approximation of the path integral by utilizing non-uniform subdivisions determined by adaptive sampling.
arXiv Detail & Related papers (2023-05-31T17:25:12Z)
Unifying gradient regularization for Heterogeneous Graph Neural Networks [6.3093033645568015]
We propose a novel gradient regularization method called Grug, which iteratively applies regularization to the gradients generated by both propagated messages and the node features during the message-passing process. Grug provides a unified framework integrating graph topology and node features, based on which we conduct a detailed theoretical analysis of their effectiveness.
arXiv Detail & Related papers (2023-05-25T07:47:42Z)
Gradient Gating for Deep Multi-Rate Learning on Graphs [62.25886489571097]
We present Gradient Gating (G$^2$), a novel framework for improving the performance of Graph Neural Networks (GNNs). Our framework is based on gating the output of GNN layers with a mechanism for multi-rate flow of message passing information across nodes of the underlying graph.
arXiv Detail & Related papers (2022-10-02T13:19:48Z)
Gradient Correction beyond Gradient Descent [63.33439072360198]
Gradient correction is apparently the most crucial aspect for the training of a neural network. We introduce a framework (GCGD) to perform gradient correction. Experiment results show that our gradient correction framework can effectively improve the gradient quality to reduce training epochs by $\sim$20% and also improve the network performance.
arXiv Detail & Related papers (2022-03-16T01:42:25Z)
TSG: Target-Selective Gradient Backprop for Probing CNN Visual Saliency [72.9106103283475]
We study the visual saliency, a.k.a. visual explanation, to interpret convolutional neural networks. Inspired by those observations, we propose a novel visual saliency framework, termed Target-Selective Gradient (TSG) backprop. The proposed TSG consists of two components, namely, TSG-Conv and TSG-FC, which rectify the gradients for convolutional layers and fully-connected layers, respectively.
arXiv Detail & Related papers (2021-10-11T12:00:20Z)
Discretized Integrated Gradients for Explaining Language Models [43.2877233809206]
Integrated Gradients (IG) is a prominent attribution-based explanation algorithm. We propose Discretized Integrated Gradients (DIG) which allows effective attribution along non-linear paths.
arXiv Detail & Related papers (2021-08-31T07:36:34Z)
Guided Integrated Gradients: An Adaptive Path Method for Removing Noise [9.792727625917083]
Integrated Gradients (IG) is a commonly used feature attribution method for deep neural networks. We show that one of the causes of the problem is the accumulation of noise along the IG path. We propose adapting the attribution path itself -- conditioning the path not just on the image but also on the model being explained.
arXiv Detail & Related papers (2021-06-17T20:00:55Z)
Towards Better Understanding of Adaptive Gradient Algorithms in Generative Adversarial Nets [71.05306664267832]
Adaptive algorithms perform gradient updates using the history of gradients and are ubiquitous in training deep neural networks. In this paper we analyze a variant of OptimisticOA algorithm for nonconcave minmax problems. Our experiments show that the advantage of adaptive over non-adaptive gradient algorithms in GAN training can be observed empirically.
arXiv Detail & Related papers (2019-12-26T22:10:10Z)

This list is automatically generated from the titles and abstracts of the papers on this site.