Weighted Integrated Gradients for Feature Attribution
- URL: http://arxiv.org/abs/2505.03201v2
- Date: Sat, 31 May 2025 14:07:09 GMT
- Title: Weighted Integrated Gradients for Feature Attribution
- Authors: Kien Tran Duc Tuan, Tam Nguyen Trong, Son Nguyen Hoang, Khoat Than, Anh Nguyen Duc
- Abstract summary: In explainable AI, Integrated Gradients (IG) is a widely adopted technique for assessing the significance of feature attributes of the input on model outputs. This study argues that baselines should not be treated equivalently. We introduce Weighted Integrated Gradients (WG), a novel approach that unsupervisedly evaluates baseline suitability and incorporates a strategy for selecting effective baselines.
- Score: 2.3226745625632947
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In explainable AI, Integrated Gradients (IG) is a widely adopted technique for assessing the significance of feature attributes of the input on model outputs by evaluating contributions from a baseline input to the current input. The choice of the baseline input significantly influences the resulting explanation. While the traditional Expected Gradients (EG) method assumes baselines can be uniformly sampled and averaged with equal weights, this study argues that baselines should not be treated equivalently. We introduce Weighted Integrated Gradients (WG), a novel approach that unsupervisedly evaluates baseline suitability and incorporates a strategy for selecting effective baselines. Theoretical analysis demonstrates that WG satisfies essential explanation method criteria and offers greater stability than prior approaches. Experimental results further confirm that WG outperforms EG across diverse scenarios, achieving an improvement of 10-35\% on main metrics. Moreover, by evaluating baselines, our method can filter a subset of effective baselines for each input to calculate explanations, maintaining high accuracy while reducing computational cost. The code is available at: https://github.com/tamnt240904/weighted_ig.
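The abstract describes two ingredients: the standard IG path integral from a baseline to the input, and WG's idea of combining attributions from multiple baselines with non-uniform weights rather than EG's equal-weight average. A minimal NumPy sketch of both, assuming a caller-supplied gradient oracle `grad_fn` (a hypothetical name); the suitability weights here are passed in by the caller, since the paper's unsupervised baseline-scoring procedure is not reproduced:

```python
import numpy as np

def integrated_gradients(grad_fn, x, baseline, steps=50):
    """Standard IG: average the gradients along the straight-line path
    from `baseline` to `x`, then scale by the input difference."""
    alphas = np.linspace(0.0, 1.0, steps)
    path = baseline + alphas[:, None] * (x - baseline)   # (steps, d)
    grads = np.stack([grad_fn(p) for p in path])         # (steps, d)
    return (x - baseline) * grads.mean(axis=0)

def weighted_ig(grad_fn, x, baselines, weights, steps=50):
    """Weighted aggregation over multiple baselines (EG would use
    uniform weights). The weights stand in for the paper's learned
    suitability scores and are purely illustrative."""
    weights = np.asarray(weights, dtype=float)
    weights = weights / weights.sum()                    # normalize
    attrs = np.stack([integrated_gradients(grad_fn, x, b, steps)
                      for b in baselines])               # (k, d)
    return (weights[:, None] * attrs).sum(axis=0)
```

For a linear model F(x) = w·x the gradient is constant, so IG from a zero baseline recovers w ⊙ x exactly and the attributions sum to F(x) − F(baseline), illustrating the completeness axiom the paper's theoretical analysis builds on.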
Related papers
- Distributional encoding for Gaussian process regression with qualitative inputs [0.7342677574855652]
We show that a generalization based on distributional encoding (DE) makes use of all samples of the target variable for a category. Our approach is validated empirically, and we demonstrate state-of-the-art predictive performance on a variety of synthetic and real-world datasets.
arXiv Detail & Related papers (2025-06-05T09:35:02Z) - Outcome-Based Online Reinforcement Learning: Algorithms and Fundamental Limits [58.63897489864948]
Reinforcement learning with outcome-based feedback faces a fundamental challenge: how do we assign credit to the right actions? This paper provides the first comprehensive analysis of this problem in online RL with general function approximation.
arXiv Detail & Related papers (2025-05-26T17:44:08Z) - AIR: A Systematic Analysis of Annotations, Instructions, and Response Pairs in Preference Dataset [95.45316956434608]
Preference learning is critical for aligning large language models with human values. Our work shifts preference dataset design from ad hoc scaling to component-aware optimization.
arXiv Detail & Related papers (2025-04-04T17:33:07Z) - Shapley-Guided Utility Learning for Effective Graph Inference Data Valuation [6.542796128290513]
We propose Shapley-Guided Utility Learning (SGUL), a novel framework for graph inference data valuation. SGUL combines transferable data-specific and model-specific features to approximate test accuracy without relying on ground truth labels. We show that SGUL consistently outperforms existing baselines in both inductive and transductive settings.
arXiv Detail & Related papers (2025-03-23T20:35:03Z) - Unlearning-based Neural Interpretations [51.99182464831169]
We show that current baselines defined using static functions are biased, fragile and manipulable. We propose UNI to compute an (un)learnable, debiased and adaptive baseline by perturbing the input towards an unlearning direction of steepest ascent.
arXiv Detail & Related papers (2024-10-10T16:02:39Z) - Optimal Baseline Corrections for Off-Policy Contextual Bandits [61.740094604552475]
We aim to learn decision policies that optimize an unbiased offline estimate of an online reward metric.
We propose a single framework built on their equivalence in learning scenarios.
Our framework enables us to characterize the variance-optimal unbiased estimator and provide a closed-form solution for it.
arXiv Detail & Related papers (2024-05-09T12:52:22Z) - Backdoor-based Explainable AI Benchmark for High Fidelity Evaluation of Attribution Methods [49.62131719441252]
Attribution methods compute importance scores for input features to explain the output predictions of deep models.
In this work, we first identify a set of fidelity criteria that reliable benchmarks for attribution methods are expected to fulfill.
We then introduce a Backdoor-based eXplainable AI benchmark (BackX) that adheres to the desired fidelity criteria.
arXiv Detail & Related papers (2024-05-02T13:48:37Z) - A New Baseline Assumption of Integrated Gradients Based on Shapley value [31.2051113305947]
Integrated Gradients (IG) is a technique for mapping predictions back to the input features of deep neural networks (DNNs)
We argue that the standard approach of utilizing a single baseline is frequently inadequate, prompting the need for multiple baselines.
We develop a new baseline method called Shapley Integrated Gradients (SIG), which uses proportional sampling to mirror the Shapley Value process.
arXiv Detail & Related papers (2023-10-07T14:19:07Z) - Generalizing Backpropagation for Gradient-Based Interpretability [103.2998254573497]
We show that the gradient of a model is a special case of a more general formulation using semirings.
This observation allows us to generalize the backpropagation algorithm to efficiently compute other interpretable statistics.
arXiv Detail & Related papers (2023-07-06T15:19:53Z) - Scalable Bayesian Meta-Learning through Generalized Implicit Gradients [64.21628447579772]
Implicit Bayesian meta-learning (iBaML) method broadens the scope of learnable priors, but also quantifies the associated uncertainty.
Analytical error bounds are established to demonstrate the precision and efficiency of the generalized implicit gradient over the explicit one.
arXiv Detail & Related papers (2023-03-31T02:10:30Z) - Variance Reduction for Score Functions Using Optimal Baselines [0.0]
This paper studies baselines, a variance reduction technique for score functions.
Motivated primarily by reinforcement learning, we derive for the first time an expression for the optimal state-dependent baseline.
arXiv Detail & Related papers (2022-12-27T19:17:28Z) - Maximum Entropy Baseline for Integrated Gradients [0.0]
Integrated Gradients (IG) is one of the most popular explainability methods available.
This study proposes a new uniform baseline, i.e., the Maximum Entropy Baseline.
arXiv Detail & Related papers (2022-04-12T17:04:42Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.