Related papers: Distribution-Based Feature Attribution for Explaining the Predictions of Any Classifier

URL: http://arxiv.org/abs/2511.09332v1
Date: Thu, 13 Nov 2025 01:47:00 GMT
Title: Distribution-Based Feature Attribution for Explaining the Predictions of Any Classifier
Authors: Xinpeng Li, Kai Ming Ting
Abstract summary: This paper introduces a formal definition for the problem of feature attribution, which stipulates explanations be supported by an underlying probability distribution. We propose Distributional Feature Attribution eXplanations (DFAX), a novel, model-agnostic method for feature attribution. We show through extensive experiments that DFAX is more effective and efficient than state-of-the-art baselines.
Score: 6.452573834050412
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: The proliferation of complex, black-box AI models has intensified the need for techniques that can explain their decisions. Feature attribution methods have become a popular solution for providing post-hoc explanations, yet the field has historically lacked a formal problem definition. This paper addresses this gap by introducing a formal definition for the problem of feature attribution, which stipulates that explanations be supported by an underlying probability distribution represented by the given dataset. Our analysis reveals that many existing model-agnostic methods fail to meet this criterion, while even those that do often possess other limitations. To overcome these challenges, we propose Distributional Feature Attribution eXplanations (DFAX), a novel, model-agnostic method for feature attribution. DFAX is the first feature attribution method to explain classifier predictions directly based on the data distribution. We show through extensive experiments that DFAX is more effective and efficient than state-of-the-art baselines.

Related papers

Partial Transportability for Domain Generalization [56.37032680901525]
Building on the theory of partial identification and transportability, this paper introduces new results for bounding the value of a functional of the target distribution. Our contribution is to provide the first general estimation technique for transportability problems. We propose a gradient-based optimization scheme for making scalable inferences in practice.
arXiv Detail & Related papers (2025-03-30T22:06:37Z)
Out-of-Distribution Detection on Graphs: A Survey [58.47395497985277]
Graph out-of-distribution (GOOD) detection focuses on identifying graph data that deviates from the distribution seen during training. We categorize existing methods into four types: enhancement-based, reconstruction-based, information propagation-based, and classification-based approaches. We discuss practical applications and theoretical foundations, highlighting the unique challenges posed by graph data.
arXiv Detail & Related papers (2025-02-12T04:07:12Z)
Influence Functions for Scalable Data Attribution in Diffusion Models [52.92223039302037]
Diffusion models have led to significant advancements in generative modelling. Yet their widespread adoption poses challenges regarding data attribution and interpretability. We develop an influence functions framework to address these challenges.
arXiv Detail & Related papers (2024-10-17T17:59:02Z)
Rejection via Learning Density Ratios [50.91522897152437]
Classification with rejection emerges as a learning paradigm which allows models to abstain from making predictions. We propose a different distributional perspective, where we seek to find an idealized data distribution which maximizes a pretrained model's performance. Our framework is tested empirically over clean and noisy datasets.
arXiv Detail & Related papers (2024-05-29T01:32:17Z)
It's All in the Mix: Wasserstein Classification and Regression with Mixed Features [2.2685251390114565]
We develop and analyze distributionally robust prediction models that faithfully account for the presence of discrete features. We demonstrate that our models can significantly outperform existing methods that are agnostic to the presence of discrete features.
arXiv Detail & Related papers (2023-12-19T15:15:52Z)
Debiasing Multimodal Models via Causal Information Minimization [65.23982806840182]
We study bias arising from confounders in a causal graph for multimodal data. Robust predictive features contain diverse information that helps a model generalize to out-of-distribution data. We use these features as confounder representations and use them via methods motivated by causal theory to remove bias from models.
arXiv Detail & Related papers (2023-11-28T16:46:14Z)
On Formal Feature Attribution and Its Approximation [37.3078859524959]
This paper proposes a way to apply the apparatus of formal XAI to the case of feature attribution based on formal explanation enumeration. Given the practical complexity of the problem, the paper then proposes an efficient technique for approximating exact FFA.
arXiv Detail & Related papers (2023-07-07T04:20:36Z)
Bounding Counterfactuals under Selection Bias [60.55840896782637]
We propose a first algorithm to address both identifiable and unidentifiable queries. We prove that, in spite of the missingness induced by the selection bias, the likelihood of the available data is unimodal.
arXiv Detail & Related papers (2022-07-26T10:33:10Z)
Interpretable Data-Based Explanations for Fairness Debugging [7.266116143672294]
Gopher is a system that produces compact, interpretable, and causal explanations for bias or unexpected model behavior. We introduce the concept of causal responsibility that quantifies the extent to which intervening on training data by removing or updating subsets of it can resolve the bias. Building on this concept, we develop an efficient approach for generating the top-k patterns that explain model bias.
arXiv Detail & Related papers (2021-12-17T20:10:00Z)
Towards Unifying Feature Attribution and Counterfactual Explanations: Different Means to the Same End [17.226134854746267]
We present a method to generate feature attribution explanations from a set of counterfactual examples. We show how counterfactual examples can be used to evaluate the goodness of an attribution-based explanation in terms of its necessity and sufficiency.
arXiv Detail & Related papers (2020-11-10T05:41:43Z)

This list is automatically generated from the titles and abstracts of the papers in this site.