Conditional Feature Importance with Generative Modeling Using Adversarial Random Forests
- URL: http://arxiv.org/abs/2501.11178v1
- Date: Sun, 19 Jan 2025 21:34:54 GMT
- Title: Conditional Feature Importance with Generative Modeling Using Adversarial Random Forests
- Authors: Kristin Blesch, Niklas Koenen, Jan Kapar, Pegah Golchian, Lukas Burk, Markus Loecher, Marvin N. Wright,
- Abstract summary: In explainable artificial intelligence (XAI), conditional feature importance assesses the impact of a feature on a prediction model's performance.
Recent advancements in generative modeling can facilitate measuring conditional feature importance.
This paper proposes cARFi, a method for measuring conditional feature importance through feature values sampled from ARF-estimated conditional distributions.
- Score: 1.0208529247755187
- License:
- Abstract: This paper proposes a method for measuring conditional feature importance via generative modeling. In explainable artificial intelligence (XAI), conditional feature importance assesses the impact of a feature on a prediction model's performance given the information of other features. Model-agnostic post hoc methods to do so typically evaluate changes in the predictive performance under on-manifold feature value manipulations. Such procedures require creating feature values that respect conditional feature distributions, which can be challenging in practice. Recent advancements in generative modeling can facilitate this. For tabular data, which may consist of both categorical and continuous features, the adversarial random forest (ARF) stands out as a generative model that can generate on-manifold data points without requiring intensive tuning efforts or computational resources, making it a promising candidate model for subroutines in XAI methods. This paper proposes cARFi (conditional ARF feature importance), a method for measuring conditional feature importance through feature values sampled from ARF-estimated conditional distributions. cARFi requires only little tuning to yield robust importance scores that can flexibly adapt for conditional or marginal notions of feature importance, including straightforward extensions to condition on feature subsets and allows for inferring the significance of feature importances through statistical tests.
Related papers
- Influence Functions for Scalable Data Attribution in Diffusion Models [52.92223039302037]
Diffusion models have led to significant advancements in generative modelling.
Yet their widespread adoption poses challenges regarding data attribution and interpretability.
We develop an influence functions framework to address these challenges.
arXiv Detail & Related papers (2024-10-17T17:59:02Z) - Evaluation of Active Feature Acquisition Methods for Static Feature
Settings [6.645033437894859]
We introduce a semi-offline reinforcement learning framework for active feature acquisition performance evaluation (AFAPE)
Here, we study and extend the AFAPE problem to cover static feature settings, where features are time-invariant.
We derive and adapt new inverse probability weighting (IPW), direct method (DM), and double reinforcement learning (DRL) estimators within the semi-offline RL framework.
arXiv Detail & Related papers (2023-12-06T17:07:42Z) - Context-aware feature attribution through argumentation [0.0]
We define a novel feature attribution framework called Context-Aware Feature Attribution Through Argumentation (CA-FATA)
Our framework harnesses the power of argumentation by treating each feature as an argument that can either support, attack or neutralize a prediction.
arXiv Detail & Related papers (2023-10-24T20:02:02Z) - LaPLACE: Probabilistic Local Model-Agnostic Causal Explanations [1.0370398945228227]
We introduce LaPLACE-explainer, designed to provide probabilistic cause-and-effect explanations for machine learning models.
The LaPLACE-Explainer component leverages the concept of a Markov blanket to establish statistical boundaries between relevant and non-relevant features.
Our approach offers causal explanations and outperforms LIME and SHAP in terms of local accuracy and consistency of explained features.
arXiv Detail & Related papers (2023-10-01T04:09:59Z) - ChiroDiff: Modelling chirographic data with Diffusion Models [132.5223191478268]
We introduce a powerful model-class namely "Denoising Diffusion Probabilistic Models" or DDPMs for chirographic data.
Our model named "ChiroDiff", being non-autoregressive, learns to capture holistic concepts and therefore remains resilient to higher temporal sampling rate.
arXiv Detail & Related papers (2023-04-07T15:17:48Z) - Learning summary features of time series for likelihood free inference [93.08098361687722]
We present a data-driven strategy for automatically learning summary features from time series data.
Our results indicate that learning summary features from data can compete and even outperform LFI methods based on hand-crafted values.
arXiv Detail & Related papers (2020-12-04T19:21:37Z) - Autoregressive Score Matching [113.4502004812927]
We propose autoregressive conditional score models (AR-CSM) where we parameterize the joint distribution in terms of the derivatives of univariable log-conditionals (scores)
For AR-CSM models, this divergence between data and model distributions can be computed and optimized efficiently, requiring no expensive sampling or adversarial training.
We show with extensive experimental results that it can be applied to density estimation on synthetic data, image generation, image denoising, and training latent variable models with implicit encoders.
arXiv Detail & Related papers (2020-10-24T07:01:24Z) - Goal-directed Generation of Discrete Structures with Conditional
Generative Models [85.51463588099556]
We introduce a novel approach to directly optimize a reinforcement learning objective, maximizing an expected reward.
We test our methodology on two tasks: generating molecules with user-defined properties and identifying short python expressions which evaluate to a given target value.
arXiv Detail & Related papers (2020-10-05T20:03:13Z) - Controlling for sparsity in sparse factor analysis models: adaptive
latent feature sharing for piecewise linear dimensionality reduction [2.896192909215469]
We propose a simple and tractable parametric feature allocation model which can address key limitations of current latent feature decomposition techniques.
We derive a novel adaptive Factor analysis (aFA), as well as, an adaptive probabilistic principle component analysis (aPPCA) capable of flexible structure discovery and dimensionality reduction.
We show that aPPCA and aFA can infer interpretable high level features both when applied on raw MNIST and when applied for interpreting autoencoder features.
arXiv Detail & Related papers (2020-06-22T16:09:11Z) - Causal Feature Selection for Algorithmic Fairness [61.767399505764736]
We consider fairness in the integration component of data management.
We propose an approach to identify a sub-collection of features that ensure the fairness of the dataset.
arXiv Detail & Related papers (2020-06-10T20:20:10Z) - Regularized Autoencoders via Relaxed Injective Probability Flow [35.39933775720789]
Invertible flow-based generative models are an effective method for learning to generate samples, while allowing for tractable likelihood computation and inference.
We propose a generative model based on probability flows that does away with the bijectivity requirement on the model and only assumes injectivity.
arXiv Detail & Related papers (2020-02-20T18:22:46Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.