Relational Local Explanations
- URL: http://arxiv.org/abs/2212.12374v1
- Date: Fri, 23 Dec 2022 14:46:23 GMT
- Title: Relational Local Explanations
- Authors: Vadim Borisov and Gjergji Kasneci
- Abstract summary: We develop a novel model-agnostic and permutation-based feature attribution algorithm based on relational analysis between input variables.
We gain broader insight into machine learning model decisions and the underlying data.
- Score: 11.679389861042
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The majority of existing post-hoc explanation approaches for machine learning
models produce independent per-variable feature attribution scores, ignoring a
critical characteristic: the inter-variable relationships between
features that naturally occur in visual and textual data. In response, we
develop a novel model-agnostic and permutation-based feature attribution
algorithm based on relational analysis between input variables. As a
result, we gain broader insight into machine learning model
decisions and the underlying data. This type of local explanation measures the effects of
interrelationships between local features, which provides another critical
aspect of explanations. Experimental evaluations of our framework using setups
involving both image and text data modalities demonstrate its effectiveness and
validity.
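The listing does not include code, but the general idea of permutation-based attribution extended from single variables to pairs of variables can be illustrated with a minimal, hypothetical sketch. The function name, the tabular NumPy input, and the scikit-learn-style `model.predict` / `metric` interface below are assumptions for illustration only; this is not the authors' algorithm.

```python
import itertools
import numpy as np

def pairwise_permutation_attribution(model, X, y, metric, n_repeats=10, rng=None):
    """Hypothetical sketch: score pairs of input variables by the performance
    drop when both are permuted together, minus the drops from permuting each
    one alone. A positive residual suggests the pair carries relational
    (joint) information beyond what the individual features carry."""
    rng = np.random.default_rng(rng)
    base = metric(y, model.predict(X))

    def drop_when_permuted(cols):
        drops = []
        for _ in range(n_repeats):
            Xp = X.copy()
            perm = rng.permutation(len(X))
            # Permute the selected columns jointly, breaking their link to the target.
            Xp[:, cols] = X[perm][:, cols]
            drops.append(base - metric(y, model.predict(Xp)))
        return float(np.mean(drops))

    single = {j: drop_when_permuted([j]) for j in range(X.shape[1])}
    relational = {}
    for i, j in itertools.combinations(range(X.shape[1]), 2):
        joint = drop_when_permuted([i, j])
        relational[(i, j)] = joint - single[i] - single[j]
    return single, relational
```

With, say, a fitted classifier and accuracy as the metric, pairs with a clearly positive relational score are those whose joint permutation hurts the model more than the per-feature drops alone would suggest, which is one simple way to surface inter-variable effects in a model-agnostic, permutation-based manner.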
Related papers
- Causal Feature Selection via Transfer Entropy [59.999594949050596]
Causal discovery aims to identify causal relationships between features with observational data.
We introduce a new causal feature selection approach that relies on forward and backward feature selection procedures.
We provide theoretical guarantees on the regression and classification errors for both the exact and the finite-sample cases.
arXiv Detail & Related papers (2023-10-17T08:04:45Z) - On the Joint Interaction of Models, Data, and Features [82.60073661644435]
We introduce a new tool, the interaction tensor, for empirically analyzing the interaction between data and model through features.
Based on these observations, we propose a conceptual framework for feature learning.
Under this framework, the expected accuracy for a single hypothesis and agreement for a pair of hypotheses can both be derived in closed-form.
arXiv Detail & Related papers (2023-06-07T21:35:26Z) - A Mechanistic Interpretation of Arithmetic Reasoning in Language Models
using Causal Mediation Analysis [128.0532113800092]
We present a mechanistic interpretation of Transformer-based LMs on arithmetic questions.
This provides insights into how information related to arithmetic is processed by LMs.
arXiv Detail & Related papers (2023-05-24T11:43:47Z) - Interpretability with full complexity by constraining feature
information [1.52292571922932]
Interpretability is a pressing issue for machine learning.
We approach interpretability from a new angle: constrain the information about the features without restricting the complexity of the model.
We develop a framework for extracting insight from the spectrum of approximate models.
arXiv Detail & Related papers (2022-11-30T18:59:01Z) - A Functional Information Perspective on Model Interpretation [30.101107406343665]
This work suggests a theoretical framework for model interpretability.
We rely on the log-Sobolev inequality that bounds the functional entropy by the functional Fisher information.
We show that our method surpasses existing sampling-based interpretability methods on various data signals.
arXiv Detail & Related papers (2022-06-12T09:24:45Z) - SOInter: A Novel Deep Energy Based Interpretation Method for Explaining
Structured Output Models [6.752231769293388]
We propose a novel interpretation technique to explain the behavior of structured output models.
We focus on one output as the target and identify the most important features the structured model uses to decide on that target in each locality of the input space.
arXiv Detail & Related papers (2022-02-20T21:57:07Z) - REPID: Regional Effect Plots with implicit Interaction Detection [0.9023847175654603]
Interpretable machine learning methods visualize marginal feature effects but may lead to misleading interpretations when feature interactions are present.
We introduce implicit interaction detection, a novel framework to detect interactions between a feature of interest and other features.
The framework also quantifies the strength of interactions and provides interpretable and distinct regions in which feature effects can be interpreted more reliably.
arXiv Detail & Related papers (2022-02-15T08:54:00Z) - Transforming Feature Space to Interpret Machine Learning Models [91.62936410696409]
This contribution proposes a novel approach that interprets machine-learning models through the lens of feature space transformations.
It can be used to enhance unconditional as well as conditional post-hoc diagnostic tools.
A case study on remote-sensing landcover classification with 46 features is used to demonstrate the potential of the proposed approach.
arXiv Detail & Related papers (2021-04-09T10:48:11Z) - Triplot: model agnostic measures and visualisations for variable
importance in predictive models that take into account the hierarchical
correlation structure [3.0036519884678894]
We propose new methods to support model analysis by exploiting the information about the correlation between variables.
We show how to analyze groups of variables (aspects) both when they are proposed by the user and when they should be determined automatically.
We also present a new type of model visualisation, triplot, which exploits the hierarchical structure of variable grouping to produce a high-information-density model visualisation.
arXiv Detail & Related papers (2021-04-07T21:29:03Z) - Interpretable Multi-dataset Evaluation for Named Entity Recognition [110.64368106131062]
We present a general methodology for interpretable evaluation for the named entity recognition (NER) task.
The proposed evaluation method enables us to interpret the differences in models and datasets, as well as the interplay between them.
By making our analysis tool available, we make it easy for future researchers to run similar analyses and drive progress in this area.
arXiv Detail & Related papers (2020-11-13T10:53:27Z) - Out-of-distribution Generalization via Partial Feature Decorrelation [72.96261704851683]
We present a novel Partial Feature Decorrelation Learning (PFDL) algorithm, which jointly optimizes a feature decomposition network and the target image classification model.
Experiments on real-world datasets demonstrate that our method can improve the backbone model's accuracy on OOD image classification datasets.
arXiv Detail & Related papers (2020-07-30T05:48:48Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.