Comparing Explanation Methods for Traditional Machine Learning Models
Part 2: Quantifying Model Explainability Faithfulness and Improvements with
Dimensionality Reduction
- URL: http://arxiv.org/abs/2211.10378v1
- Date: Fri, 18 Nov 2022 17:15:59 GMT
- Title: Comparing Explanation Methods for Traditional Machine Learning Models
Part 2: Quantifying Model Explainability Faithfulness and Improvements with
Dimensionality Reduction
- Authors: Montgomery Flora, Corey Potvin, Amy McGovern, Shawn Handler
- Abstract summary: "faithfulness" or "fidelity" refers to the correspondence between the assigned feature importance and the contribution of the feature to model performance.
This study is one of the first to quantify the improvement in explainability from limiting correlated features and knowing the relative fidelity of different explainability methods.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Machine learning (ML) models are becoming increasingly common in the
atmospheric science community with a wide range of applications. To enable
users to understand what an ML model has learned, ML explainability has become
a field of active research. In Part I of this two-part study, we described
several explainability methods and demonstrated that feature rankings from
different methods can substantially disagree with each other. It is unclear,
though, whether the disagreement is overinflated due to some methods being less
faithful in assigning importance. Herein, "faithfulness" or "fidelity" refers to
the correspondence between the assigned feature importance and the contribution
of the feature to model performance. In the present study, we evaluate the
faithfulness of feature ranking methods using multiple evaluation approaches. Given the
sensitivity of explanation methods to feature correlations, we also quantify
how much explainability faithfulness improves after correlated features are
limited. Before dimensionality reduction, the feature relevance methods [e.g.,
SHAP, LIME, ALE variance, and logistic regression (LR) coefficients] were
generally more faithful than the permutation importance methods due to the
negative impact of correlated features. Once correlated features were reduced,
traditional permutation importance became the most faithful method. In
addition, the ranking uncertainty (i.e., the spread in rank assigned to a
feature by the different ranking methods) was reduced by a factor of 2-10, and
excluding less faithful feature ranking methods reduced it further. This study
is one of the first to quantify the improvement in explainability from limiting
correlated features and knowing the relative fidelity of different
explainability methods.
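To make the comparison concrete, the following minimal sketch (not the authors' scikit-explain pipeline) contrasts two of the ranking methods named above, logistic regression coefficients and traditional permutation importance, on synthetic data with redundant features. It applies a simple correlation filter before re-ranking and reports the per-feature rank spread. The dataset, the 0.9 correlation threshold, and the helper names drop_correlated and rankings are illustrative assumptions rather than details from the paper.

```python
# Minimal sketch, not the authors' pipeline: compare feature rankings from
# logistic regression coefficients and traditional permutation importance,
# before and after removing highly correlated features, and report the
# per-feature rank spread. Data, model, and the 0.9 threshold are illustrative.
import numpy as np
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.inspection import permutation_importance
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler


def drop_correlated(X, threshold=0.9):
    """Drop any feature whose |Pearson r| with an earlier feature exceeds threshold."""
    corr = X.corr().abs()
    upper = corr.where(np.triu(np.ones(corr.shape, dtype=bool), k=1))
    to_drop = [col for col in upper.columns if (upper[col] > threshold).any()]
    return X.drop(columns=to_drop)


def rankings(X, y):
    """Rank features (1 = most important) by |LR coefficient| and permutation importance."""
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)
    model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
    model.fit(X_tr, y_tr)

    coef_score = np.abs(model.named_steps["logisticregression"].coef_[0])
    perm_score = permutation_importance(
        model, X_te, y_te, n_repeats=30, random_state=0
    ).importances_mean

    ranks = pd.DataFrame(
        {"lr_coef": coef_score, "permutation": perm_score}, index=X.columns
    ).rank(ascending=False)
    # Rank spread: how much the two methods disagree on each feature's position.
    ranks["rank_spread"] = ranks.max(axis=1) - ranks.min(axis=1)
    return ranks


# Synthetic data with redundant (correlated) features to mimic the problem setting.
X_arr, y = make_classification(
    n_samples=2000, n_features=12, n_informative=5, n_redundant=5, random_state=0
)
X = pd.DataFrame(X_arr, columns=[f"f{i}" for i in range(12)])

print("All features:\n", rankings(X, y))
print("\nAfter correlation filtering:\n", rankings(drop_correlated(X), y))
```

Swapping SHAP, LIME, or ALE variance scores in place of the coefficient column would extend the same rank-spread comparison toward the fuller set of methods studied in the paper.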
Related papers
- Evaluating Human Alignment and Model Faithfulness of LLM Rationale [66.75309523854476]
We study how well large language models (LLMs) explain their generations through rationales.
We show that prompting-based methods are less "faithful" than attribution-based explanations.
arXiv Detail & Related papers (2024-06-28T20:06:30Z)
- Rethinking Classifier Re-Training in Long-Tailed Recognition: A Simple Logits Retargeting Approach [102.0769560460338]
We develop a simple logits retargeting approach (LORT) that does not require prior knowledge of the number of samples per class.
Our method achieves state-of-the-art performance on various imbalanced datasets, including CIFAR100-LT, ImageNet-LT, and iNaturalist 2018.
arXiv Detail & Related papers (2024-03-01T03:27:08Z)
- Comparing Explanation Methods for Traditional Machine Learning Models Part 1: An Overview of Current Methods and Quantifying Their Disagreement [0.0]
This study distinguishes explainability from interpretability, local from global explainability, and feature importance from feature relevance.
We demonstrate and visualize different explanation methods, how to interpret them, and provide a complete Python package (scikit-explain) to allow future researchers to explore these products.
arXiv Detail & Related papers (2022-11-16T14:45:16Z)
- Easy to Decide, Hard to Agree: Reducing Disagreements Between Saliency Methods [0.15039745292757667]
We show that saliency methods exhibit weak rank correlations even when applied to the same model instance.
Regularization techniques that increase faithfulness of attention explanations also increase agreement between saliency methods.
arXiv Detail & Related papers (2022-11-15T18:18:34Z)
- The impact of feature importance methods on the interpretation of defect classifiers [13.840006058766766]
We evaluate the agreement between the feature importance ranks associated with the studied classifiers through a case study of 18 software projects and six commonly used classifiers.
The feature importance ranks computed by the studied CA methods exhibit strong agreement, including for the features reported at the top-1 and top-3 ranks of a given dataset.
We demonstrate that removing these feature interactions, even with simple methods like CFS, improves agreement between the feature importance ranks computed by the CA and CS methods.
arXiv Detail & Related papers (2022-02-04T21:00:59Z)
- Search Methods for Sufficient, Socially-Aligned Feature Importance Explanations with In-Distribution Counterfactuals [72.00815192668193]
Feature importance (FI) estimates are a popular form of explanation, and they are commonly created and evaluated by computing the change in model confidence caused by removing certain input features at test time.
We study several under-explored dimensions of FI-based explanations, providing conceptual and empirical improvements for this form of explanation.
arXiv Detail & Related papers (2021-06-01T20:36:48Z)
- Order in the Court: Explainable AI Methods Prone to Disagreement [0.0]
In Natural Language Processing, feature-additive explanation methods quantify the independent contribution of each input token towards a model's decision.
Previous analyses have sought to either invalidate or support the role of attention-based explanations as a faithful and plausible measure of salience.
We show that rank correlation is largely uninformative and does not measure the quality of feature-additive methods.
arXiv Detail & Related papers (2021-05-07T14:27:37Z)
- Fundamental Limits and Tradeoffs in Invariant Representation Learning [99.2368462915979]
Many machine learning applications involve learning representations that achieve two competing goals.
A minimax game-theoretic formulation represents a fundamental tradeoff between accuracy and invariance.
We provide an information-theoretic analysis of this general and important problem under both classification and regression settings.
arXiv Detail & Related papers (2020-12-19T15:24:04Z)
- Towards Unifying Feature Attribution and Counterfactual Explanations: Different Means to the Same End [17.226134854746267]
We present a method to generate feature attribution explanations from a set of counterfactual examples.
We show how counterfactual examples can be used to evaluate the goodness of an attribution-based explanation in terms of its necessity and sufficiency.
arXiv Detail & Related papers (2020-11-10T05:41:43Z)
- Multi-scale Interactive Network for Salient Object Detection [91.43066633305662]
We propose aggregate interaction modules to integrate features from adjacent levels.
To obtain more efficient multi-scale features, the self-interaction modules are embedded in each decoder unit.
Experimental results on five benchmark datasets demonstrate that the proposed method without any post-processing performs favorably against 23 state-of-the-art approaches.
arXiv Detail & Related papers (2020-07-17T15:41:37Z)
- Learning from Aggregate Observations [82.44304647051243]
We study the problem of learning from aggregate observations where supervision signals are given to sets of instances.
We present a general probabilistic framework that accommodates a variety of aggregate observations.
Simple maximum likelihood solutions can be applied to various differentiable models.
arXiv Detail & Related papers (2020-04-14T06:18:50Z)