Partial Order in Chaos: Consensus on Feature Attributions in the
Rashomon Set
- URL: http://arxiv.org/abs/2110.13369v3
- Date: Thu, 28 Dec 2023 21:13:20 GMT
- Title: Partial Order in Chaos: Consensus on Feature Attributions in the
Rashomon Set
- Authors: Gabriel Laberge, Yann Pequignot, Alexandre Mathieu, Foutse Khomh,
Mario Marchand
- Abstract summary: Post-hoc global/local feature attribution methods are progressively being employed to understand machine learning models.
We show that partial orders of local/global feature importance arise from this methodology.
We prove that every relation among features present in these partial orders also holds in the rankings provided by existing approaches.
- Score: 50.67431815647126
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Post-hoc global/local feature attribution methods are progressively being
employed to understand the decisions of complex machine learning models. Yet,
because of the limited amount of available data, one can obtain a diverse set of
models that all achieve good empirical performance but provide very different
explanations for the same prediction, making it hard to derive insight from
them. In this work, instead of aiming at reducing the under-specification of
model explanations, we fully embrace it and extract logical statements about
feature attributions that are consistent across all models with good empirical
performance (i.e., all models in the Rashomon Set). We show that partial orders
of local/global feature importance arise from this methodology, enabling more
nuanced interpretations by allowing pairs of features to be incomparable when
there is no consensus on their relative importance. We prove that every
relation among features present in these partial orders also holds in the
rankings provided by existing approaches. Finally, we present three use cases
employing hypothesis spaces with tractable Rashomon Sets (Additive models,
Kernel Ridge, and Random Forests) and show that partial orders allow one to
extract consistent local and global interpretations of models despite their
under-specification.
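To make the core idea concrete, the following is a minimal sketch of how such a consensus partial order could be assembled; it is an illustration under stated assumptions, not the authors' implementation. The enumeration of candidate models, the loss function, the attribution function, and the epsilon tolerance are all placeholders.

```python
# Illustrative sketch (not the paper's code): build a consensus partial order
# of feature importance across an epsilon-Rashomon set.
import numpy as np

def rashomon_set(models, X, y, loss_fn, epsilon):
    """Keep every candidate model whose empirical loss is within
    epsilon of the best candidate's loss."""
    losses = np.array([loss_fn(m, X, y) for m in models])
    return [m for m, loss in zip(models, losses) if loss <= losses.min() + epsilon]

def consensus_partial_order(good_models, attribution_fn, n_features):
    """Rank feature i above feature j only if *every* model in the
    Rashomon set gives i a strictly larger attribution magnitude than j;
    pairs without unanimous agreement stay incomparable."""
    # One attribution vector per model, shape (n_models, n_features).
    A = np.abs(np.array([attribution_fn(m) for m in good_models]))
    order = set()
    for i in range(n_features):
        for j in range(n_features):
            if i != j and np.all(A[:, i] > A[:, j]):
                order.add((i, j))  # consensus: i is more important than j
    return order  # a strict partial order: irreflexive, antisymmetric, transitive
```

In this sketch, a pair of features that some well-performing models rank one way and others rank the opposite way simply remains incomparable, which is the nuance the abstract refers to; and since a pair enters the order only under unanimous agreement, the ranking produced by any single model in the set extends this partial order.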
Related papers
- When factorization meets argumentation: towards argumentative explanations [0.0]
We propose a novel model that combines factorization-based methods with argumentation frameworks (AFs).
Our framework seamlessly incorporates side information, such as user contexts, leading to more accurate predictions.
arXiv Detail & Related papers (2024-05-13T19:16:28Z)
- Identifiable Latent Neural Causal Models [82.14087963690561]
Causal representation learning seeks to uncover latent, high-level causal representations from low-level observed data.
We determine the types of distribution shifts that do contribute to the identifiability of causal representations.
We translate our findings into a practical algorithm, allowing for the acquisition of reliable latent causal representations.
arXiv Detail & Related papers (2024-03-23T04:13:55Z)
- Consistent Explanations in the Face of Model Indeterminacy via Ensembling [12.661530681518899]
This work addresses the challenge of providing consistent explanations for predictive models in the presence of model indeterminacy.
We introduce ensemble methods to enhance the consistency of the explanations provided in these scenarios.
Our findings highlight the importance of considering model indeterminacy when interpreting explanations.
arXiv Detail & Related papers (2023-06-09T18:45:43Z)
- On the Compositional Generalization Gap of In-Context Learning [73.09193595292233]
We look at the gap between the in-distribution (ID) and out-of-distribution (OOD) performance of such models in semantic parsing tasks with in-context learning.
We evaluate four model families, OPT, BLOOM, CodeGen and Codex on three semantic parsing datasets.
arXiv Detail & Related papers (2022-11-15T19:56:37Z)
- An Additive Instance-Wise Approach to Multi-class Model Interpretation [53.87578024052922]
Interpretable machine learning offers insights into what factors drive a certain prediction of a black-box system.
Existing methods mainly focus on selecting explanatory input features, which follow either locally additive or instance-wise approaches.
This work exploits the strengths of both methods and proposes a global framework for learning local explanations simultaneously for multiple target classes.
arXiv Detail & Related papers (2022-07-07T06:50:27Z)
- Cross-Model Consensus of Explanations and Beyond for Image Classification Models: An Empirical Study [34.672716006357675]
Across the sets of features used by different models, some common features might be shared by the majority of models.
We propose the cross-model consensus of explanations to capture the common features.
We conduct extensive experiments using 80+ models on 5 datasets/tasks.
arXiv Detail & Related papers (2021-09-02T04:50:45Z)
- Information-theoretic Evolution of Model Agnostic Global Explanations [10.921146104622972]
We present a novel model-agnostic approach that derives rules to globally explain the behavior of classification models trained on numerical and/or categorical data.
Our approach has been deployed in a leading digital marketing suite of products.
arXiv Detail & Related papers (2021-05-14T16:52:16Z)
- Characterizing Fairness Over the Set of Good Models Under Selective Labels [69.64662540443162]
We develop a framework for characterizing predictive fairness properties over the set of models that deliver similar overall performance.
We provide tractable algorithms to compute the range of attainable group-level predictive disparities.
We extend our framework to address the empirically relevant challenge of selectively labelled data.
arXiv Detail & Related papers (2021-01-02T02:11:37Z)
- Unsupervised Learning of Global Factors in Deep Generative Models [6.362733059568703]
We present a novel deep generative model based on non-i.i.d. variational autoencoders.
We show that the model performs domain alignment to find correlations and interpolate between different databases.
We also study the ability of the global space to discriminate between groups of observations with non-trivial underlying structures.
arXiv Detail & Related papers (2020-12-15T11:55:31Z)
- Deducing neighborhoods of classes from a fitted model [68.8204255655161]
This article presents a new kind of interpretable machine learning method.
It helps to understand how a classification model partitions the feature space into predicted classes, using quantile shifts.
In essence, real data points (or specific points of interest) are taken, and the change in the prediction after slightly raising or lowering specific features is observed (a brief illustrative sketch follows below).
arXiv Detail & Related papers (2020-09-11T16:35:53Z)
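As an illustration of the quantile-shift probing described in the last entry, here is a minimal hypothetical sketch; the scikit-learn-style `predict` interface and the 5%-quantile step size are assumptions for the example, not details taken from that paper.

```python
# Illustrative sketch of quantile-shift probing: nudge one feature of a real
# data point up or down by a small quantile step and observe whether the
# predicted class changes.
import numpy as np

def quantile_shift_probe(model, X, x, feature, step=0.05):
    """Shift `feature` of point `x` by +/- `step` quantiles of its empirical
    distribution in X and return the resulting predictions."""
    values = np.sort(X[:, feature])
    # Empirical quantile of the point's current value for this feature.
    q = np.searchsorted(values, x[feature]) / len(values)
    probes = {}
    for delta in (-step, +step):
        x_shifted = x.copy()
        x_shifted[feature] = np.quantile(values, np.clip(q + delta, 0.0, 1.0))
        probes[delta] = model.predict(x_shifted.reshape(1, -1))[0]
    return probes  # e.g. {-0.05: 'class_a', +0.05: 'class_b'}
```

If the predicted class flips under such a small shift, the point lies near a decision boundary with respect to that feature; if it is stable in both directions, that feature has little local influence on the assigned class.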