Understanding Disparities in Post Hoc Machine Learning Explanation
- URL: http://arxiv.org/abs/2401.14539v1
- Date: Thu, 25 Jan 2024 22:09:28 GMT
- Title: Understanding Disparities in Post Hoc Machine Learning Explanation
- Authors: Vishwali Mhasawade, Salman Rahman, Zoe Haskell-Craig, Rumi Chunara
- Abstract summary: Previous work has highlighted that existing post-hoc explanation methods exhibit disparities in explanation fidelity (across 'race' and 'gender' as sensitive attributes).
We specifically assess challenges to explanation disparities that originate from properties of the data.
Results indicate that disparities in model explanations can also depend on data and model properties.
- Score: 2.965442487094603
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Previous work has highlighted that existing post-hoc explanation methods
exhibit disparities in explanation fidelity (across 'race' and 'gender' as
sensitive attributes), and while a large body of work focuses on mitigating
these issues at the explanation metric level, the role of the data generating
process and black box model in relation to explanation disparities remains
largely unexplored. Accordingly, through both simulations as well as
experiments on a real-world dataset, we specifically assess challenges to
explanation disparities that originate from properties of the data: limited
sample size, covariate shift, concept shift, omitted variable bias, and
challenges based on model properties: inclusion of the sensitive attribute and
appropriate functional form. Through controlled simulation analyses, our study
demonstrates that increased covariate shift, concept shift, and omission of
covariates increase explanation disparities, with the effect more pronounced
for neural network models that are better able to capture the underlying
functional form in comparison to linear models. We also observe consistent
findings regarding the effect of concept shift and omitted variable bias on
explanation disparities in the Adult income dataset. Overall, results indicate
that disparities in model explanations can also depend on data and model
properties. Based on this systematic investigation, we provide recommendations
for the design of explanation methods that mitigate undesirable disparities.
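The abstract's central quantity, a gap in explanation fidelity between groups, can be illustrated with a small simulation. The sketch below is not the authors' method or code: it assumes a hypothetical black box whose decision rule differs for a minority group (a stand-in for concept shift), fits one global linear surrogate that omits the sensitive attribute, and measures fidelity as the rate at which the surrogate agrees with the black box within each group. The names `black_box` and `fidelity_gap`, the group sizes, and the shift values are all illustrative assumptions.

```python
import numpy as np


def black_box(x, group, concept_shift):
    """Hypothetical black-box classifier whose rule depends on the group.

    For group 1 the weight on the second feature grows with concept_shift,
    so the mapping from features to prediction differs between groups.
    """
    weight_x1 = 1.0 + concept_shift * group
    return (x[:, 0] + weight_x1 * x[:, 1] > 0).astype(float)


def fidelity_gap(concept_shift, n_major=2000, n_minor=200, seed=0):
    """Return per-group surrogate fidelity and the absolute gap between groups."""
    rng = np.random.default_rng(seed)
    group = np.concatenate(
        [np.zeros(n_major, dtype=int), np.ones(n_minor, dtype=int)]
    )
    x = rng.normal(size=(n_major + n_minor, 2))
    y_bb = black_box(x, group, concept_shift)

    # Global linear surrogate fitted by least squares on the features only
    # (the sensitive attribute is deliberately left out of the surrogate).
    design = np.column_stack([np.ones(len(x)), x])
    coef, *_ = np.linalg.lstsq(design, y_bb, rcond=None)
    y_sur = (design @ coef > 0.5).astype(float)

    # Fidelity: fraction of points where surrogate and black box agree.
    agree = y_sur == y_bb
    fidelity = {g: float(agree[group == g].mean()) for g in (0, 1)}
    return fidelity, abs(fidelity[0] - fidelity[1])


if __name__ == "__main__":
    for shift in (0.0, 3.0):
        fidelity, gap = fidelity_gap(concept_shift=shift)
        print(f"concept_shift={shift}: fidelity={fidelity}, gap={gap:.3f}")
```

Because the surrogate is fitted mostly to the majority group, its agreement with the black box should drop for the minority group once the groups' decision rules diverge, which is the kind of disparity the abstract describes. A local explainer and the fidelity metrics used in the paper would behave differently in detail.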
Related papers
- Explanatory Model Monitoring to Understand the Effects of Feature Shifts on Performance [61.06245197347139]
We propose a novel approach to explain the behavior of a black-box model under feature shifts.
We refer to our method that combines concepts from Optimal Transport and Shapley Values as Explanatory Performance Estimation.
arXiv Detail & Related papers (2024-08-24T18:28:19Z)
- A Critical Assessment of Interpretable and Explainable Machine Learning for Intrusion Detection [0.0]
We study the use of overly complex and opaque ML models, unaccounted data imbalances and correlated features, inconsistent influential features across different explanation methods, and the implausible utility of explanations.
Specifically, we advise avoiding complex opaque models such as Deep Neural Networks and instead using interpretable ML models such as Decision Trees.
We find that feature-based model explanations are most often inconsistent across different settings.
arXiv Detail & Related papers (2024-07-04T15:35:42Z)
- Toward Understanding the Disagreement Problem in Neural Network Feature Attribution [0.8057006406834466]
Neural networks have demonstrated their remarkable ability to discern intricate patterns and relationships from raw data.
Understanding the inner workings of these black box models remains challenging, yet crucial for high-stakes decisions.
Our work addresses this confusion by investigating the explanations' fundamental and distributional behavior.
arXiv Detail & Related papers (2024-04-17T12:45:59Z)
- CNN-based explanation ensembling for dataset, representation and explanations evaluation [1.1060425537315088]
We explore the potential of ensembling explanations generated by deep classification models using a convolutional model.
Through experimentation and analysis, we aim to investigate the implications of combining explanations to uncover more coherent and reliable patterns of the model's behavior.
arXiv Detail & Related papers (2024-04-16T08:39:29Z)
- Identifiable Latent Neural Causal Models [82.14087963690561]
Causal representation learning seeks to uncover latent, high-level causal representations from low-level observed data.
We determine the types of distribution shifts that do contribute to the identifiability of causal representations.
We translate our findings into a practical algorithm, allowing for the acquisition of reliable latent causal representations.
arXiv Detail & Related papers (2024-03-23T04:13:55Z)
- From Identifiable Causal Representations to Controllable Counterfactual Generation: A Survey on Causal Generative Modeling [17.074858228123706]
We focus on fundamental theory, methodology, drawbacks, datasets, and metrics.
We cover applications of causal generative models in fairness, privacy, out-of-distribution generalization, precision medicine, and biological sciences.
arXiv Detail & Related papers (2023-10-17T05:45:32Z)
- Are Data-driven Explanations Robust against Out-of-distribution Data? [18.760475318852375]
We propose an end-to-end model-agnostic learning framework, Distributionally Robust Explanations (DRE).
The key idea is to fully utilize the inter-distribution information to provide supervisory signals for the learning of explanations without human annotation.
Our results demonstrate that the proposed method significantly improves the model's performance in terms of explanation and prediction robustness against distributional shifts.
arXiv Detail & Related papers (2023-03-29T02:02:08Z)
- Explainability in Process Outcome Prediction: Guidelines to Obtain Interpretable and Faithful Models [77.34726150561087]
We define explainability through the interpretability of the explanations and the faithfulness of the explainability model in the field of process outcome prediction.
This paper contributes a set of guidelines named X-MOP which allows selecting the appropriate model based on the event log specifications.
arXiv Detail & Related papers (2022-03-30T05:59:50Z)
- Beyond Trivial Counterfactual Explanations with Diverse Valuable Explanations [64.85696493596821]
In computer vision applications, generative counterfactual methods indicate how to perturb a model's input to change its prediction.
We propose a counterfactual method that learns a perturbation in a disentangled latent space that is constrained using a diversity-enforcing loss.
Our model improves the success rate of producing high-quality valuable explanations when compared to previous state-of-the-art methods.
arXiv Detail & Related papers (2021-03-18T12:57:34Z)
- Explainers in the Wild: Making Surrogate Explainers Robust to Distortions through Perception [77.34726150561087]
We propose a methodology to evaluate the effect of distortions in explanations by embedding perceptual distances.
We generate explanations for images in the Imagenet-C dataset and demonstrate how using perceptual distances in the surrogate explainer creates more coherent explanations for the distorted and reference images.
arXiv Detail & Related papers (2021-02-22T12:38:53Z)
- Generative Counterfactuals for Neural Networks via Attribute-Informed Perturbation [51.29486247405601]
We design a framework to generate counterfactuals for raw data instances with the proposed Attribute-Informed Perturbation (AIP).
By utilizing generative models conditioned with different attributes, counterfactuals with desired labels can be obtained effectively and efficiently.
Experimental results on real-world texts and images demonstrate the effectiveness, sample quality as well as efficiency of our designed framework.
arXiv Detail & Related papers (2021-01-18T08:37:13Z)