Causal Interpretability for Machine Learning -- Problems, Methods and
Evaluation
- URL: http://arxiv.org/abs/2003.03934v3
- Date: Thu, 19 Mar 2020 20:23:46 GMT
- Title: Causal Interpretability for Machine Learning -- Problems, Methods and
Evaluation
- Authors: Raha Moraffah, Mansooreh Karami, Ruocheng Guo, Adrienne Raglin, Huan
Liu
- Abstract summary: Most machine learning models are black boxes, and it is unclear how they arrive at their decisions.
To provide insights into the decision-making processes of these models, a variety of traditional interpretable models have been proposed.
In this work, models that aim to answer causal questions are referred to as causal interpretable models.
- Score: 30.664546254344593
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Machine learning models have had discernible achievements in a myriad of
applications. However, most of these models are black boxes, and it is unclear
how they arrive at their decisions. This makes the models unreliable and
untrustworthy. To provide insights into the decision-making processes of these
models, a variety of traditional interpretable models have been proposed.
Moreover, to generate more human-friendly explanations, recent work on
interpretability tries to answer questions related to causality, such as "Why
does this model make such decisions?" or "Was it a specific feature that
caused the decision made by the model?". In this work, models that aim to
answer causal questions are referred to as causal interpretable models. The
existing surveys have covered concepts and methodologies of traditional
interpretability. In this work, we present a comprehensive survey of causal
interpretable models in terms of the problems they address and the methods they employ. In addition,
this survey provides in-depth insights into the existing evaluation metrics for
measuring interpretability, which can help practitioners understand for what
scenarios each evaluation metric is suitable.
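As a concrete illustration of the causal question quoted above ("Was it a specific feature that caused the decision made by the model?"), the following minimal sketch intervenes on one feature of a toy classifier's input at a time and checks whether the prediction changes. The data, model, and feature semantics are hypothetical placeholders, not examples from the survey.

```python
# Minimal illustrative sketch (not from the survey): probe whether a single
# feature "caused" a toy model's decision by intervening on that feature alone.
# The data, model, and feature semantics below are hypothetical stand-ins.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))            # toy data with 4 features
y = (X[:, 2] > 0).astype(int)            # label driven mainly by feature 2
model = LogisticRegression().fit(X, y)

x = X[0].copy()
original = model.predict(x.reshape(1, -1))[0]

# Intervene on each feature in turn (set it to its population mean) and
# check whether the predicted class flips.
for j in range(X.shape[1]):
    x_cf = x.copy()
    x_cf[j] = X[:, j].mean()             # do(feature_j := mean)-style intervention
    flipped = model.predict(x_cf.reshape(1, -1))[0] != original
    print(f"feature {j}: prediction {'changes' if flipped else 'unchanged'}")
```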
Related papers
- Evaluating the Utility of Model Explanations for Model Development [54.23538543168767]
We evaluate whether explanations can improve human decision-making in practical scenarios of machine learning model development.
To our surprise, we did not find evidence of significant improvement on the tasks when users were provided with any of the saliency maps.
These findings suggest caution regarding the usefulness of saliency-based explanations and their potential for misunderstanding (a minimal gradient-saliency sketch appears after this list).
arXiv Detail & Related papers (2023-12-10T23:13:23Z)
- Towards Interpretable Deep Reinforcement Learning Models via Inverse Reinforcement Learning [27.841725567976315]
We propose a novel framework utilizing Adversarial Inverse Reinforcement Learning.
This framework provides global explanations for decisions made by a Reinforcement Learning model.
By summarizing the model's decision-making process, we capture the intuitive tendencies that the model follows.
arXiv Detail & Related papers (2022-03-30T17:01:59Z)
- Beyond Explaining: Opportunities and Challenges of XAI-Based Model Improvement [75.00655434905417]
Explainable Artificial Intelligence (XAI) is an emerging research field bringing transparency to highly complex machine learning (ML) models.
This paper offers a comprehensive overview of techniques that apply XAI practically to improve various properties of ML models.
We show empirically, through experiments in toy and realistic settings, how explanations can help improve properties such as a model's generalization ability or reasoning.
arXiv Detail & Related papers (2022-03-15T15:44:28Z)
- When and How to Fool Explainable Models (and Humans) with Adversarial Examples [1.439518478021091]
We explore the possibilities and limits of adversarial attacks for explainable machine learning models.
First, we extend the notion of adversarial examples to fit in explainable machine learning scenarios.
Next, we propose a comprehensive framework to study whether adversarial examples can be generated for explainable models.
arXiv Detail & Related papers (2021-07-05T11:20:55Z)
- Individual Explanations in Machine Learning Models: A Case Study on Poverty Estimation [63.18666008322476]
Machine learning methods are being increasingly applied in sensitive societal contexts.
The present case study has two main objectives. First, to expose these challenges and how they affect the use of relevant and novel explanation methods.
Second, to present a set of strategies that mitigate such challenges when implementing explanation methods in a relevant application domain.
arXiv Detail & Related papers (2021-04-09T01:54:58Z)
- Individual Explanations in Machine Learning Models: A Survey for Practitioners [69.02688684221265]
The use of sophisticated statistical models that influence decisions in domains of high societal relevance is on the rise.
Many governments, institutions, and companies are reluctant to adopt them, as their output is often difficult to explain in human-interpretable ways.
Recently, the academic literature has proposed a substantial number of methods for providing interpretable explanations of machine learning models.
arXiv Detail & Related papers (2021-04-09T01:46:34Z)
- Beyond Trivial Counterfactual Explanations with Diverse Valuable Explanations [64.85696493596821]
In computer vision applications, generative counterfactual methods indicate how to perturb a model's input to change its prediction.
We propose a counterfactual method that learns a perturbation in a disentangled latent space, constrained by a diversity-enforcing loss; a minimal sketch of this latent-perturbation idea appears after this list.
Our model improves the success rate of producing high-quality, valuable explanations compared to previous state-of-the-art methods.
arXiv Detail & Related papers (2021-03-18T12:57:34Z)
- Plausible Counterfactuals: Auditing Deep Learning Classifiers with Realistic Adversarial Examples [84.8370546614042]
The black-box nature of Deep Learning models has posed unanswered questions about what they learn from data.
A Generative Adversarial Network (GAN) and multi-objective optimization are used to furnish a plausible attack on the audited model.
Its utility is showcased within a human face classification task, unveiling the enormous potential of the proposed framework.
arXiv Detail & Related papers (2020-03-25T11:08:56Z)
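The saliency maps evaluated in "Evaluating the Utility of Model Explanations for Model Development" above are gradient-based attributions. Below is a minimal, generic sketch of a vanilla gradient saliency map; the untrained model and random input are placeholders, and this is not code from that paper.

```python
# Generic sketch of a vanilla gradient saliency map (illustration only; the
# untrained model and random input are placeholders, not the cited paper's setup).
import torch
import torch.nn as nn

model = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 10))
model.eval()

x = torch.rand(1, 1, 28, 28, requires_grad=True)    # hypothetical input "image"
logits = model(x)
pred_class = logits.argmax(dim=1).item()

# Gradient of the predicted class score with respect to the input pixels.
logits[0, pred_class].backward()
saliency = x.grad.abs().squeeze()                    # one importance value per pixel
print(saliency.shape)                                # torch.Size([28, 28])
```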
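"Beyond Trivial Counterfactual Explanations with Diverse Valuable Explanations" above learns perturbations in a disentangled latent space under a diversity-enforcing loss. The sketch below only captures that general idea under simplifying assumptions: several latent perturbations are optimized to flip an untrained toy classifier's prediction, with a pairwise term keeping them distinct. The encoder, decoder, classifier, and loss weights are hypothetical, not the authors' method.

```python
# Generic sketch of counterfactual search via latent perturbations with a
# diversity-enforcing term (illustration only; encoder, decoder, classifier,
# and loss weights are untrained/hypothetical placeholders).
import torch
import torch.nn as nn

input_dim, latent_dim, num_classes, K = 20, 8, 2, 3
encoder = nn.Linear(input_dim, latent_dim)
decoder = nn.Linear(latent_dim, input_dim)
classifier = nn.Linear(input_dim, num_classes)

x = torch.rand(1, input_dim)                       # hypothetical input
target = torch.tensor([1])                         # class the counterfactuals should obtain
z = encoder(x).detach()                            # latent code of the original input

# K candidate perturbations, initialized with small noise so pairwise
# distances (and their gradients) are well defined from the start.
deltas = (0.01 * torch.randn(K, latent_dim)).requires_grad_()
opt = torch.optim.Adam([deltas], lr=0.05)
ce = nn.CrossEntropyLoss()

for _ in range(200):
    opt.zero_grad()
    x_cf = decoder(z + deltas)                     # K decoded counterfactual candidates
    logits = classifier(x_cf)
    flip_loss = ce(logits, target.expand(K))       # push candidates toward the target class
    prox_loss = deltas.norm(dim=1).mean()          # stay close to the original latent code
    div_loss = -torch.pdist(deltas).mean()         # reward mutually distinct perturbations
    (flip_loss + 0.1 * prox_loss + 0.1 * div_loss).backward()
    opt.step()

print(classifier(decoder(z + deltas)).argmax(dim=1))   # classes of the K candidates
```

In a real pipeline, the encoder/decoder would be a pretrained (disentangled) generative model and the proximity/diversity weights would be tuned; the 0.1 coefficients here are arbitrary.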