Adequate and fair explanations
- URL: http://arxiv.org/abs/2001.07578v2
- Date: Sat, 21 Aug 2021 08:55:22 GMT
- Title: Adequate and fair explanations
- Authors: Nicholas Asher, Soumya Paul, Chris Russell
- Abstract summary: We focus upon the second school of exact explanations with a rigorous logical foundation.
With counterfactual explanations, many of the assumptions needed to provide a complete explanation are left implicit.
We explore how to move from local partial explanations to what we call complete local explanations and then to global ones.
- Score: 12.33259114006129
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Explaining sophisticated machine-learning-based systems is an important issue
at the foundations of AI. Recent efforts have shown various methods for
providing explanations. These approaches can be broadly divided into two
schools: those that provide a local and human-interpretable approximation of a
machine learning algorithm, and logical approaches that exactly characterise
one aspect of the decision. In this paper we focus upon the second school of
exact explanations with a rigorous logical foundation. There is an
epistemological problem with these exact methods. While they can furnish
complete explanations, such explanations may be too complex for humans to
understand or even to write down in human readable form. Interpretability
requires epistemically accessible explanations, explanations humans can grasp.
Yet what counts as a sufficiently complete, epistemically accessible explanation still
needs clarification. We do this here in terms of counterfactuals, following
[Wachter et al., 2017]. With counterfactual explanations, many of the
assumptions needed to provide a complete explanation are left implicit. In doing
so, counterfactual explanations exploit the properties of a particular data
point or sample, and as such are also local as well as partial explanations. We
explore how to move from local partial explanations to what we call complete
local explanations and then to global ones. But to preserve accessibility we
argue for the need for partiality. This partiality makes it possible to hide
explicit biases present in the algorithm that may be injurious or unfair. We
investigate how easy it is to uncover these biases in providing complete and
fair explanations by exploiting the structure of the set of counterfactuals
providing a complete local explanation.
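The abstract anchors its notion of explanation in the counterfactual formulation of [Wachter et al., 2017]: find a nearby point whose prediction differs from the original one, by minimising a prediction loss plus a distance to the original input. As a rough illustration of that starting point only, not the paper's own construction, the sketch below runs such a search against a toy differentiable classifier; the logistic model, the feature values, and the fixed trade-off weight lam are illustrative assumptions.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def predict(w, b, x):
    """Positive-class probability under a toy logistic-regression model."""
    return sigmoid(np.dot(w, x) + b)

def counterfactual(w, b, x0, target=0.75, mad=None, lam=10.0, lr=0.01, steps=5000):
    """Gradient search for x' minimising lam*(f(x') - target)^2 + MAD-weighted L1 distance to x0."""
    mad = np.ones_like(x0) if mad is None else mad
    x = x0.astype(float)
    for _ in range(steps):
        p = predict(w, b, x)
        # Gradient of the squared prediction-loss term lam * (p - target)^2.
        grad_pred = 2.0 * lam * (p - target) * p * (1.0 - p) * w
        # Subgradient of the MAD-weighted L1 distance to the original point.
        grad_dist = np.sign(x - x0) / mad
        x = x - lr * (grad_pred + grad_dist)
    return x

# Toy usage: a rejected applicant described by two illustrative features.
w, b = np.array([1.5, -2.0]), 0.1
x0 = np.array([0.2, 0.8])
x_cf = counterfactual(w, b, x0)
print("original prediction:   ", predict(w, b, x0))    # well below 0.5
print("counterfactual point:  ", x_cf)
print("counterfactual outcome:", predict(w, b, x_cf))  # pushed across the decision boundary
```

Wachter et al. propose increasing the trade-off weight until the prediction constraint is met and weighting the L1 distance by each feature's median absolute deviation; a fixed weight and unit MADs are used here only to keep the sketch short. The set of such counterfactuals around a point is, loosely, the raw material from which the abstract's "complete local explanation" is assembled.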
Related papers
- Disagreement amongst counterfactual explanations: How transparency can be deceptive [0.0]
Counterfactual explanations are increasingly used as an Explainable Artificial Intelligence technique.
Different algorithms do not produce identical explanations for the same instance.
Ethical issues arise when malicious agents use this diversity to fairwash an unfair machine learning model.
arXiv Detail & Related papers (2023-04-25T09:15:37Z)
- NELLIE: A Neuro-Symbolic Inference Engine for Grounded, Compositional, and Explainable Reasoning [59.16962123636579]
This paper proposes a new take on Prolog-based inference engines.
We replace handcrafted rules with a combination of neural language modeling, guided generation, and semiparametric dense retrieval.
Our implementation, NELLIE, is the first system to demonstrate fully interpretable, end-to-end grounded QA.
arXiv Detail & Related papers (2022-09-16T00:54:44Z)
- Eliminating The Impossible, Whatever Remains Must Be True [46.39428193548396]
We show how one can apply background knowledge to give more succinct "why" formal explanations.
We also show how to use existing rule induction techniques to efficiently extract background information from a dataset.
arXiv Detail & Related papers (2022-06-20T03:18:14Z)
- Explanatory Paradigms in Neural Networks [18.32369721322249]
We present a leap-forward expansion to the study of explainability in neural networks by considering explanations as answers to reasoning-based questions.
The answers to these questions are observed correlations, observed counterfactuals, and observed contrastive explanations.
The term observed refers to the specific case of post-hoc explainability, when an explanatory technique explains the decision $P$ after a trained neural network has made the decision $P$.
arXiv Detail & Related papers (2022-02-24T00:22:11Z)
- Human Interpretation of Saliency-based Explanation Over Text [65.29015910991261]
We study saliency-based explanations over textual data.
We find that people often mis-interpret the explanations.
We propose a method to adjust saliencies based on model estimates of over- and under-perception.
arXiv Detail & Related papers (2022-01-27T15:20:32Z)
- Counterfactual Instances Explain Little [7.655239948659383]
It is important to be able to explain the decisions of machine learning systems.
An increasingly popular approach has been to seek to provide counterfactual instance explanations.
This paper will argue that a satisfactory explanation must consist of both counterfactual instances and a causal equation.
arXiv Detail & Related papers (2021-09-20T19:40:25Z)
- Semantics and explanation: why counterfactual explanations produce adversarial examples in deep neural networks [15.102346715690759]
Recent papers in explainable AI have made a compelling case for counterfactual modes of explanation.
While counterfactual explanations appear to be extremely effective in some instances, they are formally equivalent to adversarial examples.
This presents an apparent paradox for explainability researchers: if these two procedures are formally equivalent, what accounts for the explanatory divide apparent between counterfactual explanations and adversarial examples?
We resolve this paradox by placing emphasis back on the semantics of counterfactual expressions.
arXiv Detail & Related papers (2020-12-18T07:04:04Z)
- Towards Interpretable Natural Language Understanding with Explanations as Latent Variables [146.83882632854485]
We develop a framework for interpretable natural language understanding that requires only a small set of human-annotated explanations for training.
Our framework treats natural language explanations as latent variables that model the underlying reasoning process of a neural model.
arXiv Detail & Related papers (2020-10-24T02:05:56Z)
- The Struggles of Feature-Based Explanations: Shapley Values vs. Minimal Sufficient Subsets [61.66584140190247]
We show that feature-based explanations pose problems even for explaining trivial models.
We show that two popular classes of explainers, Shapley explainers and minimal sufficient subsets explainers, target fundamentally different types of ground-truth explanations (see the illustrative sketch after this list).
arXiv Detail & Related papers (2020-09-23T09:45:23Z)
- SCOUT: Self-aware Discriminant Counterfactual Explanations [78.79534272979305]
The problem of counterfactual visual explanations is considered.
A new family of discriminant explanations is introduced.
The resulting counterfactual explanations are optimization free and thus much faster than previous methods.
arXiv Detail & Related papers (2020-04-16T17:05:49Z)
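The contrast drawn in "The Struggles of Feature-Based Explanations" above can be made concrete with a toy Boolean model; this is a hedged illustration under simple assumptions, not taken from that paper's experiments. For f(x1, x2) = x1 OR x2 at the instance (1, 1) with baseline (0, 0), exact Shapley values split the credit evenly between the two features, while either feature on its own already forms a minimal sufficient subset that fixes the prediction.

```python
from itertools import chain, combinations, product
from math import factorial

def f(x1, x2):
    """Toy model: logical OR of two binary features."""
    return int(x1 or x2)

features = (0, 1)
instance = (1, 1)   # the point being explained; f(*instance) == 1
baseline = (0, 0)   # reference values for "absent" features

def value(S):
    """Model output with features in S set to the instance's values, the rest to the baseline's."""
    x = [instance[i] if i in S else baseline[i] for i in features]
    return f(*x)

def powerset(items):
    items = list(items)
    return chain.from_iterable(combinations(items, r) for r in range(len(items) + 1))

# Exact Shapley values by enumerating coalitions (feasible here: only two features).
n = len(features)
shapley = {}
for i in features:
    phi = 0.0
    for S in powerset(j for j in features if j != i):
        weight = factorial(len(S)) * factorial(n - len(S) - 1) / factorial(n)
        phi += weight * (value(set(S) | {i}) - value(set(S)))
    shapley[i] = phi

def sufficient(S):
    """S is sufficient if fixing its features to the instance's values forces f to 1."""
    free = [j for j in features if j not in S]
    for vals in product((0, 1), repeat=len(free)):
        x = list(instance)
        for j, v in zip(free, vals):
            x[j] = v
        if f(*x) != 1:
            return False
    return True

minimal_sufficient = [set(S) for S in powerset(features)
                      if sufficient(set(S))
                      and not any(sufficient(set(T)) for T in powerset(S) if set(T) < set(S))]

print("Shapley values:", shapley)                        # {0: 0.5, 1: 0.5}: credit is shared
print("minimal sufficient subsets:", minimal_sufficient)  # [{0}, {1}]: either feature alone suffices
```

The two explainers thus point at different ground truths for the same decision: Shapley values report shared marginal contributions, while a minimal sufficient subset reports which features alone lock in the outcome.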