Explainability Is in the Mind of the Beholder: Establishing the
Foundations of Explainable Artificial Intelligence
- URL: http://arxiv.org/abs/2112.14466v1
- Date: Wed, 29 Dec 2021 09:21:33 GMT
- Title: Explainability Is in the Mind of the Beholder: Establishing the
Foundations of Explainable Artificial Intelligence
- Authors: Kacper Sokol and Peter Flach
- Abstract summary: We define explainability as (logical) reasoning applied to transparent insights (into black boxes) interpreted under certain background knowledge.
We revisit the trade-off between transparency and predictive power and its implications for ante-hoc and post-hoc explainers.
We discuss components of the machine learning workflow that may be in need of interpretability, building on a range of ideas from human-centred explainability.
- Score: 11.472707084860875
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Explainable artificial intelligence and interpretable machine learning are
research fields growing in importance. Yet, the underlying concepts remain
somewhat elusive and lack generally agreed definitions. While recent
inspiration from social sciences has refocused the work on needs and
expectations of human recipients, the field still misses a concrete
conceptualisation. We take steps towards addressing this challenge by reviewing
the philosophical and social foundations of human explainability, which we then
translate into the technological realm. In particular, we scrutinise the notion
of algorithmic black boxes and the spectrum of understanding determined by
explanatory processes and explainees' background knowledge. This approach
allows us to define explainability as (logical) reasoning applied to
transparent insights (into black boxes) interpreted under certain background
knowledge - a process that engenders understanding in explainees. We then
employ this conceptualisation to revisit the much disputed trade-off between
transparency and predictive power and its implications for ante-hoc and
post-hoc explainers as well as fairness and accountability engendered by
explainability. We furthermore discuss components of the machine learning
workflow that may be in need of interpretability, building on a range of ideas
from human-centred explainability, with a focus on explainees, contrastive
statements and explanatory processes. Our discussion reconciles and complements
current research to help better navigate open questions - rather than
attempting to address any individual issue - thus laying a solid foundation for
a grounded discussion and future progress of explainable artificial
intelligence and interpretable machine learning. We conclude with a summary of
our findings, revisiting the human-centred explanatory process needed to
achieve the desired level of algorithmic transparency.
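To make the ante-hoc versus post-hoc distinction discussed in the abstract concrete, the following minimal sketch (our own illustration, not code from the paper) contrasts an inherently transparent model with a post-hoc surrogate explanation of an opaque one. It assumes scikit-learn is available and uses a random forest as a stand-in black box.
```python
# A minimal sketch (not from the paper) contrasting ante-hoc and post-hoc
# explainability, assuming scikit-learn is installed.
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.tree import DecisionTreeClassifier, export_text

X, y = load_breast_cancer(return_X_y=True)

# Ante-hoc: a shallow decision tree is transparent by construction,
# possibly at some cost in predictive power.
ante_hoc = DecisionTreeClassifier(max_depth=3).fit(X, y)

# Post-hoc: an opaque model (here a random forest stands in for the
# "black box") is approximated by a surrogate tree trained on its outputs.
black_box = RandomForestClassifier(n_estimators=200).fit(X, y)
surrogate = DecisionTreeClassifier(max_depth=3).fit(X, black_box.predict(X))

# The surrogate's rules are the transparent insight into the black box;
# interpreting them still requires the explainee's background knowledge.
print(export_text(surrogate, max_depth=3))
print("fidelity to black box:", surrogate.score(X, black_box.predict(X)))
```
The surrogate's fidelity score quantifies how faithfully the transparent insight reflects the black box; how much understanding it engenders still depends on the explainee's background knowledge, as the paper argues.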
Related papers
- Explainers' Mental Representations of Explainees' Needs in Everyday Explanations [0.0]
In explanations, explainers have mental representations of explainees' developing knowledge and shifting interests regarding the explanandum.
XAI should be able to react to explainees' needs in a similar manner.
This study investigated explainers' mental representations in everyday explanations of technological artifacts.
arXiv Detail & Related papers (2024-11-13T10:53:07Z)
- Forms of Understanding of XAI-Explanations [2.887772793510463]
This article aims to present a model of forms of understanding in the context of Explainable Artificial Intelligence (XAI).
Two types of understanding are considered as possible outcomes of explanations, namely enabledness and comprehension.
Special challenges of understanding in XAI are discussed.
arXiv Detail & Related papers (2023-11-15T08:06:51Z)
- Interpreting Pretrained Language Models via Concept Bottlenecks [55.47515772358389]
Pretrained language models (PLMs) have made significant strides in various natural language processing tasks.
The lack of interpretability due to their "black-box" nature poses challenges for responsible implementation.
We propose a novel approach to interpreting PLMs by employing high-level, meaningful concepts that are easily understandable for humans.
arXiv Detail & Related papers (2023-11-08T20:41:18Z)
- Interpretability is in the Mind of the Beholder: A Causal Framework for Human-interpretable Representation Learning [22.201878275784246]
Focus in Explainable AI is shifting from explanations defined in terms of low-level elements, such as input features, to explanations encoded in terms of interpretable concepts learned from data.
How to reliably acquire such concepts is, however, still fundamentally unclear.
We propose a mathematical framework for acquiring interpretable representations suitable for both post-hoc explainers and concept-based neural networks.
arXiv Detail & Related papers (2023-09-14T14:26:20Z)
- Mind the Gap! Bridging Explainable Artificial Intelligence and Human Understanding with Luhmann's Functional Theory of Communication [5.742215677251865]
We apply social systems theory to highlight challenges in explainable artificial intelligence.
We aim to reinvigorate the technical research in the direction of interactive and iterative explainers.
arXiv Detail & Related papers (2023-02-07T13:31:02Z)
- Sensible AI: Re-imagining Interpretability and Explainability using Sensemaking Theory [14.35488479818285]
We propose an alternate framework for interpretability grounded in Weick's sensemaking theory.
We use an application of sensemaking in organizations as a template for discussing design guidelines for Sensible AI.
arXiv Detail & Related papers (2022-05-10T17:20:44Z)
- Human Interpretation of Saliency-based Explanation Over Text [65.29015910991261]
We study saliency-based explanations over textual data; a minimal token-saliency sketch appears after this list.
We find that people often mis-interpret the explanations.
We propose a method to adjust saliencies based on model estimates of over- and under-perception.
arXiv Detail & Related papers (2022-01-27T15:20:32Z)
- Interpretable Deep Learning: Interpretations, Interpretability, Trustworthiness, and Beyond [49.93153180169685]
We introduce and clarify two basic concepts, interpretations and interpretability, which people often confuse.
We elaborate on the design of several recent interpretation algorithms from different perspectives by proposing a new taxonomy.
We summarize the existing work in evaluating models' interpretability using "trustworthy" interpretation algorithms.
arXiv Detail & Related papers (2021-03-19T08:40:30Z)
- Explainability in Deep Reinforcement Learning [68.8204255655161]
We review recent works aiming to attain Explainable Reinforcement Learning (XRL).
In critical situations where it is essential to justify and explain the agent's behaviour, better explainability and interpretability of RL models could help gain scientific insight into the inner workings of what is still considered a black box.
arXiv Detail & Related papers (2020-08-15T10:11:42Z)
- Neuro-symbolic Architectures for Context Understanding [59.899606495602406]
We propose the use of hybrid AI methodology as a framework for combining the strengths of data-driven and knowledge-driven approaches.
Specifically, we inherit the concept of neuro-symbolism as a way of using knowledge-bases to guide the learning progress of deep neural networks.
arXiv Detail & Related papers (2020-03-09T15:04:07Z)
- A general framework for scientifically inspired explanations in AI [76.48625630211943]
We instantiate the concept of the structure of scientific explanations as the theoretical underpinning for a general framework in which explanations for AI systems can be implemented.
This framework aims to provide the tools to build a "mental-model" of any AI system so that the interaction with the user can provide information on demand and be closer to the nature of human-made explanations.
arXiv Detail & Related papers (2020-03-02T10:32:21Z)
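As a hedged illustration of the saliency-based explanations over text summarised above (Human Interpretation of Saliency-based Explanation Over Text), the sketch below computes occlusion-style token saliencies with a toy scikit-learn classifier. It is our own example, not that paper's method, and it does not implement the proposed over- and under-perception adjustment; it only produces the kind of raw saliency scores an explainee would interpret.
```python
# A minimal sketch (not from the paper) of occlusion-style saliency over text:
# each token's importance is the drop in the predicted positive-class
# probability when that token is removed. Assumes scikit-learn only.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Toy sentiment data; a real study would use a proper corpus and model.
texts = ["great film, loved it", "dreadful plot and awful acting",
         "loved the acting", "awful, boring film"]
labels = [1, 0, 1, 0]

clf = make_pipeline(TfidfVectorizer(), LogisticRegression()).fit(texts, labels)

def occlusion_saliency(sentence: str):
    """Return (token, saliency) pairs for the positive class."""
    tokens = sentence.split()
    base = clf.predict_proba([sentence])[0][1]
    scores = []
    for i in range(len(tokens)):
        occluded = " ".join(tokens[:i] + tokens[i + 1:])
        scores.append((tokens[i], base - clf.predict_proba([occluded])[0][1]))
    return scores

for token, score in occlusion_saliency("loved the film but awful plot"):
    print(f"{token:>6s}  {score:+.3f}")
```
Positive scores mark tokens that push the prediction towards the positive class; values of this kind are what the paper's human-interpretation study concerns.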
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of this information and is not responsible for any consequences of its use.