Explanation Beyond Intuition: A Testable Criterion for Inherent Explainability
- URL: http://arxiv.org/abs/2512.17316v1
- Date: Fri, 19 Dec 2025 07:59:36 GMT
- Authors: Michael Merry, Pat Riddle, Jim Warren
- Abstract summary: Inherent explainability is the gold standard in Explainable Artificial Intelligence (XAI). There is not a consistent definition or test to demonstrate inherent explainability. We propose a globally applicable criterion for inherent explainability.
- Score: 0.2580765958706854
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Inherent explainability is the gold standard in Explainable Artificial Intelligence (XAI). However, there is not a consistent definition or test to demonstrate inherent explainability. Work to date either characterises explainability through metrics, or appeals to intuition - "we know it when we see it". We propose a globally applicable criterion for inherent explainability. The criterion uses graph theory for representing and decomposing models for structure-local explanation, and recomposing them into global explanations. We form the structure-local explanations as annotations, a verifiable hypothesis-evidence structure that allows for a range of explanatory methods to be used. This criterion matches existing intuitions on inherent explainability, and provides justifications why a large regression model may not be explainable but a sparse neural network could be. We differentiate explainable -- a model that allows for explanation -- and explained -- one that has a verified explanation. Finally, we provide a full explanation of PREDICT -- a Cox proportional hazards model of cardiovascular disease risk, which is in active clinical use in New Zealand. It follows that PREDICT is inherently explainable. This work provides structure to formalise other work on explainability, and allows regulators a flexible but rigorous test that can be used in compliance frameworks.
Related papers
- Counterfactual explainability and analysis of variance [3.4895986723227383]
Existing tools for explaining complex models and systems are associational rather than causal. We propose a new notion called counterfactual explainability for causal attribution.
arXiv Detail & Related papers (2024-11-03T16:29:09Z)
- Verifying Relational Explanations: A Probabilistic Approach [2.113770213797994]
We develop an approach where we assess the uncertainty in explanations generated by GNNExplainer.
We learn a factor graph model to quantify uncertainty in an explanation.
Our results on several datasets show that our approach can help verify explanations from GNNExplainer.
arXiv Detail & Related papers (2024-01-05T08:14:51Z)
- Formal Proofs as Structured Explanations: Proposing Several Tasks on Explainable Natural Language Inference [0.16317061277457]
We propose a reasoning framework that can model the reasoning process underlying natural language inferences. The framework is based on the semantic tableau method, a well-studied proof system in formal logic. We show how it can be used to define natural language reasoning tasks with structured explanations.
arXiv Detail & Related papers (2023-11-15T01:24:09Z)
- Evaluating the Robustness of Interpretability Methods through Explanation Invariance and Equivariance [72.50214227616728]
Interpretability methods are valuable only if their explanations faithfully describe the explained model.
We consider neural networks whose predictions are invariant under a specific symmetry group.
arXiv Detail & Related papers (2023-04-13T17:59:03Z)
- A Theoretical Framework for AI Models Explainability with Application in Biomedicine [3.5742391373143474]
We propose a novel definition of explanation that is a synthesis of what can be found in the literature.
We fit explanations into the properties of faithfulness (i.e., the explanation being a true description of the model's inner workings and decision-making process) and plausibility (i.e., how convincing the explanation looks to the user).
arXiv Detail & Related papers (2022-12-29T20:05:26Z)
- MetaLogic: Logical Reasoning Explanations with Fine-Grained Structure [129.8481568648651]
We propose a benchmark to investigate models' logical reasoning capabilities in complex real-life scenarios.
Based on the multi-hop chain of reasoning, the explanation form includes three main components.
We evaluate the current best models' performance on this new explanation form.
arXiv Detail & Related papers (2022-10-22T16:01:13Z)
- NELLIE: A Neuro-Symbolic Inference Engine for Grounded, Compositional, and Explainable Reasoning [59.16962123636579]
This paper proposes a new take on Prolog-based inference engines.
We replace handcrafted rules with a combination of neural language modeling, guided generation, and semi dense retrieval.
Our implementation, NELLIE, is the first system to demonstrate fully interpretable, end-to-end grounded QA.
arXiv Detail & Related papers (2022-09-16T00:54:44Z)
- Logical Satisfiability of Counterfactuals for Faithful Explanations in NLI [60.142926537264714]
We introduce the methodology of Faithfulness-through-Counterfactuals.
It generates a counterfactual hypothesis based on the logical predicates expressed in the explanation.
It then evaluates if the model's prediction on the counterfactual is consistent with that expressed logic.
arXiv Detail & Related papers (2022-05-25T03:40:59Z)
- Explanatory Paradigms in Neural Networks [18.32369721322249]
We present a leap-forward expansion to the study of explainability in neural networks by considering explanations as answers to reasoning-based questions.
The answers to these questions are observed correlations, observed counterfactuals, and observed contrastive explanations respectively.
The term observed refers to the specific case of post-hoc explainability, when an explanatory technique explains the decision $P$ after a trained neural network has made the decision $P$.
arXiv Detail & Related papers (2022-02-24T00:22:11Z)
- Leakage-Adjusted Simulatability: Can Models Generate Non-Trivial Explanations of Their Behavior in Natural Language? [86.60613602337246]
We introduce a leakage-adjusted simulatability (LAS) metric for evaluating NL explanations.
LAS measures how well explanations help an observer predict a model's output, while controlling for how explanations can directly leak the output.
We frame explanation generation as a multi-agent game and optimize explanations for simulatability while penalizing label leakage.
arXiv Detail & Related papers (2020-10-08T16:59:07Z)
- The Struggles of Feature-Based Explanations: Shapley Values vs. Minimal Sufficient Subsets [61.66584140190247]
We show that feature-based explanations pose problems even for explaining trivial models.
We show that two popular classes of explainers, Shapley explainers and minimal sufficient subsets explainers, target fundamentally different types of ground-truth explanations.
arXiv Detail & Related papers (2020-09-23T09:45:23Z)