Logic Explanation of AI Classifiers by Categorical Explaining Functors
- URL: http://arxiv.org/abs/2503.16203v1
- Date: Thu, 20 Mar 2025 14:50:06 GMT
- Title: Logic Explanation of AI Classifiers by Categorical Explaining Functors
- Authors: Stefano Fioravanti, Francesco Giannini, Paolo Frazzetto, Fabio Zanasi, Pietro Barbiero
- Abstract summary: We propose a theoretically grounded approach to ensure coherence and fidelity of extracted explanations. As a proof of concept, we validate the proposed theoretical constructions on a synthetic benchmark.
- Score: 5.311276815905217
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The most common methods in explainable artificial intelligence are post-hoc techniques which identify the most relevant features used by pretrained opaque models. Some of the most advanced post-hoc methods can generate explanations that account for the mutual interactions of input features in the form of logic rules. However, these methods frequently fail to guarantee the consistency of the extracted explanations with the model's underlying reasoning. To bridge this gap, we propose a theoretically grounded approach to ensure coherence and fidelity of the extracted explanations, moving beyond the limitations of current heuristic-based approaches. To this end, drawing from category theory, we introduce an explaining functor which structurally preserves logical entailment between the explanation and the opaque model's reasoning. As a proof of concept, we validate the proposed theoretical constructions on a synthetic benchmark, verifying that the proposed approach significantly mitigates the generation of contradictory or unfaithful explanations.
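The entailment-preservation property the abstract describes can be illustrated with a minimal, hypothetical check over binary concepts. The model, concept names, and rule below are illustrative assumptions, not the paper's categorical construction:

```python
from itertools import product

# Hypothetical opaque model over three binary concepts (c1, c2, c3).
def model(c1, c2, c3):
    return int((c1 and c2) or c3)

# Candidate logic-rule explanation extracted for the positive class.
def explanation(c1, c2, c3):
    return int(c1 and c2)

# Entailment check: whenever the rule fires, the model must also
# predict the positive class (explanation => model).  A counterexample
# would make the explanation contradictory with the model's reasoning.
counterexamples = [x for x in product([0, 1], repeat=3)
                   if explanation(*x) == 1 and model(*x) == 0]
print("entailment preserved:", not counterexamples)  # → True
```

A heuristic extractor offers no such guarantee in general; the paper's contribution is a functor that enforces this preservation structurally rather than by exhaustive checking.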
Related papers
- Dialogue-based Explanations for Logical Reasoning using Structured Argumentation [0.06138671548064355]
We identify structural weaknesses of the state-of-the-art and propose a generic argumentation-based approach to address these problems.
Our work provides dialogue models as dialectic-proof procedures to compute and explain a query answer.
This allows us to construct dialectical proof trees as explanations, which are more expressive and arguably more intuitive than existing explanation formalisms.
arXiv Detail & Related papers (2025-02-16T22:26:18Z) - DiConStruct: Causal Concept-based Explanations through Black-Box Distillation [9.735426765564474]
We present DiConStruct, an explanation method that is both concept-based and causal.
Our explainer works as a distillation model to any black-box machine learning model by approximating its predictions while producing the respective explanations.
arXiv Detail & Related papers (2024-01-16T17:54:02Z) - An Axiomatic Approach to Model-Agnostic Concept Explanations [67.84000759813435]
We propose an approach to concept explanations that satisfy three natural axioms: linearity, recursivity, and similarity.
We then establish connections with previous concept explanation methods, offering insight into their varying semantic meanings.
arXiv Detail & Related papers (2024-01-12T20:53:35Z) - Abductive Commonsense Reasoning Exploiting Mutually Exclusive Explanations [118.0818807474809]
Abductive reasoning aims to find plausible explanations for an event.
Existing approaches for abductive reasoning in natural language processing often rely on manually generated annotations for supervision.
This work proposes an approach for abductive commonsense reasoning that exploits the fact that only a subset of explanations is correct for a given context.
arXiv Detail & Related papers (2023-05-24T01:35:10Z) - MetaLogic: Logical Reasoning Explanations with Fine-Grained Structure [129.8481568648651]
We propose a benchmark to investigate models' logical reasoning capabilities in complex real-life scenarios.
Based on the multi-hop chain of reasoning, the explanation form includes three main components.
We evaluate the current best models' performance on this new explanation form.
arXiv Detail & Related papers (2022-10-22T16:01:13Z) - Logical Satisfiability of Counterfactuals for Faithful Explanations in NLI [60.142926537264714]
We introduce the methodology of Faithfulness-through-Counterfactuals.
It generates a counterfactual hypothesis based on the logical predicates expressed in the explanation.
It then evaluates if the model's prediction on the counterfactual is consistent with that expressed logic.
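This counterfactual consistency test can be sketched with a toy model; the predicate names and decision logic below are illustrative assumptions, not the paper's actual methodology:

```python
# Toy NLI-style model over two boolean predicates extracted from an
# explanation, e.g. "premise asserts P" (p) and "hypothesis negates P"
# (q).  All names are illustrative.
def model(p, q):
    return "contradiction" if p and q else "entailment"

original = (True, True)
pred = model(*original)

# Counterfactual: flip the predicate the explanation claims is decisive.
counterfactual = (original[0], not original[1])
cf_pred = model(*counterfactual)

# Faithfulness: the prediction on the counterfactual should change in
# exactly the way the explanation's logic dictates.
faithful = (pred == "contradiction") and (cf_pred == "entailment")
print("faithful:", faithful)  # → True
```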
arXiv Detail & Related papers (2022-05-25T03:40:59Z) - A Formalisation of Abstract Argumentation in Higher-Order Logic [77.34726150561087]
We present an approach for representing abstract argumentation frameworks based on an encoding into classical higher-order logic.
This provides a uniform framework for computer-assisted assessment of abstract argumentation frameworks using interactive and automated reasoning tools.
arXiv Detail & Related papers (2021-10-18T10:45:59Z) - Entropy-based Logic Explanations of Neural Networks [24.43410365335306]
We propose an end-to-end differentiable approach for extracting logic explanations from neural networks.
The method relies on an entropy-based criterion which automatically identifies the most relevant concepts.
We consider four different case studies to demonstrate that: (i) this entropy-based criterion enables the distillation of concise logic explanations in safety-critical domains from clinical data to computer vision; (ii) the proposed approach outperforms state-of-the-art white-box models in terms of classification accuracy.
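An entropy-based relevance criterion of this kind can be sketched as follows; the concept names, scores, and uniform threshold are illustrative assumptions, not the paper's exact formulation:

```python
import math

# Hypothetical relevance scores for four candidate concepts.
scores = {"fever": 2.0, "cough": 1.5, "age": 0.1, "noise": -1.0}

# Softmax-normalise the scores into a distribution over concepts.
z = sum(math.exp(s) for s in scores.values())
weights = {c: math.exp(s) / z for c, s in scores.items()}

# Entropy of the distribution: low entropy means the model focuses on
# few concepts, which yields a concise logic explanation.
entropy = -sum(w * math.log(w) for w in weights.values())

# Keep only concepts above a uniform-relevance threshold (1/n).
relevant = [c for c, w in weights.items() if w > 1.0 / len(weights)]
print(sorted(relevant))  # → ['cough', 'fever']
```

Minimising such an entropy term during training pushes weight onto a small concept subset, from which a short rule can then be distilled.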
arXiv Detail & Related papers (2021-06-12T15:50:47Z) - Generating Commonsense Explanation by Extracting Bridge Concepts from Reasoning Paths [128.13034600968257]
We propose a method that first extracts the underlying concepts which serve as bridges in the reasoning chain.
To facilitate the reasoning process, we utilize external commonsense knowledge to build the connection between a statement and the bridge concepts.
We design a bridge concept extraction model that first scores the triples, routes the paths in the subgraph, and further selects bridge concepts with weak supervision.
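The triple-scoring step can be illustrated with a toy sketch; the triple store, statement terms, and scoring rule below are illustrative assumptions, not the paper's model:

```python
# Rank candidate bridge concepts by how many knowledge-graph triples
# connect them to terms in the statement.  The tiny triple store is
# purely illustrative.
triples = [
    ("penguin", "IsA", "bird"),
    ("penguin", "CapableOf", "swim"),
    ("bird", "CapableOf", "fly"),
    ("fish", "CapableOf", "swim"),
]

statement_terms = {"penguin", "fly"}

# Score a concept by the number of triples linking it to the statement.
def score(concept):
    return sum(1 for h, _, t in triples
               if (h == concept and t in statement_terms)
               or (t == concept and h in statement_terms))

candidates = {"bird", "swim", "fish"}
best = max(candidates, key=score)
print(best)  # → bird
```

In the paper this scoring is learned with weak supervision over paths in a commonsense knowledge graph rather than counted directly.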
arXiv Detail & Related papers (2020-09-24T15:27:20Z) - Plausible Reasoning about EL-Ontologies using Concept Interpolation [27.314325986689752]
We propose an inductive mechanism which is based on a clear model-theoretic semantics, and can thus be tightly integrated with standard deductive reasoning.
We focus on interpolation, a powerful commonsense reasoning mechanism which is closely related to cognitive models of category-based induction.
arXiv Detail & Related papers (2020-06-25T14:19:41Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.