Enhancing XAI Narratives through Multi-Narrative Refinement and Knowledge Distillation
- URL: http://arxiv.org/abs/2510.03134v2
- Date: Mon, 13 Oct 2025 13:50:02 GMT
- Title: Enhancing XAI Narratives through Multi-Narrative Refinement and Knowledge Distillation
- Authors: Flavio Giorgi, Matteo Silvestri, Cesare Campagnano, Fabrizio Silvestri, Gabriele Tolomei
- Abstract summary: Counterfactual explanations offer insights into model behavior by highlighting minimal changes that would alter a prediction. Despite their potential, these explanations are often complex and technical, making them difficult for non-experts to interpret. We propose a novel pipeline that leverages Language Models, large and small, to compose narratives for counterfactual explanations.
- Score: 13.523610021268363
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Explainable Artificial Intelligence has become a crucial area of research, aiming to demystify the decision-making processes of deep learning models. Among various explainability techniques, counterfactual explanations have proven particularly promising, as they offer insights into model behavior by highlighting minimal changes that would alter a prediction. Despite their potential, these explanations are often complex and technical, making them difficult for non-experts to interpret. To address this challenge, we propose a novel pipeline that leverages Language Models, large and small, to compose narratives for counterfactual explanations. We employ knowledge distillation techniques along with a refining mechanism to enable Small Language Models to perform comparably to their larger counterparts while maintaining robust reasoning abilities. In addition, we introduce a simple but effective evaluation method to assess natural language narratives, designed to verify whether the models' responses are in line with the factual, counterfactual ground truth. As a result, our proposed pipeline enhances both the reasoning capabilities and practical performance of student models, making them more suitable for real-world use cases.
Related papers
- Counterfactual Training: Teaching Models Plausible and Actionable Explanations [52.967743166658984]
We propose a novel training regime termed counterfactual training to increase the explanatory capacity of models. Counterfactual explanations have emerged as a popular post-hoc explanation method for opaque machine learning models.
arXiv Detail & Related papers (2026-01-22T18:56:14Z)
- Exploring Energy Landscapes for Minimal Counterfactual Explanations: Applications in Cybersecurity and Beyond [3.6963146054309597]
Counterfactual explanations have emerged as a prominent method in Explainable Artificial Intelligence (XAI). We present a novel framework that integrates perturbation theory and statistical mechanics to generate minimal counterfactual explanations. Our approach systematically identifies the smallest modifications required to change a model's prediction while maintaining plausibility.
arXiv Detail & Related papers (2025-03-23T19:48:37Z)
- Explainable artificial intelligence (XAI): from inherent explainability to large language models [0.0]
Explainable AI (XAI) techniques facilitate the explainability or interpretability of machine learning models. This paper details the advancements of explainable AI methods, from inherently interpretable models to modern approaches. We review explainable AI techniques that leverage vision-language model (VLM) frameworks to automate or improve the explainability of other machine learning models.
arXiv Detail & Related papers (2025-01-17T06:16:57Z)
- Natural Language Counterfactual Explanations for Graphs Using Large Language Models [7.560731917128082]
We exploit the power of open-source Large Language Models to generate natural language explanations. We show that our approach effectively produces accurate natural language representations of counterfactual instances.
arXiv Detail & Related papers (2024-10-11T23:06:07Z)
- Explainability for Large Language Models: A Survey [59.67574757137078]
Large language models (LLMs) have demonstrated impressive capabilities in natural language processing.
This paper introduces a taxonomy of explainability techniques and provides a structured overview of methods for explaining Transformer-based language models.
arXiv Detail & Related papers (2023-09-02T22:14:26Z)
- Probing via Prompting [71.7904179689271]
This paper introduces a novel model-free approach to probing, by formulating probing as a prompting task.
We conduct experiments on five probing tasks and show that our approach is comparable or better at extracting information than diagnostic probes.
We then examine the usefulness of a specific linguistic property for pre-training by removing the heads that are essential to that property and evaluating the resulting model's performance on language modeling.
arXiv Detail & Related papers (2022-07-04T22:14:40Z)
- Towards Interpretable Deep Reinforcement Learning Models via Inverse Reinforcement Learning [27.841725567976315]
We propose a novel framework utilizing Adversarial Inverse Reinforcement Learning.
This framework provides global explanations for decisions made by a Reinforcement Learning model.
We capture intuitive tendencies that the model follows by summarizing the model's decision-making process.
arXiv Detail & Related papers (2022-03-30T17:01:59Z)
- Beyond Explaining: Opportunities and Challenges of XAI-Based Model Improvement [75.00655434905417]
Explainable Artificial Intelligence (XAI) is an emerging research field bringing transparency to highly complex machine learning (ML) models.
This paper offers a comprehensive overview of techniques that apply XAI practically for improving various properties of ML models.
We show empirically through experiments on toy and realistic settings how explanations can help improve properties such as model generalization ability or reasoning.
arXiv Detail & Related papers (2022-03-15T15:44:28Z)
- Beyond Trivial Counterfactual Explanations with Diverse Valuable Explanations [64.85696493596821]
In computer vision applications, generative counterfactual methods indicate how to perturb a model's input to change its prediction.
We propose a counterfactual method that learns a perturbation in a disentangled latent space that is constrained using a diversity-enforcing loss.
Our model improves the success rate of producing high-quality valuable explanations when compared to previous state-of-the-art methods.
arXiv Detail & Related papers (2021-03-18T12:57:34Z)
- Leap-Of-Thought: Teaching Pre-Trained Models to Systematically Reason Over Implicit Knowledge [96.92252296244233]
Large pre-trained language models (LMs) acquire some reasoning capacity, but this ability is difficult to control.
We show that LMs can be trained to reliably perform systematic reasoning combining both implicit, pre-trained knowledge and explicit natural language statements.
Our work paves a path towards open-domain systems that constantly improve by interacting with users who can instantly correct a model by adding simple natural language statements.
arXiv Detail & Related papers (2020-06-11T17:02:20Z)
This list is automatically generated from the titles and abstracts of the papers in this site.