Exploring SAIG Methods for an Objective Evaluation of XAI
- URL: http://arxiv.org/abs/2602.08715v1
- Date: Mon, 09 Feb 2026 14:24:46 GMT
- Title: Exploring SAIG Methods for an Objective Evaluation of XAI
- Authors: Miquel Miró-Nicolau, Gabriel Moyà-Alcover, Anna Arias-Duart,
- Abstract summary: This paper presents the first review and analysis of Synthetic Artificial Intelligence Ground truth (SAIG) methods. We introduce a novel taxonomy to classify these approaches, identifying seven key features that distinguish different SAIG methods. Our comparative study reveals a concerning lack of consensus on the most effective XAI evaluation techniques.
- Score: 3.2935489377782705
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: The evaluation of eXplainable Artificial Intelligence (XAI) methods is a rapidly growing field, characterized by a wide variety of approaches. This diversity highlights the complexity of the XAI evaluation, which, unlike traditional AI assessment, lacks a universally correct ground truth for the explanation, making objective evaluation challenging. One promising direction to address this issue involves the use of what we term Synthetic Artificial Intelligence Ground truth (SAIG) methods, which generate artificial ground truths to enable the direct evaluation of XAI techniques. This paper presents the first review and analysis of SAIG methods. We introduce a novel taxonomy to classify these approaches, identifying seven key features that distinguish different SAIG methods. Our comparative study reveals a concerning lack of consensus on the most effective XAI evaluation techniques, underscoring the need for further research and standardization in this area.
Related papers
- Beyond Explainable AI (XAI): An Overdue Paradigm Shift and Post-XAI Research Directions [95.59915390053588]
This study examines Explainable Artificial Intelligence (XAI) approaches, focusing on deep neural networks (DNNs) and large language models (LLMs). We discuss critical symptoms that stem from deeper root causes (i.e., two paradoxes, two conceptual confusions, and five false assumptions). To move beyond XAI's limitations, we propose a four-pronged paradigm shift toward reliable and certified AI development.
arXiv Detail & Related papers (2026-02-27T16:58:27Z)
- The next question after Turing's question: Introducing the Grow-AI test [51.56484100374058]
This study aims to extend the framework for assessing artificial intelligence, called GROW-AI. GROW-AI is designed to answer the question "Can machines grow up?" -- a natural successor to the Turing Test. The originality of the work lies in the conceptual transposition of the process of "growing" from the human world to that of artificial intelligence.
arXiv Detail & Related papers (2025-08-22T10:19:42Z)
- Navigating the Maze of Explainable AI: A Systematic Approach to Evaluating Methods and Metrics [10.045644410833402]
We introduce LATEC, a large-scale benchmark that critically evaluates 17 prominent XAI methods using 20 distinct metrics. We showcase the high risk of conflicting metrics leading to unreliable rankings and consequently propose a more robust evaluation scheme. LATEC reinforces its role in future XAI research by publicly releasing all 326k saliency maps and 378k metric scores as a (meta-evaluation) dataset.
arXiv Detail & Related papers (2024-09-25T09:07:46Z)
- OpenHEXAI: An Open-Source Framework for Human-Centered Evaluation of Explainable Machine Learning [43.87507227859493]
This paper presents OpenHEXAI, an open-source framework for human-centered evaluation of XAI methods.
OpenHEXAI is the first large-scale infrastructural effort to facilitate human-centered benchmarks of XAI methods.
arXiv Detail & Related papers (2024-02-20T22:17:59Z)
- SIDU-TXT: An XAI Algorithm for NLP with a Holistic Assessment Approach [14.928572140620245]
The 'Similarity Difference and Uniqueness' (SIDU) XAI method, recognized for its superior capability in localizing entire salient regions in image-based classification, is extended to textual data.
The extended method, SIDU-TXT, utilizes feature activation maps from 'black-box' models to generate heatmaps at a granular, word-based level.
We find that, in a sentiment analysis task on a movie review dataset, SIDU-TXT excels in both functionally grounded and human-grounded evaluations.
arXiv Detail & Related papers (2024-02-05T14:29:54Z)
- How much informative is your XAI? A decision-making assessment task to objectively measure the goodness of explanations [53.01494092422942]
The number and complexity of personalised and user-centred approaches to XAI have rapidly grown in recent years.
It emerged that user-centred approaches to XAI positively affect the interaction between users and systems.
We propose an assessment task to objectively and quantitatively measure the goodness of XAI systems.
arXiv Detail & Related papers (2023-12-07T15:49:39Z)
- Assessing Fidelity in XAI post-hoc techniques: A Comparative Study with Ground Truth Explanations Datasets [0.0]
XAI methods based on the backpropagation of output information to input yield higher accuracy and reliability.
However, backpropagation methods tend to generate noisier saliency maps.
These findings have significant implications for the advancement of XAI methods.
arXiv Detail & Related papers (2023-11-03T14:57:24Z)
- An Experimental Investigation into the Evaluation of Explainability Methods [60.54170260771932]
This work compares 14 different metrics when applied to nine state-of-the-art XAI methods and three dummy methods (e.g., random saliency maps) used as references.
Experimental results show which of these metrics produce highly correlated results, indicating potential redundancy.
arXiv Detail & Related papers (2023-05-25T08:07:07Z)
- Towards Human Cognition Level-based Experiment Design for Counterfactual Explanations (XAI) [68.8204255655161]
The emphasis of XAI research appears to have turned to a more pragmatic explanation approach for better understanding.
An extensive area where cognitive science research may substantially influence XAI advancements is evaluating user knowledge and feedback.
We propose a framework to experiment with generating and evaluating the explanations on the grounds of different cognitive levels of understanding.
arXiv Detail & Related papers (2022-10-31T19:20:22Z)
- Connecting Algorithmic Research and Usage Contexts: A Perspective of Contextualized Evaluation for Explainable AI [65.44737844681256]
A lack of consensus on how to evaluate explainable AI (XAI) hinders the advancement of the field.
We argue that one way to close the gap is to develop evaluation methods that account for different user requirements.
arXiv Detail & Related papers (2022-06-22T05:17:33Z)
- Data Representing Ground-Truth Explanations to Evaluate XAI Methods [0.0]
Explainable artificial intelligence (XAI) methods are currently evaluated with approaches that mostly originated in interpretable machine learning (IML) research.
We propose to represent explanations with canonical equations that can be used to evaluate the accuracy of XAI methods.
arXiv Detail & Related papers (2020-11-18T16:54:53Z)
This list is automatically generated from the titles and abstracts of the papers on this site.