A Unified Framework for Evaluating the Effectiveness and Enhancing the Transparency of Explainable AI Methods in Real-World Applications
- URL: http://arxiv.org/abs/2412.03884v2
- Date: Tue, 15 Jul 2025 17:10:45 GMT
- Title: A Unified Framework for Evaluating the Effectiveness and Enhancing the Transparency of Explainable AI Methods in Real-World Applications
- Authors: Md. Ariful Islam, Md Abrar Jahin, M. F. Mridha, Nilanjan Dey
- Abstract summary: This study introduces a single evaluation framework for XAI. It uses both numbers and user feedback to check if the explanations are correct, easy to understand, fair, complete, and reliable. We show the value of this framework through case studies in healthcare, finance, farming, and self-driving systems.
- Score: 2.0681376988193843
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The fast growth of deep learning has brought great progress in AI-based applications. However, these models are often seen as "black boxes," which makes them hard to understand, explain, or trust. Explainable Artificial Intelligence (XAI) tries to make AI decisions clearer so that people can understand how and why the model makes certain choices. Even though many studies have focused on XAI, there is still a lack of standard ways to measure how well these explanation methods work in real-world situations. This study introduces a single evaluation framework for XAI. It uses both numbers and user feedback to check if the explanations are correct, easy to understand, fair, complete, and reliable. The framework focuses on users' needs and different application areas, which helps improve the trust and use of AI in important fields. To fix problems in current evaluation methods, we propose clear steps, including loading data, creating explanations, and fully testing them. We also suggest setting common benchmarks. We show the value of this framework through case studies in healthcare, finance, farming, and self-driving systems. These examples prove that our method can support fair and trustworthy evaluation of XAI methods. This work gives a clear and practical way to improve transparency and trust in AI systems used in the real world.
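A minimal sketch of the kind of evaluation loop the abstract describes: load data, generate explanations, and score them against several criteria before aggregating. The occlusion attributions and the two toy criteria below (fidelity and stability, standing in for the paper's correctness, understandability, fairness, completeness, and reliability checks) are illustrative assumptions, not the authors' implementation.

```python
# Illustrative sketch of a unified XAI evaluation loop (not the authors' code).
# Assumptions: occlusion attributions as the explanation method under test, and
# two toy criteria (fidelity, stability) standing in for the paper's full set.
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

def occlusion_attribution(model, x, background):
    """Prediction drop when each feature is replaced by its background mean."""
    base = model.predict_proba(x.reshape(1, -1))[0, 1]
    attr = np.zeros_like(x)
    for j in range(len(x)):
        x_occ = x.copy()
        x_occ[j] = background[j]
        attr[j] = base - model.predict_proba(x_occ.reshape(1, -1))[0, 1]
    return attr

def fidelity(model, x, attr, background, k=5):
    """Removing the top-k attributed features should move the prediction more than removing random ones."""
    base = model.predict_proba(x.reshape(1, -1))[0, 1]
    top = np.argsort(-np.abs(attr))[:k]
    x_top = x.copy(); x_top[top] = background[top]
    drop_top = abs(base - model.predict_proba(x_top.reshape(1, -1))[0, 1])
    rnd = np.random.default_rng(0).choice(len(x), k, replace=False)
    x_rnd = x.copy(); x_rnd[rnd] = background[rnd]
    drop_rnd = abs(base - model.predict_proba(x_rnd.reshape(1, -1))[0, 1])
    return drop_top - drop_rnd          # positive: explanation beats random

def stability(model, x, attr, background, eps=1e-2):
    """Cosine similarity of attributions under a small input perturbation."""
    x_pert = x + eps * np.random.default_rng(1).standard_normal(len(x))
    attr_p = occlusion_attribution(model, x_pert, background)
    return float(attr @ attr_p / (np.linalg.norm(attr) * np.linalg.norm(attr_p) + 1e-12))

# Step 1: load data and train the model to be explained
X, y = load_breast_cancer(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
model = RandomForestClassifier(random_state=0).fit(X_tr, y_tr)
background = X_tr.mean(axis=0)

# Step 2: generate explanations; Step 3: score them on each criterion
scores = {"fidelity": [], "stability": []}
for x in X_te[:20]:                     # small sample keeps the sketch fast
    attr = occlusion_attribution(model, x, background)
    scores["fidelity"].append(fidelity(model, x, attr, background))
    scores["stability"].append(stability(model, x, attr, background))
print({name: float(np.mean(vals)) for name, vals in scores.items()})
```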
Related papers
- Evaluation Framework for AI Systems in "the Wild" [37.48117853114386]
Generative AI (GenAI) models have become vital across industries, yet current evaluation methods have not adapted to their widespread use.
Traditional evaluations often rely on benchmarks and fixed datasets, frequently failing to reflect real-world performance.
This white paper proposes a comprehensive framework for how we should evaluate real-world GenAI systems.
arXiv Detail & Related papers (2025-04-23T14:52:39Z) - A Multi-Layered Research Framework for Human-Centered AI: Defining the Path to Explainability and Trust [2.4578723416255754]
Human-Centered AI (HCAI) emphasizes alignment with human values, while Explainable AI (XAI) enhances transparency by making AI decisions more understandable.
This paper presents a novel three-layered framework that bridges HCAI and XAI to establish a structured explainability paradigm.
Our findings advance Human-Centered Explainable AI (HCXAI), fostering AI systems that are transparent, adaptable, and ethically aligned.
arXiv Detail & Related papers (2025-04-14T01:29:30Z) - General Scales Unlock AI Evaluation with Explanatory and Predictive Power [57.7995945974989]
Benchmarking has guided progress in AI, but it has offered limited explanatory and predictive power for general-purpose AI systems.
We introduce general scales for AI evaluation that can explain what common AI benchmarks really measure.
Our fully-automated methodology builds on 18 newly-crafted rubrics that place instance demands on general scales that do not saturate.
arXiv Detail & Related papers (2025-03-09T01:13:56Z) - VirtualXAI: A User-Centric Framework for Explainability Assessment Leveraging GPT-Generated Personas [0.07499722271664146]
The demand for eXplainable AI (XAI) has increased to enhance the interpretability, transparency, and trustworthiness of AI models.
We propose a framework that integrates quantitative benchmarking with qualitative user assessments through virtual personas.
This yields an estimated XAI score and provides tailored recommendations for both the optimal AI model and the XAI method for a given scenario.
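A toy sketch of the aggregation idea: combine quantitative metric scores with persona-style ratings into one estimated XAI score. The metric names, ratings, and weights below are invented for the example and are not taken from the paper.

```python
# Toy aggregation of quantitative metrics and persona ratings into one XAI score.
# Metric names, ratings, and weights are invented for this example.
import numpy as np

quantitative = {"fidelity": 0.82, "stability": 0.74, "sparsity": 0.61}   # in [0, 1]
persona_ratings = {"clinician": 4, "data_scientist": 5, "regulator": 3}  # 1-5 Likert

quant_score = np.mean(list(quantitative.values()))
qual_score = (np.mean(list(persona_ratings.values())) - 1) / 4           # rescale to [0, 1]

w_quant, w_qual = 0.6, 0.4               # assumed weighting, not from the paper
xai_score = w_quant * quant_score + w_qual * qual_score
print(f"estimated XAI score: {xai_score:.2f}")
```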
arXiv Detail & Related papers (2025-03-06T09:44:18Z) - A Survey on Human-Centered Evaluation of Explainable AI Methods in Clinical Decision Support Systems [45.89954090414204]
This paper provides a survey of human-centered evaluations of Explainable AI methods in Clinical Decision Support Systems.
Our findings reveal key challenges in the integration of XAI into healthcare and propose a structured framework to align the evaluation methods of XAI with the clinical needs of stakeholders.
arXiv Detail & Related papers (2025-02-14T01:21:29Z) - Explainable Artificial Intelligence: A Survey of Needs, Techniques, Applications, and Future Direction [5.417632175667161]
Explainable Artificial Intelligence (XAI) addresses the opacity of black-box models by providing explanations for how they make decisions and predictions.
Existing studies have examined the fundamental concepts of XAI, its general principles, and the scope of XAI techniques.
This paper provides a comprehensive literature review encompassing common terminologies and definitions, the need for XAI, beneficiaries of XAI, a taxonomy of XAI methods, and the application of XAI methods in different application areas.
arXiv Detail & Related papers (2024-08-30T21:42:17Z) - SCENE: Evaluating Explainable AI Techniques Using Soft Counterfactuals [0.0]
This paper introduces SCENE (Soft Counterfactual Evaluation for Natural language Explainability), a novel evaluation method.
By focusing on token-based substitutions, SCENE creates contextually appropriate and semantically meaningful Soft Counterfactuals.
SCENE provides valuable insights into the strengths and limitations of various XAI techniques.
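A small sketch of the soft-counterfactual idea: swap individual tokens for plausible alternatives and compare the resulting prediction shift with the importance an attribution method assigned to them. The substitution table and the lexicon-based toy classifier are stand-ins; SCENE itself derives contextually appropriate substitutes.

```python
# Sketch of a soft-counterfactual check: substitute single tokens with plausible
# alternatives and compare the prediction shift with the importance an attribution
# method assigned to that token. The substitution table and lexicon classifier are
# stand-ins; SCENE derives contextually appropriate substitutes from the text itself.
import numpy as np

SUBSTITUTES = {"great": "decent", "terrible": "mediocre", "love": "like"}
LEXICON = {"great": 1.0, "love": 0.8, "decent": 0.3, "like": 0.3,
           "terrible": -1.0, "mediocre": -0.4}

def toy_sentiment(tokens):
    """Stand-in classifier: crude lexicon score squashed into [0, 1]."""
    return 1 / (1 + np.exp(-sum(LEXICON.get(t, 0.0) for t in tokens)))

def soft_counterfactual_effects(tokens, attributions):
    base = toy_sentiment(tokens)
    effects = {}
    for i, tok in enumerate(tokens):
        if tok in SUBSTITUTES:
            cf = list(tokens)
            cf[i] = SUBSTITUTES[tok]      # soft, meaning-preserving swap
            effects[tok] = (abs(base - toy_sentiment(cf)), attributions[i])
    return effects                        # per token: (prediction shift, claimed importance)

tokens = ["i", "love", "this", "great", "phone"]
attributions = [0.0, 0.5, 0.05, 0.4, 0.05]   # e.g. from SHAP or LIME on a real model
print(soft_counterfactual_effects(tokens, attributions))
```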
arXiv Detail & Related papers (2024-08-08T16:36:24Z) - Robustness of Explainable Artificial Intelligence in Industrial Process Modelling [43.388607981317016]
We evaluate current XAI methods by scoring them based on ground truth simulations and sensitivity analysis.
We show the differences between XAI methods in their ability to correctly predict the true sensitivity of the modeled industrial process.
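A sketch of how such scoring can work when the process is known analytically: compare an attribution computed on a surrogate model against ground-truth sensitivities obtained from the simulator. The process function, surrogate model, and occlusion attribution are illustrative choices, not the paper's setup.

```python
# Sketch of scoring an attribution against ground-truth sensitivities when the
# process is known analytically.
import numpy as np
from scipy.stats import spearmanr
from sklearn.ensemble import GradientBoostingRegressor

def process(x):                               # known simulator: feature 0 dominates
    return 3.0 * x[:, 0] + 1.0 * x[:, 1] ** 2 + 0.1 * x[:, 2]

rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(2000, 4))        # fourth feature is pure noise
model = GradientBoostingRegressor(random_state=0).fit(X, process(X))

x0 = np.full((1, 4), 0.5)
eps = 1e-3
true_sens = np.array([                        # ground truth via finite differences
    (process(x0 + eps * np.eye(4)[j]) - process(x0 - eps * np.eye(4)[j]))[0] / (2 * eps)
    for j in range(4)
])

background = X.mean(axis=0)
attr = np.array([                             # occlusion attribution on the surrogate
    model.predict(x0)[0] - model.predict(np.where(np.eye(4)[j:j + 1] == 1, background, x0))[0]
    for j in range(4)
])

rho, _ = spearmanr(np.abs(true_sens), np.abs(attr))
print("rank agreement with true sensitivity:", round(float(rho), 2))
```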
arXiv Detail & Related papers (2024-07-12T09:46:26Z) - Explainable AI for Enhancing Efficiency of DL-based Channel Estimation [1.0136215038345013]
Support for artificial intelligence-based decision-making is a key element in future 6G networks.
In such applications, using AI as black-box models is risky and challenging.
We propose a novel XAI-CHEST framework oriented toward channel estimation in wireless communications.
arXiv Detail & Related papers (2024-07-09T16:24:21Z) - Can you trust your explanations? A robustness test for feature attribution methods [42.36530107262305]
The field of Explainable AI (XAI) has seen rapid growth, but the use of its techniques has at times led to unexpected results.
We show how leveraging the manifold hypothesis and ensemble approaches can benefit an in-depth analysis of robustness.
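A hedged sketch of the ensemble side of this idea: retrain nominally equivalent models on bootstrap resamples and ask whether they agree on which features matter. The model, attribution (impurity importances), and agreement statistic are assumptions for the example, not the authors' protocol.

```python
# Sketch of an ensemble-style robustness check for feature attributions.
import numpy as np
from itertools import combinations
from scipy.stats import spearmanr
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier

X, y = load_breast_cancer(return_X_y=True)
rng = np.random.default_rng(0)

importances = []
for seed in range(5):                          # ensemble of retrained models
    idx = rng.integers(0, len(X), len(X))      # bootstrap resample
    m = RandomForestClassifier(n_estimators=100, random_state=seed).fit(X[idx], y[idx])
    importances.append(m.feature_importances_)

# Pairwise rank agreement of the attributions across the ensemble
agreements = [spearmanr(a, b)[0] for a, b in combinations(importances, 2)]
print("mean rank agreement:", round(float(np.mean(agreements)), 2))
```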
arXiv Detail & Related papers (2024-06-20T14:17:57Z) - Explainable AI for Safe and Trustworthy Autonomous Driving: A Systematic Review [12.38351931894004]
We present the first systematic literature review of explainable methods for safe and trustworthy autonomous driving.
We identify five key contributions of XAI for safe and trustworthy AI in AD, which are interpretable design, interpretable surrogate models, interpretable monitoring, auxiliary explanations, and interpretable validation.
We propose a modular framework called SafeX to integrate these contributions, enabling explanation delivery to users while simultaneously ensuring the safety of AI models.
arXiv Detail & Related papers (2024-02-08T09:08:44Z) - Pangu-Agent: A Fine-Tunable Generalist Agent with Structured Reasoning [50.47568731994238]
A key method for creating Artificial Intelligence (AI) agents is Reinforcement Learning (RL).
This paper presents a general framework model for integrating and learning structured reasoning into AI agents' policies.
arXiv Detail & Related papers (2023-12-22T17:57:57Z) - Evaluating General-Purpose AI with Psychometrics [43.85432514910491]
We discuss the need for a comprehensive and accurate evaluation of general-purpose AI systems such as large language models.
Current evaluation methodology, mostly based on benchmarks of specific tasks, falls short of adequately assessing these versatile AI systems.
To tackle these challenges, we suggest transitioning from task-oriented evaluation to construct-oriented evaluation.
arXiv Detail & Related papers (2023-10-25T05:38:38Z) - Survey of Trustworthy AI: A Meta Decision of AI [0.41292255339309647]
Trusting an opaque system involves deciding on the level of Trustworthy AI (TAI).
To underpin these domains, we create ten dimensions to measure trust, including explainability/transparency, fairness/diversity, generalizability, privacy, data governance, safety/robustness, accountability, reliability, and sustainability.
arXiv Detail & Related papers (2023-06-01T06:25:01Z) - The Meta-Evaluation Problem in Explainable AI: Identifying Reliable Estimators with MetaQuantus [10.135749005469686]
One of the unsolved challenges in the field of Explainable AI (XAI) is determining how to most reliably estimate the quality of an explanation method.
We address this issue through a meta-evaluation of different quality estimators in XAI.
Our novel framework, MetaQuantus, analyses two complementary performance characteristics of a quality estimator.
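The meta-evaluation idea can be illustrated with a toy check: instead of scoring explanations, score the scorer, for example by asking how consistently a quality estimator separates an informative explanation from a shuffled one across repeated noisy runs. The estimator and criterion below are invented for the sketch and are not MetaQuantus itself.

```python
# Toy illustration of meta-evaluation: score the scorer, not the explanation.
import numpy as np

rng = np.random.default_rng(0)
true_importance = np.array([0.6, 0.3, 0.1, 0.0, 0.0])   # hidden ground truth

def quality_estimator(attr, noise_scale=0.1):
    """Toy estimator: noisy correlation with the hidden true importance."""
    return float(np.corrcoef(attr, true_importance)[0, 1]) + rng.normal(0.0, noise_scale)

informative = np.array([0.55, 0.35, 0.05, 0.03, 0.02])
shuffled = rng.permutation(informative)

wins = sum(quality_estimator(informative) > quality_estimator(shuffled) for _ in range(100))
print(f"estimator ranks the informative explanation higher in {wins}/100 runs")
```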
arXiv Detail & Related papers (2023-02-14T18:59:02Z) - Towards Reconciling Usability and Usefulness of Explainable AI Methodologies [2.715884199292287]
Black-box AI systems can lead to liability and accountability issues when they produce an incorrect decision.
Explainable AI (XAI) seeks to bridge the knowledge gap between developers and end-users.
arXiv Detail & Related papers (2023-01-13T01:08:49Z) - Seamful XAI: Operationalizing Seamful Design in Explainable AI [59.89011292395202]
Mistakes in AI systems are inevitable, arising from both technical limitations and sociotechnical gaps.
We propose that seamful design can foster AI explainability by revealing sociotechnical and infrastructural mismatches.
We explore this process with 43 AI practitioners and real end-users.
arXiv Detail & Related papers (2022-11-12T21:54:05Z) - Towards Human Cognition Level-based Experiment Design for Counterfactual Explanations (XAI) [68.8204255655161]
The emphasis of XAI research appears to have shifted toward more pragmatic explanation approaches that support better understanding.
An extensive area where cognitive science research may substantially influence XAI advancements is evaluating user knowledge and feedback.
We propose a framework to experiment with generating and evaluating the explanations on the grounds of different cognitive levels of understanding.
arXiv Detail & Related papers (2022-10-31T19:20:22Z) - Responsibility: An Example-based Explainable AI approach via Training Process Inspection [1.4610038284393165]
We present a novel XAI approach that identifies the most responsible training example for a particular decision.
This example can then be shown as an explanation: "this is what I (the AI) learned that led me to do that."
Our results demonstrate that responsibility can help improve accuracy for both human end users and secondary ML models.
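A brute-force sketch of the "most responsible training example" idea: retrain without each training point and see which removal most changes the model's confidence in the explained decision. The paper inspects the training process directly; leave-one-out retraining on a small dataset is used here only as an illustration.

```python
# Leave-one-out sketch of identifying the most responsible training example.
import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression

X, y = load_iris(return_X_y=True)
x_query = X[0:1]                                   # the decision to be explained

base = LogisticRegression(max_iter=1000).fit(X, y)
pred_class = base.predict(x_query)[0]
base_conf = base.predict_proba(x_query)[0, pred_class]

shifts = []
for i in range(len(X)):                            # leave-one-out retraining
    keep = np.arange(len(X)) != i
    m = LogisticRegression(max_iter=1000).fit(X[keep], y[keep])
    shifts.append(abs(base_conf - m.predict_proba(x_query)[0, pred_class]))

print("most responsible training example:", int(np.argmax(shifts)))
```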
arXiv Detail & Related papers (2022-09-07T19:30:01Z) - A User-Centred Framework for Explainable Artificial Intelligence in Human-Robot Interaction [70.11080854486953]
We propose a user-centred framework for XAI that focuses on its social-interactive aspect.
The framework aims to provide a structure for interactive XAI solutions thought for non-expert users.
arXiv Detail & Related papers (2021-09-27T09:56:23Z) - Counterfactual Explanations as Interventions in Latent Space [62.997667081978825]
Counterfactual explanations aim to provide end users with a set of features that need to be changed in order to achieve a desired outcome.
Current approaches rarely take into account the feasibility of actions needed to achieve the proposed explanations.
We present Counterfactual Explanations as Interventions in Latent Space (CEILS), a methodology to generate counterfactual explanations.
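A rough sketch of intervening in a latent space to find a counterfactual: encode the instance, shift one latent coordinate, decode, and check whether the predicted class flips. CEILS builds its latent representation from the causal structure of the data; PCA is used here only as a crude stand-in to make the mechanics concrete.

```python
# Sketch of a counterfactual search by intervening on a latent coordinate.
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)
X = StandardScaler().fit_transform(X)
clf = LogisticRegression(max_iter=1000).fit(X, y)
pca = PCA(n_components=5).fit(X)                   # stand-in latent space, not CEILS's causal encoding

x = X[0]                                           # instance to explain
z = pca.transform(x.reshape(1, -1))[0]
target = 1 - clf.predict(x.reshape(1, -1))[0]      # flip the predicted class

for step in np.linspace(-3, 3, 121):               # walk along one latent direction
    z_cf = z.copy()
    z_cf[0] += step                                # intervention on a latent factor
    x_cf = pca.inverse_transform(z_cf.reshape(1, -1))
    if clf.predict(x_cf)[0] == target:
        print("counterfactual found at latent shift", round(float(step), 2))
        break
else:
    print("no counterfactual along this latent direction")
```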
arXiv Detail & Related papers (2021-06-14T20:48:48Z) - An interdisciplinary conceptual study of Artificial Intelligence (AI) for helping benefit-risk assessment practices: Towards a comprehensive qualification matrix of AI programs and devices (pre-print 2020) [55.41644538483948]
This paper proposes a comprehensive analysis of existing concepts coming from different disciplines tackling the notion of intelligence.
The aim is to identify shared notions or discrepancies to consider for qualifying AI systems.
arXiv Detail & Related papers (2021-05-07T12:01:31Z) - TrustyAI Explainability Toolkit [1.0499611180329804]
We will look at how TrustyAI can support trust in decision services and predictive models.
We investigate techniques such as LIME, SHAP and counterfactuals.
We also look into an extended version of SHAP that allows the background data used in the evaluation to be selected.
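A sketch of why background data selection matters for SHAP: KernelSHAP attributions are computed relative to a background dataset, so different backgrounds can change which features look important. The k-means summarisation below is one common choice and is an illustration, not TrustyAI's own implementation.

```python
# Comparing SHAP attributions under two different background datasets.
import numpy as np
import shap
from sklearn.cluster import KMeans
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier

X, y = load_breast_cancer(return_X_y=True)
model = RandomForestClassifier(random_state=0).fit(X, y)

# Two candidate backgrounds: k-means centres of the full data vs. the first ten rows
background_kmeans = KMeans(n_clusters=10, n_init=10, random_state=0).fit(X).cluster_centers_
background_head = X[:10]

x = X[0:1]
for name, bg in [("kmeans", background_kmeans), ("head", background_head)]:
    explainer = shap.KernelExplainer(lambda d: model.predict_proba(d)[:, 1], bg)
    values = explainer.shap_values(x, nsamples=200)
    print(name, "top feature index:", int(np.argmax(np.abs(values[0]))))
```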
arXiv Detail & Related papers (2021-04-26T17:00:32Z) - A Methodology for Creating AI FactSheets [67.65802440158753]
This paper describes a methodology for creating the form of AI documentation we call FactSheets.
Within each step of the methodology, we describe the issues to consider and the questions to explore.
This methodology will accelerate the broader adoption of transparent AI documentation.
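As a rough illustration only, a FactSheet-style record can be thought of as a structured collection of facts about a model; the field names and values below are hypothetical, and the paper's methodology derives the actual content from the questions each stakeholder needs answered.

```python
# Hypothetical sketch of a FactSheet-style record as a plain data structure.
factsheet = {
    "purpose": "predict 30-day readmission risk to prioritise follow-up calls",
    "intended_users": ["care managers"],
    "training_data": {"source": "hospital EHR extract (example)", "rows": 120_000},
    "performance": {"auroc": 0.81, "evaluation_split": "temporal holdout"},
    "fairness": {"metric": "equal opportunity difference", "value": 0.03},
    "explainability": "per-patient feature attributions shown in the UI",
    "limitations": ["not validated outside the originating site"],
}
print(factsheet["purpose"])
```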
arXiv Detail & Related papers (2020-06-24T15:08:59Z)