A Quantitative Evaluation Framework for Explainable AI in Semantic Segmentation
- URL: http://arxiv.org/abs/2510.24414v3
- Date: Tue, 04 Nov 2025 11:54:57 GMT
- Title: A Quantitative Evaluation Framework for Explainable AI in Semantic Segmentation
- Authors: Reem Hammoud, Abdul Karim Gizzini, Ali J. Ghandour,
- Abstract summary: This work introduces a comprehensive quantitative evaluation framework for assessing explainable AI (XAI) in semantic segmentation.<n>The framework integrates pixel-level evaluation strategies with carefully designed metrics to yield fine-grained interpretability insights.<n>These findings advance the development of transparent, trustworthy, and accountable semantic segmentation models.
- Score: 0.20999222360659606
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Ensuring transparency and trust in artificial intelligence (AI) models is essential as they are increasingly deployed in safety-critical and high-stakes domains. Explainable AI (XAI) has emerged as a promising approach to address this challenge; however, the rigorous evaluation of XAI methods remains vital for balancing the trade-offs between model complexity, predictive performance, and interpretability. While substantial progress has been made in evaluating XAI for classification tasks, strategies tailored to semantic segmentation remain limited. Moreover, objectively assessing XAI approaches is difficult, since qualitative visual explanations provide only preliminary insights. Such qualitative methods are inherently subjective and cannot ensure the accuracy or stability of explanations. To address these limitations, this work introduces a comprehensive quantitative evaluation framework for assessing XAI in semantic segmentation, accounting for both spatial and contextual task complexities. The framework systematically integrates pixel-level evaluation strategies with carefully designed metrics to yield fine-grained interpretability insights. Simulation results using recently adapted class activation mapping (CAM)-based XAI schemes demonstrate the efficiency, robustness, and reliability of the proposed methodology. These findings advance the development of transparent, trustworthy, and accountable semantic segmentation models.
Related papers
- Detecting Cybersecurity Threats by Integrating Explainable AI with SHAP Interpretability and Strategic Data Sampling [0.0]
The framework addresses three fundamental challenges in deploying AI for threat detection.<n>Our approach maintains detection efficacy while reducing computational overhead.<n>It provides a robust foundation for deploying trustworthy AI systems in security operations centers.
arXiv Detail & Related papers (2026-02-22T08:01:14Z) - Novel Category Discovery with X-Agent Attention for Open-Vocabulary Semantic Segmentation [48.806000388608005]
We propose X-Agent, an innovative OVSS framework employing latent semantic-aware agent'' to orchestrate cross-modal attention mechanisms.<n>X-Agent achieves state-of-the-art performance while effectively enhancing the latent semantic saliency.
arXiv Detail & Related papers (2025-09-01T09:01:58Z) - Unifying VXAI: A Systematic Review and Framework for the Evaluation of Explainable AI [4.715895520943978]
Explainable AI (XAI) addresses this issue by providing human-understandable explanations of model behavior.<n>Despite the growing number of XAI methods, the field lacks standardized evaluation protocols and consensus on appropriate metrics.<n>We introduce a unified framework for the eValuation of XAI (VXAI)
arXiv Detail & Related papers (2025-06-18T12:25:37Z) - Tuning for Trustworthiness -- Balancing Performance and Explanation Consistency in Neural Network Optimization [49.567092222782435]
We introduce the novel concept of XAI consistency, defined as the agreement among different feature attribution methods.<n>We create a multi-objective optimization framework that balances predictive performance with explanation.<n>Our research provides a foundation for future investigations into whether models from the trade-off zone-balancing performance loss and XAI consistency-exhibit greater robustness.
arXiv Detail & Related papers (2025-05-12T13:19:14Z) - FakeScope: Large Multimodal Expert Model for Transparent AI-Generated Image Forensics [66.14786900470158]
We propose FakeScope, an expert multimodal model (LMM) tailored for AI-generated image forensics.<n>FakeScope identifies AI-synthetic images with high accuracy and provides rich, interpretable, and query-driven forensic insights.<n>FakeScope achieves state-of-the-art performance in both closed-ended and open-ended forensic scenarios.
arXiv Detail & Related papers (2025-03-31T16:12:48Z) - VirtualXAI: A User-Centric Framework for Explainability Assessment Leveraging GPT-Generated Personas [0.07499722271664146]
The demand for eXplainable AI (XAI) has increased to enhance the interpretability, transparency, and trustworthiness of AI models.<n>We propose a framework that integrates quantitative benchmarking with qualitative user assessments through virtual personas.<n>This yields an estimated XAI score and provides tailored recommendations for both the optimal AI model and the XAI method for a given scenario.
arXiv Detail & Related papers (2025-03-06T09:44:18Z) - SEOE: A Scalable and Reliable Semantic Evaluation Framework for Open Domain Event Detection [70.23196257213829]
We propose a scalable and reliable Semantic-level Evaluation framework for Open domain Event detection.<n>Our proposed framework first constructs a scalable evaluation benchmark that currently includes 564 event types covering 7 major domains.<n>We then leverage large language models (LLMs) as automatic evaluation agents to compute a semantic F1-score, incorporating fine-grained definitions of semantically similar labels.
arXiv Detail & Related papers (2025-03-05T09:37:05Z) - A Unified Framework for Evaluating the Effectiveness and Enhancing the Transparency of Explainable AI Methods in Real-World Applications [2.0681376988193843]
This study introduces a single evaluation framework for XAI.<n>It uses both numbers and user feedback to check if the explanations are correct, easy to understand, fair, complete, and reliable.<n>We show the value of this framework through case studies in healthcare, finance, farming, and self-driving systems.
arXiv Detail & Related papers (2024-12-05T05:30:10Z) - SCENE: Evaluating Explainable AI Techniques Using Soft Counterfactuals [0.0]
This paper introduces SCENE (Soft Counterfactual Evaluation for Natural language Explainability), a novel evaluation method.
By focusing on token-based substitutions, SCENE creates contextually appropriate and semantically meaningful Soft Counterfactuals.
SCENE provides valuable insights into the strengths and limitations of various XAI techniques.
arXiv Detail & Related papers (2024-08-08T16:36:24Z) - How Well Do Text Embedding Models Understand Syntax? [50.440590035493074]
The ability of text embedding models to generalize across a wide range of syntactic contexts remains under-explored.
Our findings reveal that existing text embedding models have not sufficiently addressed these syntactic understanding challenges.
We propose strategies to augment the generalization ability of text embedding models in diverse syntactic scenarios.
arXiv Detail & Related papers (2023-11-14T08:51:00Z) - Extending CAM-based XAI methods for Remote Sensing Imagery Segmentation [7.735470452949379]
We introduce a new XAI evaluation methodology and metric based on "Entropy" to measure the model uncertainty.
We show that using Entropy to monitor the model uncertainty in segmenting the pixels within the target class is more suitable.
arXiv Detail & Related papers (2023-10-03T07:01:23Z) - REX: Rapid Exploration and eXploitation for AI Agents [103.68453326880456]
We propose an enhanced approach for Rapid Exploration and eXploitation for AI Agents called REX.
REX introduces an additional layer of rewards and integrates concepts similar to Upper Confidence Bound (UCB) scores, leading to more robust and efficient AI agent performance.
arXiv Detail & Related papers (2023-07-18T04:26:33Z) - The Meta-Evaluation Problem in Explainable AI: Identifying Reliable
Estimators with MetaQuantus [10.135749005469686]
One of the unsolved challenges in the field of Explainable AI (XAI) is determining how to most reliably estimate the quality of an explanation method.
We address this issue through a meta-evaluation of different quality estimators in XAI.
Our novel framework, MetaQuantus, analyses two complementary performance characteristics of a quality estimator.
arXiv Detail & Related papers (2023-02-14T18:59:02Z) - Counterfactual Explanations as Interventions in Latent Space [62.997667081978825]
Counterfactual explanations aim to provide to end users a set of features that need to be changed in order to achieve a desired outcome.
Current approaches rarely take into account the feasibility of actions needed to achieve the proposed explanations.
We present Counterfactual Explanations as Interventions in Latent Space (CEILS), a methodology to generate counterfactual explanations.
arXiv Detail & Related papers (2021-06-14T20:48:48Z) - Uncertainty as a Form of Transparency: Measuring, Communicating, and
Using Uncertainty [66.17147341354577]
We argue for considering a complementary form of transparency by estimating and communicating the uncertainty associated with model predictions.
We describe how uncertainty can be used to mitigate model unfairness, augment decision-making, and build trustworthy systems.
This work constitutes an interdisciplinary review drawn from literature spanning machine learning, visualization/HCI, design, decision-making, and fairness.
arXiv Detail & Related papers (2020-11-15T17:26:14Z) - SAMBA: Safe Model-Based & Active Reinforcement Learning [59.01424351231993]
SAMBA is a framework for safe reinforcement learning that combines aspects from probabilistic modelling, information theory, and statistics.
We evaluate our algorithm on a variety of safe dynamical system benchmarks involving both low and high-dimensional state representations.
We provide intuition as to the effectiveness of the framework by a detailed analysis of our active metrics and safety constraints.
arXiv Detail & Related papers (2020-06-12T10:40:46Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.