Agentic Explainable Artificial Intelligence (Agentic XAI) Approach To Explore Better Explanation
- URL: http://arxiv.org/abs/2512.21066v1
- Date: Wed, 24 Dec 2025 09:19:15 GMT
- Title: Agentic Explainable Artificial Intelligence (Agentic XAI) Approach To Explore Better Explanation
- Authors: Tomoaki Yamaguchi, Yutong Zhou, Masahiro Ryo, Keisuke Katsura
- Abstract summary: This study proposes an agentic XAI framework combining SHAP-based explainability with multimodal LLM-driven iterative refinement. We tested this framework as an agricultural recommendation system using rice yield data from 26 fields in Japan.
- Score: 7.268064183717186
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Explainable artificial intelligence (XAI) enables data-driven understanding of factor associations with response variables, yet communicating XAI outputs to laypersons remains challenging, hindering trust in AI-based predictions. Large language models (LLMs) have emerged as promising tools for translating technical explanations into accessible narratives, yet the integration of agentic AI, where LLMs operate as autonomous agents through iterative refinement, with XAI remains unexplored. This study proposes an agentic XAI framework combining SHAP-based explainability with multimodal LLM-driven iterative refinement to generate progressively enhanced explanations. As a use case, we tested this framework as an agricultural recommendation system using rice yield data from 26 fields in Japan. The agentic XAI system initially provided a SHAP result and then iteratively explored how to improve the explanation through additional analysis across 11 refinement rounds (Rounds 0-10). Explanations were evaluated by human experts (crop scientists) (n=12) and LLMs (n=14) against seven metrics: Specificity, Clarity, Conciseness, Practicality, Contextual Relevance, Cost Consideration, and Crop Science Credibility. Both evaluator groups confirmed that the framework successfully enhanced recommendation quality, with an average score increase of 30-33% from Round 0, peaking at Rounds 3-4. However, excessive refinement caused a substantial drop in recommendation quality, indicating a bias-variance trade-off where early rounds lacked explanation depth (bias) while excessive iteration introduced verbosity and ungrounded abstraction (variance), as revealed by metric-specific analysis. These findings suggest that strategic early stopping (regularization) is needed for optimizing practical utility, challenging assumptions about monotonic improvement and providing evidence-based design principles for agentic XAI systems.
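The refine-evaluate-stop loop described in the abstract can be sketched in a few lines. Everything below is a hypothetical stand-in, not the paper's implementation: `refine` substitutes for the LLM-driven refinement step, and `evaluate` uses a toy quadratic curve merely to mimic the reported rise-then-fall quality pattern. Only the loop structure, with patience-based early stopping in place of running all rounds, reflects the design principle the paper argues for.

```python
def refine(explanation: str, round_idx: int) -> str:
    """Hypothetical stand-in for one LLM-driven refinement round."""
    return explanation + f" [refined round {round_idx}]"

def evaluate(explanation: str) -> float:
    """Hypothetical stand-in for the mean score over the seven metrics.
    The quadratic shape simulates quality peaking at an intermediate
    round and degrading under excessive refinement."""
    n_rounds = explanation.count("[refined")
    return 60.0 + 10.0 * n_rounds - 1.8 * n_rounds ** 2

def agentic_refinement(initial: str, max_rounds: int = 10, patience: int = 2):
    """Iteratively refine an explanation, keeping the best-scoring version
    and stopping early once the score stops improving (regularization)."""
    best, best_score = initial, evaluate(initial)
    current, stalled = initial, 0
    for r in range(1, max_rounds + 1):
        current = refine(current, r)
        score = evaluate(current)
        if score > best_score:
            best, best_score, stalled = current, score, 0
        else:
            stalled += 1
            if stalled >= patience:  # strategic early stopping
                break
    return best, best_score

best, score = agentic_refinement("SHAP-based explanation of rice yield drivers")
```

With the toy scoring curve above, the loop stops after two non-improving rounds and returns the Round-3 explanation, mirroring the paper's finding that quality peaked at intermediate rounds rather than improving monotonically.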
Related papers
- Inference-Time Scaling of Verification: Self-Evolving Deep Research Agents via Test-Time Rubric-Guided Verification [71.98473277917962]
Recent advances in Deep Research Agents (DRAs) are transforming automated knowledge discovery and problem-solving. We propose an alternative paradigm: self-evolving the agent's ability by iteratively verifying the policy model's outputs, guided by meticulously crafted rubrics. We present DeepVerifier, a rubrics-based outcome reward verifier that leverages the asymmetry of verification.
arXiv Detail & Related papers (2026-01-22T09:47:31Z)
- Repurposing Synthetic Data for Fine-grained Search Agent Supervision [81.95597592711688]
LLM-based search agents are increasingly trained on entity-centric synthetic data. Prevailing training methods discard this rich entity information, relying instead on sparse, outcome-based rewards. We introduce Entity-aware Group Relative Policy Optimization (E-GRPO), a novel framework that formulates a dense entity-aware reward function.
arXiv Detail & Related papers (2025-10-28T17:50:40Z)
- Towards Transparent AI: A Survey on Explainable Language Models [22.70051215800476]
Language Models (LMs) have significantly advanced natural language processing and enabled remarkable progress across diverse domains. Lack of transparency is particularly problematic for adoption in high-stakes domains. XAI methods have been well studied for non-LMs, but they face many limitations when applied to LMs.
arXiv Detail & Related papers (2025-09-25T21:47:39Z)
- Eigen-1: Adaptive Multi-Agent Refinement with Monitor-Based RAG for Scientific Reasoning [53.45095336430027]
We develop a unified framework that combines implicit retrieval and structured collaboration. On Humanity's Last Exam (HLE) Bio/Chem Gold, our framework achieves 48.3% accuracy. Results on SuperGPQA and TRQA confirm robustness across domains.
arXiv Detail & Related papers (2025-09-25T14:05:55Z)
- STARec: An Efficient Agent Framework for Recommender Systems via Autonomous Deliberate Reasoning [54.28691219536054]
We introduce STARec, a slow-thinking augmented agent framework that endows recommender systems with autonomous deliberative reasoning capabilities. We develop anchored reinforcement training, a two-stage paradigm combining structured knowledge distillation from advanced reasoning models with preference-aligned reward shaping. Experiments on MovieLens 1M and Amazon CDs benchmarks demonstrate that STARec achieves substantial performance gains compared with state-of-the-art baselines.
arXiv Detail & Related papers (2025-08-26T08:47:58Z)
- AIGI-Holmes: Towards Explainable and Generalizable AI-Generated Image Detection via Multimodal Large Language Models [78.08374249341514]
The rapid development of AI-generated content (AIGC) has led to the misuse of AI-generated images (AIGI) in spreading misinformation. We introduce a large-scale and comprehensive dataset, Holmes-Set, which includes an instruction-tuning dataset with explanations on whether images are AI-generated. Our work introduces an efficient data annotation method called the Multi-Expert Jury, enhancing data generation through structured MLLM explanations and quality control. In addition, we propose Holmes Pipeline, a meticulously designed three-stage training framework comprising visual expert pre-training, supervised fine-tuning, and direct preference optimization.
arXiv Detail & Related papers (2025-07-03T14:26:31Z)
- Unifying VXAI: A Systematic Review and Framework for the Evaluation of Explainable AI [4.715895520943978]
Explainable AI (XAI) addresses this issue by providing human-understandable explanations of model behavior. Despite the growing number of XAI methods, the field lacks standardized evaluation protocols and consensus on appropriate metrics. We introduce a unified framework for the eValuation of XAI (VXAI).
arXiv Detail & Related papers (2025-06-18T12:25:37Z)
- Mind the XAI Gap: A Human-Centered LLM Framework for Democratizing Explainable AI [3.301842921686179]
We introduce a framework that ensures transparency and human-centered explanations tailored to the needs of experts and non-experts. Our framework encapsulates in one response explanations understandable by non-experts and technical information for experts.
arXiv Detail & Related papers (2025-06-13T21:41:07Z)
- Divide-Then-Align: Honest Alignment based on the Knowledge Boundary of RAG [51.120170062795566]
We propose Divide-Then-Align (DTA) to endow RAG systems with the ability to respond with "I don't know" when the query is outside the knowledge boundary. DTA balances accuracy with appropriate abstention, enhancing the reliability and trustworthiness of retrieval-augmented systems.
arXiv Detail & Related papers (2025-05-27T08:21:21Z)
- Too Much to Trust? Measuring the Security and Cognitive Impacts of Explainability in AI-Driven SOCs [0.6990493129893112]
Explainable AI (XAI) holds significant promise for enhancing the transparency and trustworthiness of AI-driven threat detection. This study re-evaluates current explanation methods within security contexts and demonstrates that role-aware, context-rich XAI designs aligned with SOC needs can substantially improve practical utility.
arXiv Detail & Related papers (2025-03-03T21:39:15Z)
- Learning to Generate and Evaluate Fact-checking Explanations with Transformers [10.970249299147866]
This research contributes to the field of Explainable Artificial Intelligence (XAI).
We develop transformer-based fact-checking models that contextualise and justify their decisions by generating human-accessible explanations.
We emphasise the need for aligning Artificial Intelligence (AI)-generated explanations with human judgements.
arXiv Detail & Related papers (2024-10-21T06:22:51Z)
- The Meta-Evaluation Problem in Explainable AI: Identifying Reliable Estimators with MetaQuantus [10.135749005469686]
One of the unsolved challenges in the field of Explainable AI (XAI) is determining how to most reliably estimate the quality of an explanation method.
We address this issue through a meta-evaluation of different quality estimators in XAI.
Our novel framework, MetaQuantus, analyses two complementary performance characteristics of a quality estimator.
arXiv Detail & Related papers (2023-02-14T18:59:02Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented (including all listed papers) and is not responsible for any consequences of its use.