Don't Just Translate, Agitate: Using Large Language Models as Devil's Advocates for AI Explanations
- URL: http://arxiv.org/abs/2504.12424v1
- Date: Wed, 16 Apr 2025 18:45:18 GMT
- Title: Don't Just Translate, Agitate: Using Large Language Models as Devil's Advocates for AI Explanations
- Authors: Ashley Suh, Kenneth Alperin, Harry Li, Steven R Gomez
- Abstract summary: Large Language Models (LLMs) are used to translate outputs from explainability techniques, like feature-attribution weights, into a natural language explanation. Recent findings suggest translating into human-like explanations does not necessarily enhance user understanding and may instead lead to overreliance on AI systems.
- Score: 1.6855625805565164
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This position paper highlights a growing trend in Explainable AI (XAI) research where Large Language Models (LLMs) are used to translate outputs from explainability techniques, like feature-attribution weights, into a natural language explanation. While this approach may improve accessibility or readability for users, recent findings suggest that translating into human-like explanations does not necessarily enhance user understanding and may instead lead to overreliance on AI systems. When LLMs summarize XAI outputs without surfacing model limitations, uncertainties, or inconsistencies, they risk reinforcing the illusion of interpretability rather than fostering meaningful transparency. We argue that - instead of merely translating XAI outputs - LLMs should serve as constructive agitators, or devil's advocates, whose role is to actively interrogate AI explanations by presenting alternative interpretations, potential biases, training data limitations, and cases where the model's reasoning may break down. In this role, LLMs can facilitate users in engaging critically with AI systems and generated explanations, with the potential to reduce overreliance caused by misinterpreted or specious explanations.
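To make the proposed role concrete, here is a minimal illustrative sketch (not the paper's implementation) contrasting a plain "translator" prompt with a "devil's advocate" prompt over the same feature-attribution output. The feature names, attribution values, and the query_llm stub are all hypothetical placeholders.

```python
# Illustrative sketch only: contrasts a "translator" prompt with a
# "devil's advocate" prompt over the same feature-attribution output.
# Feature names, weights, and query_llm() are hypothetical placeholders.

# Hypothetical SHAP-style attribution weights for one loan-denial prediction.
attributions = {
    "income": -0.42,
    "credit_history_length": -0.31,
    "recent_inquiries": 0.18,
    "zip_code": 0.15,
}

def format_attributions(attrs: dict) -> str:
    """Render attribution weights as a bulleted list for the prompt."""
    return "\n".join(f"- {name}: {weight:+.2f}" for name, weight in attrs.items())

# A "translator" prompt: the pattern the paper cautions against when used alone.
translator_prompt = (
    "Summarize these feature-attribution weights as a plain-language "
    "explanation of the model's decision:\n"
    + format_attributions(attributions)
)

# A "devil's advocate" prompt: asks the LLM to interrogate the explanation.
agitator_prompt = (
    "Act as a devil's advocate for the following model explanation. "
    "Given these feature-attribution weights:\n"
    + format_attributions(attributions)
    + "\n\n"
    "1. Offer at least one alternative interpretation of the weights.\n"
    "2. Flag features (e.g., zip_code) that may proxy for bias or gaps "
    "in the training data.\n"
    "3. Describe inputs where this attribution-based reasoning could break down.\n"
    "Do not present the attributions as a definitive account of the model."
)

def query_llm(prompt: str) -> str:
    """Placeholder for any chat-completion call; echoes the prompt here."""
    return prompt  # swap in a real client to experiment

if __name__ == "__main__":
    print(query_llm(agitator_prompt))
```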
Related papers
- LLMs for Explainable AI: A Comprehensive Survey [0.7373617024876725]
Large Language Models (LLMs) offer a promising approach to enhancing Explainable AI (XAI). By transforming complex machine learning outputs into easy-to-understand narratives, LLMs can bridge the gap between sophisticated model behavior and human interpretability.
arXiv Detail & Related papers (2025-03-31T18:19:41Z)
- Explainable artificial intelligence (XAI): from inherent explainability to large language models [0.0]
Explainable AI (XAI) techniques facilitate the explainability or interpretability of machine learning models. This paper details the advancements of explainable AI methods, from inherently interpretable models to modern approaches. We review explainable AI techniques that leverage vision-language model (VLM) frameworks to automate or improve the explainability of other machine learning models.
arXiv Detail & Related papers (2025-01-17T06:16:57Z)
- LLMs for XAI: Future Directions for Explaining Explanations [50.87311607612179]
We focus on refining explanations computed using existing XAI algorithms.
Initial experiments and a user study suggest that LLMs offer a promising way to enhance the interpretability and usability of XAI.
arXiv Detail & Related papers (2024-05-09T19:17:47Z)
- Usable XAI: 10 Strategies Towards Exploiting Explainability in the LLM Era [77.174117675196]
XAI is being extended towards Large Language Models (LLMs).
This paper analyzes how XAI can benefit LLMs and AI systems.
We introduce 10 strategies, presenting the key techniques for each and discussing their associated challenges.
arXiv Detail & Related papers (2024-03-13T20:25:27Z)
- FaithLM: Towards Faithful Explanations for Large Language Models [67.29893340289779]
Large Language Models (LLMs) have become proficient in addressing complex tasks by leveraging their internal knowledge and reasoning capabilities.
The black-box nature of these models complicates the task of explaining their decision-making processes.
We introduce FaithLM to explain the decisions of LLMs with natural language (NL) explanations.
arXiv Detail & Related papers (2024-02-07T09:09:14Z)
- Rethinking Interpretability in the Era of Large Language Models [76.1947554386879]
Large language models (LLMs) have demonstrated remarkable capabilities across a wide array of tasks.
The capability to explain in natural language allows LLMs to expand the scale and complexity of patterns that can be conveyed to a human.
These new capabilities raise new challenges, such as hallucinated explanations and immense computational costs.
arXiv Detail & Related papers (2024-01-30T17:38:54Z)
- Towards Effective Disambiguation for Machine Translation with Large Language Models [65.80775710657672]
We study the capabilities of large language models to translate "ambiguous sentences".
Experiments show that our methods can match or outperform state-of-the-art systems such as DeepL and NLLB in four out of five language directions.
arXiv Detail & Related papers (2023-09-20T22:22:52Z)
- LMExplainer: Grounding Knowledge and Explaining Language Models [37.578973458651944]
Language models (LMs) like GPT-4 are important in AI applications, but their opaque decision-making process reduces user trust, especially in safety-critical areas.
We introduce LMExplainer, a novel knowledge-grounded explainer that clarifies the reasoning process of LMs through intuitive, human-understandable explanations.
arXiv Detail & Related papers (2023-03-29T08:59:44Z)
- Making Things Explainable vs Explaining: Requirements and Challenges under the GDPR [2.578242050187029]
ExplanatorY AI (YAI) builds on XAI with the goal of collecting and organizing explainable information.
We frame the problem of generating explanations for Automated Decision-Making systems (ADMs) as the identification of an appropriate path over an explanatory space.
arXiv Detail & Related papers (2021-10-02T08:48:47Z)
- Explainable AI without Interpretable Model [0.0]
It has become more important than ever for AI systems to be able to explain the reasoning behind their results to end-users.
Most Explainable AI (XAI) methods are based on extracting an interpretable model that can be used for producing explanations.
The notions of Contextual Importance and Utility (CIU) presented in this paper make it possible to produce human-like explanations of black-box outcomes directly; a minimal sketch of these notions appears after this list.
arXiv Detail & Related papers (2020-09-29T13:29:44Z)
- Explainability in Deep Reinforcement Learning [68.8204255655161]
We review recent works in the direction of attaining Explainable Reinforcement Learning (XRL).
In critical situations where it is essential to justify and explain the agent's behaviour, better explainability and interpretability of RL models could help gain scientific insight into the inner workings of what is still considered a black box.
arXiv Detail & Related papers (2020-08-15T10:11:42Z)
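As referenced in the "Explainable AI without Interpretable Model" entry above, here is a minimal sketch of one common formulation of Contextual Importance (CI) and Contextual Utility (CU), estimated by sweeping a single feature of a black-box model. The toy model, instance, and value ranges are hypothetical, and this is a general reading of CIU rather than that paper's exact method.

```python
# Minimal sketch of Contextual Importance (CI) and Contextual Utility (CU),
# estimated by sweeping one input feature while holding the others fixed.
# The toy model, instance, and ranges below are hypothetical.
import numpy as np

def black_box(x):
    """Stand-in for any opaque scorer that returns values in [0, 1]."""
    return 1.0 / (1.0 + np.exp(-(0.8 * x[0] - 0.5 * x[1])))

def ciu(model, x, j, lo, hi, out_min=0.0, out_max=1.0, steps=50):
    """Estimate CI and CU for feature j of instance x over the range [lo, hi]."""
    outputs = []
    for v in np.linspace(lo, hi, steps):
        x_mod = np.array(x, dtype=float)
        x_mod[j] = v
        outputs.append(model(x_mod))
    cmin, cmax = min(outputs), max(outputs)
    # CI: how much of the full output range this feature alone can move the output.
    ci = (cmax - cmin) / (out_max - out_min)
    # CU: where the current output sits within that feature-induced range.
    cu = (model(np.array(x, dtype=float)) - cmin) / (cmax - cmin + 1e-12)
    return ci, cu

ci, cu = ciu(black_box, [1.2, 0.4], j=0, lo=-3.0, hi=3.0)
print(f"CI = {ci:.2f}, CU = {cu:.2f}")
# High CI with high CU reads as "an important feature with a favorable value".
```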