Aligning Explanations with Human Communication
- URL: http://arxiv.org/abs/2505.15626v1
- Date: Wed, 21 May 2025 15:14:05 GMT
- Title: Aligning Explanations with Human Communication
- Authors: Jacopo Teneggi, Zhenzhen Wang, Paul H. Yi, Tianmin Shu, Jeremias Sulam
- Abstract summary: We propose an iterative procedure grounded in principles of pragmatic reasoning and the rational speech act to generate explanations that maximize communicative utility. We evaluate our method in image classification tasks, demonstrating improved alignment between explanations and listener preferences across three datasets.
- Score: 16.285213687701187
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Machine learning explainability aims to make the decision-making process of black-box models more transparent by finding the most important input features for a given prediction task. Recent works have proposed composing explanations from semantic concepts (e.g., colors, patterns, shapes) that are inherently interpretable to the user of a model. However, these methods generally ignore the communicative context of explanation: the ability of the user to understand the prediction of the model from the explanation. For example, while a medical doctor might understand an explanation in terms of clinical markers, a patient may need a more accessible explanation to make sense of the same diagnosis. In this paper, we address this gap with listener-adaptive explanations. We propose an iterative procedure grounded in principles of pragmatic reasoning and the rational speech act to generate explanations that maximize communicative utility. Our procedure only needs access to pairwise preferences between candidate explanations, which is relevant in real-world scenarios where a listener model may not be available. We evaluate our method in image classification tasks, demonstrating improved alignment between explanations and listener preferences across three datasets. Furthermore, we perform a user study that demonstrates that our explanations increase communicative utility.
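For readers unfamiliar with the rational speech act (RSA) framework the abstract builds on, the sketch below illustrates a generic RSA speaker recursion over candidate explanations, followed by a simple Bradley-Terry-style reweighting driven only by pairwise preference queries, mirroring the setting where no explicit listener model is available. This is a minimal illustration under assumed toy values (the meaning matrix, the rationality parameter `alpha`, the update rule, and the preference oracle are all invented here); it is not the authors' procedure.

```python
# Minimal sketch of RSA-style pragmatic reasoning over candidate explanations.
# NOT the paper's algorithm: all numbers and the preference oracle are toy assumptions.
import numpy as np

def softmax(x, axis=-1):
    z = x - x.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

# Rows: candidate explanations (utterances); columns: model predictions (world states).
# meaning[i, j] = how well explanation i literally supports prediction j (assumed toy values).
meaning = np.array([
    [1.0, 0.0, 0.0],   # e.g. "clinical marker A is present"
    [1.0, 1.0, 0.0],   # e.g. "the region looks abnormal"
    [0.0, 1.0, 1.0],   # e.g. "the texture is irregular"
])
prior = np.ones(3) / 3          # uniform prior over predictions
alpha = 4.0                     # speaker rationality (assumed)

# Literal listener: P_L0(prediction | explanation) proportional to meaning * prior.
L0 = meaning * prior
L0 = L0 / L0.sum(axis=1, keepdims=True)

# Pragmatic speaker: P_S1(explanation | prediction) proportional to exp(alpha * log P_L0).
S1 = softmax(alpha * np.log(L0.T + 1e-12), axis=1)
print("Speaker distribution over explanations, per prediction:")
print(S1.round(3))

# The paper's setting assumes access only to pairwise preferences between candidate
# explanations. A simple (hypothetical) stand-in: a Bradley-Terry-style score update
# from preference queries, used to re-rank candidates for a fixed prediction.
def rerank_by_preferences(candidates, prefers, rounds=50, lr=0.5, rng=None):
    """prefers(i, j) -> True if the listener prefers candidate i over candidate j."""
    if rng is None:
        rng = np.random.default_rng(0)
    scores = np.zeros(len(candidates))
    for _ in range(rounds):
        i, j = rng.choice(len(candidates), size=2, replace=False)
        win = 1.0 if prefers(i, j) else 0.0
        p_ij = 1.0 / (1.0 + np.exp(scores[j] - scores[i]))  # Bradley-Terry win probability
        scores[i] += lr * (win - p_ij)
        scores[j] -= lr * (win - p_ij)
    order = np.argsort(-scores)
    return [candidates[k] for k in order], scores

candidates = ["marker A present", "region looks abnormal", "texture irregular"]
# Toy preference oracle: prefer the explanation the speaker rates higher for prediction 0.
ranked, scores = rerank_by_preferences(candidates, lambda i, j: S1[0, i] > S1[0, j])
print("Preference-based ranking for prediction 0:", ranked)
```

In the paper's setting, the preference oracle would be a human or proxy listener judging which of two candidate explanations better conveys the model's prediction; the Bradley-Terry update above is just one plausible way to turn such judgments into a ranking.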
Related papers
- Evaluating the Utility of Model Explanations for Model Development [54.23538543168767]
We evaluate whether explanations can improve human decision-making in practical scenarios of machine learning model development.
To our surprise, we did not find evidence of significant improvement on tasks when users were provided with any of the saliency maps.
These findings suggest caution regarding the usefulness and potential for misunderstanding in saliency-based explanations.
arXiv Detail & Related papers (2023-12-10T23:13:23Z)
- Explanation Selection Using Unlabeled Data for Chain-of-Thought Prompting [80.9896041501715]
Explanations that have not been "tuned" for a task, such as off-the-shelf explanations written by nonexperts, may lead to mediocre performance.
This paper tackles the problem of how to optimize explanation-infused prompts in a blackbox fashion.
arXiv Detail & Related papers (2023-02-09T18:02:34Z)
- Human Interpretation of Saliency-based Explanation Over Text [65.29015910991261]
We study saliency-based explanations over textual data.
We find that people often mis-interpret the explanations.
We propose a method to adjust saliencies based on model estimates of over- and under-perception.
arXiv Detail & Related papers (2022-01-27T15:20:32Z)
- Explanation as a process: user-centric construction of multi-level and multi-modal explanations [0.34410212782758043]
We present a process-based approach that combines multi-level and multi-modal explanations.
We use Inductive Logic Programming, an interpretable machine learning approach, to learn a comprehensible model.
arXiv Detail & Related papers (2021-10-07T19:26:21Z)
- Contrastive Explanations for Model Interpretability [77.92370750072831]
We propose a methodology to produce contrastive explanations for classification models.
Our method is based on projecting model representation to a latent space.
Our findings shed light on the ability of label-contrastive explanations to provide a more accurate and finer-grained interpretability of a model's decision.
arXiv Detail & Related papers (2021-03-02T00:36:45Z)
- Evaluating Explanations: How much do explanations from the teacher aid students? [103.05037537415811]
We formalize the value of explanations using a student-teacher paradigm that measures the extent to which explanations improve student models in learning.
Unlike many prior proposals to evaluate explanations, our approach cannot be easily gamed, enabling principled, scalable, and automatic evaluation of attributions.
arXiv Detail & Related papers (2020-12-01T23:40:21Z)
- Explaining black-box text classifiers for disease-treatment information extraction [12.323983512532651]
A post-hoc explanation method can approximate the behavior of a black-box AI model.
By incorporating medical concepts and semantics into the explanation process, our explanator finds semantic relations between inputs and outputs.
arXiv Detail & Related papers (2020-10-21T09:58:00Z)
- The Struggles of Feature-Based Explanations: Shapley Values vs. Minimal Sufficient Subsets [61.66584140190247]
We show that feature-based explanations pose problems even for explaining trivial models.
We show that two popular classes of explainers, Shapley explainers and minimal sufficient subsets explainers, target fundamentally different types of ground-truth explanations (a toy contrast between the two is sketched after this list).
arXiv Detail & Related papers (2020-09-23T09:45:23Z)
- Sequential Explanations with Mental Model-Based Policies [20.64968620536829]
We apply a reinforcement learning framework to provide explanations based on the explainee's mental model.
We conduct novel online human experiments where explanations are selected and presented to participants.
Our results suggest that mental model-based policies may increase interpretability over multiple sequential explanations.
arXiv Detail & Related papers (2020-07-17T14:43:46Z)
- Explanations of Black-Box Model Predictions by Contextual Importance and Utility [1.7188280334580195]
We present the Contextual Importance (CI) and Contextual Utility (CU) concepts to extract explanations easily understandable by experts as well as novice users.
This method explains the prediction results without transforming the model into an interpretable one.
We show the utility of explanations on a car selection example and Iris flower classification by presenting complete (i.e., the causes of an individual prediction) and contrastive explanations.
arXiv Detail & Related papers (2020-05-30T06:49:50Z)
- LIMEtree: Consistent and Faithful Surrogate Explanations of Multiple Classes [7.031336702345381]
We introduce the novel paradigm of multi-class explanations.
We propose a local surrogate model based on multi-output regression trees, called LIMEtree.
On top of strong fidelity guarantees, our implementation delivers a range of diverse explanation types.
arXiv Detail & Related papers (2020-05-04T12:31:29Z)
- The Explanation Game: Towards Prediction Explainability through Sparse Communication [6.497816402045099]
We provide a unified perspective of explainability as a problem between an explainer and a layperson.
We use this framework to compare several prior approaches for extracting explanations.
We propose new embedded methods for explainability, through the use of selective, sparse attention.
arXiv Detail & Related papers (2020-04-28T22:27:19Z)
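To make concrete the claim in "The Struggles of Feature-Based Explanations" entry above, that Shapley explainers and minimal sufficient subsets explainers target fundamentally different ground truths, here is a toy contrast on a trivial two-feature OR model. The model, baseline, and input are assumptions made purely for illustration and are not taken from that paper.

```python
# Toy contrast between Shapley-value attributions and minimal sufficient subsets
# for a trivial model f(x1, x2) = x1 OR x2. All choices here (model, baseline,
# input) are illustrative assumptions, not examples from the cited paper.
from itertools import chain, combinations
from math import factorial

def f(x):
    return int(x[0] or x[1])  # trivial OR model

def subsets(features):
    return chain.from_iterable(combinations(features, r) for r in range(len(features) + 1))

def value(coalition, x, baseline=(0, 0)):
    # Evaluate f with features in `coalition` taken from x, the rest from the baseline.
    masked = tuple(x[i] if i in coalition else baseline[i] for i in range(len(x)))
    return f(masked)

def exact_shapley(x):
    n = len(x)
    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for S in subsets(others):
            w = factorial(len(S)) * factorial(n - len(S) - 1) / factorial(n)
            phi[i] += w * (value(set(S) | {i}, x) - value(set(S), x))
    return phi

def minimal_sufficient_subsets(x):
    full = f(x)
    suff = [set(S) for S in subsets(range(len(x))) if value(set(S), x) == full]
    return [S for S in suff if not any(T < S for T in suff)]

x = (1, 1)
print("Shapley values:", exact_shapley(x))                            # [0.5, 0.5]
print("Minimal sufficient subsets:", minimal_sufficient_subsets(x))   # [{0}, {1}]
```

For the input (1, 1), the Shapley values split credit equally (0.5 each), whereas each feature on its own forms a minimal sufficient subset, so the two explainer families already disagree about the "right" explanation even for this trivial model.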