Explanations are a means to an end
- URL: http://arxiv.org/abs/2506.22740v1
- Date: Sat, 28 Jun 2025 03:04:21 GMT
- Title: Explanations are a means to an end
- Authors: Jessica Hullman, Ziyang Guo, Berk Ustun
- Abstract summary: We argue that explanations should be designed and evaluated with a specific end in mind. We show how this functionally-grounded approach can be applied across diverse use cases.
- Score: 19.858633174167334
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Modern methods for explainable machine learning are designed to describe how models map inputs to outputs--without deep consideration of how these explanations will be used in practice. This paper argues that explanations should be designed and evaluated with a specific end in mind. We describe how to formalize this end in a framework based in statistical decision theory. We show how this functionally-grounded approach can be applied across diverse use cases, such as clinical decision support, providing recourse, or debugging. We demonstrate its use to characterize the maximum "boost" in performance on a particular task that an explanation could provide an idealized decision-maker, preventing misuse due to ambiguity by forcing researchers to specify concrete use cases that can be analyzed in light of models of expected explanation use. We argue that evaluation should meld theoretical and empirical perspectives on the value of explanation, and contribute definitions that span these perspectives.
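The "boost" described in the abstract can be read as a value-of-information quantity. The sketch below is a hypothetical illustration (the signals, loss, and data-generating process are all assumptions, not the paper's formalism): an idealized decision-maker best-responds to whatever they observe under 0-1 loss, and the boost is the drop in empirical Bayes risk when an explanation signal is observed alongside the model's score.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setup (hypothetical): binary state y, coarse model signal s, and an
# explanation signal e. An idealized decision-maker picks the action that
# minimizes expected 0-1 loss given what they observe; the "boost" is the
# drop in Bayes risk when e is observed alongside s.
n = 200_000
y = rng.integers(0, 2, size=n)                           # true state
s = (y + rng.normal(0, 1.0, size=n) > 0.5).astype(int)   # model's signal
e = (y + rng.normal(0, 0.5, size=n) > 0.5).astype(int)   # explanation signal

def bayes_risk(signals, y):
    """Empirical Bayes risk under 0-1 loss for a decision-maker who
    best-responds to each observed signal combination."""
    risk = 0.0
    for k in np.unique(signals, axis=0):
        mask = (signals == k).all(axis=1)
        p1 = y[mask].mean()                    # P(y=1 | signals=k)
        risk += mask.mean() * min(p1, 1 - p1)  # best action's error rate
    return risk

risk_s = bayes_risk(s[:, None], y)                  # score only
risk_se = bayes_risk(np.stack([s, e], axis=1), y)   # score + explanation
boost = risk_s - risk_se                            # value of the explanation
print(f"risk without explanation: {risk_s:.3f}")
print(f"risk with explanation:    {risk_se:.3f}")
print(f"boost: {boost:.3f}")
```

Because the decision-maker is idealized (a perfect Bayesian best-responder), this boost is an upper bound on what any real user could gain from the explanation on this task.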
Related papers
- Estimation of Concept Explanations Should be Uncertainty Aware [39.598213804572396]
We study a specific kind called Concept Explanations, where the goal is to interpret a model using human-understandable concepts.
Although popular for their easy interpretation, concept explanations are known to be noisy.
We propose an uncertainty-aware Bayesian estimation method to address these issues, which readily improved the quality of explanations.
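To make "uncertainty-aware" concrete, here is a minimal sketch, not the paper's estimator: a conjugate Bayesian linear model over concept-importance weights, so each concept gets a credible interval instead of a single noisy point estimate. The concept activations, priors, and noise level are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical setup: explain model outputs y from concept activations C
# via Bayesian linear regression, reporting posterior uncertainty per concept.
n, k = 500, 3
C = rng.normal(size=(n, k))                  # concept activations (assumed given)
w_true = np.array([1.5, 0.0, -0.8])
y = C @ w_true + rng.normal(0, 1.0, size=n)  # model outputs being explained

sigma2, tau2 = 1.0, 10.0                     # noise / prior variances (assumed)
# Posterior for w under y ~ N(Cw, sigma2 I), w ~ N(0, tau2 I):
cov = np.linalg.inv(C.T @ C / sigma2 + np.eye(k) / tau2)
mean = cov @ C.T @ y / sigma2
std = np.sqrt(np.diag(cov))

for j in range(k):
    lo, hi = mean[j] - 1.96 * std[j], mean[j] + 1.96 * std[j]
    print(f"concept {j}: weight {mean[j]:+.2f}, 95% CI [{lo:+.2f}, {hi:+.2f}]")
```

A concept whose interval straddles zero can then be flagged as unreliable rather than silently reported with a noisy point estimate.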
arXiv Detail & Related papers (2023-12-13T11:17:27Z)
- Evaluating the Utility of Model Explanations for Model Development [54.23538543168767]
We evaluate whether explanations can improve human decision-making in practical scenarios of machine learning model development.
To our surprise, we did not find evidence of significant improvement on tasks when users were provided with any of the saliency maps.
These findings suggest caution regarding the usefulness and potential for misunderstanding in saliency-based explanations.
arXiv Detail & Related papers (2023-12-10T23:13:23Z)
- Counterfactuals of Counterfactuals: a back-translation-inspired approach to analyse counterfactual editors [3.4253416336476246]
We focus on the analysis of counterfactual, contrastive explanations.
We propose a new back translation-inspired evaluation methodology.
We show that by iteratively feeding the counterfactual to the explainer we can obtain valuable insights into the behaviour of both the predictor and the explainer models.
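The back-translation-style loop can be illustrated with a toy predictor and editor (both stand-ins invented for this sketch, not the paper's models): feed the counterfactual back into the editor repeatedly and watch whether the sequence converges to a fixed point or oscillates, which is itself a diagnostic of the editor's behaviour.

```python
import numpy as np

def predict(x):
    """Stand-in predictor (assumed): threshold on the feature sum."""
    return int(x.sum() > 0)

def counterfactual(x, step=0.6):
    """Stand-in counterfactual editor: nudge x just past the boundary."""
    direction = -1.0 if predict(x) == 1 else 1.0
    shift = direction * (abs(x.sum()) + step) / len(x)
    return x + shift

x0 = np.array([0.5, 0.8, -0.1])     # initial point, predicted class 1
xs = [x0]
for _ in range(6):                  # back-translation loop: cf of cf of ...
    xs.append(counterfactual(xs[-1]))

labels = [predict(x) for x in xs]
drift = [float(np.linalg.norm(x - x0)) for x in xs]
print("labels:", labels)                                  # flips each round
print("drift from original:", [round(d, 2) for d in drift])
```

Here the labels alternate and the drift oscillates rather than shrinking, the kind of pattern this methodology surfaces about the interplay of predictor and explainer.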
arXiv Detail & Related papers (2023-05-26T16:04:28Z)
- Explainability in Process Outcome Prediction: Guidelines to Obtain Interpretable and Faithful Models [77.34726150561087]
We define explainability through the interpretability of the explanations and the faithfulness of the explainability model in the field of process outcome prediction.
This paper contributes a set of guidelines named X-MOP which allows selecting the appropriate model based on the event log specifications.
arXiv Detail & Related papers (2022-03-30T05:59:50Z)
- Uncertainty Quantification of Surrogate Explanations: an Ordinal Consensus Approach [1.3750624267664155]
We produce estimates of the uncertainty of a given explanation by measuring the consensus amongst a set of diverse bootstrapped surrogate explainers.
We empirically illustrate the properties of this approach through experiments on state-of-the-art Convolutional Neural Network ensembles.
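A minimal sketch of the consensus idea, under assumed toy components (the black box, perturbation scheme, and surrogate family are illustrative, not the paper's setup): fit linear surrogates on bootstrapped perturbation samples and score uncertainty as agreement among the surrogates' feature rankings.

```python
import numpy as np

rng = np.random.default_rng(2)

def f(X):
    """Stand-in black box (assumed)."""
    return 2.0 * X[:, 0] - 1.0 * X[:, 1] + 0.1 * X[:, 2]

x0 = np.zeros(3)
X = x0 + rng.normal(0, 1, size=(300, 3))     # local perturbations around x0
y = f(X) + rng.normal(0, 0.5, size=300)

rankings = []
for _ in range(50):                          # bootstrapped surrogate explainers
    idx = rng.integers(0, len(X), size=len(X))
    w, *_ = np.linalg.lstsq(X[idx], y[idx], rcond=None)
    rankings.append(np.argsort(-np.abs(w)))  # feature order by importance

rankings = np.array(rankings)
# Ordinal consensus: fraction of surrogates agreeing with the modal ranking.
modal = np.array([np.bincount(rankings[:, j]).argmax() for j in range(3)])
consensus = (rankings == modal).all(axis=1).mean()
print(f"consensus among surrogate rankings: {consensus:.2f}")
```

Low consensus would flag an explanation as unstable before it reaches a user.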
arXiv Detail & Related papers (2021-11-17T13:55:58Z)
- Diagnostics-Guided Explanation Generation [32.97930902104502]
Explanations shed light on a machine learning model's rationales and can aid in identifying deficiencies in its reasoning process.
We show how to optimise for several diagnostic properties when training a model to generate sentence-level explanations.
arXiv Detail & Related papers (2021-09-08T16:27:52Z)
- Evaluating Explanations: How much do explanations from the teacher aid students? [103.05037537415811]
We formalize the value of explanations using a student-teacher paradigm that measures the extent to which explanations improve student models in learning.
Unlike many prior proposals to evaluate explanations, our approach cannot be easily gamed, enabling principled, scalable, and automatic evaluation of attributions.
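The student-teacher measurement can be sketched with toy stand-ins (the teacher, the attribution mask, and the least-squares student are all assumptions for illustration, not the paper's models): an explanation is valuable to the extent a simple student trained on teacher labels generalizes better when given it.

```python
import numpy as np

rng = np.random.default_rng(3)

# Hypothetical setup: the teacher's labels depend on 3 of 20 features, and
# the "explanation" is an attribution mask marking which features matter.
n, d = 400, 20
X = rng.normal(size=(n, d))
w_teacher = np.zeros(d)
w_teacher[:3] = [2.0, -1.5, 1.0]
y = (X @ w_teacher > 0).astype(float)   # teacher's labels

mask = np.abs(w_teacher) > 0            # attribution mask (the "explanation")

def student_acc(features, train=200):
    """Held-out accuracy of a least-squares student trained on teacher labels."""
    Xtr, Xte = features[:train], features[train:]
    ytr, yte = y[:train], y[train:]
    w, *_ = np.linalg.lstsq(Xtr, ytr - 0.5, rcond=None)
    return (((Xte @ w) > 0) == yte).mean()

acc_plain = student_acc(X)              # student without the explanation
acc_explained = student_acc(X * mask)   # student restricted by the attribution
print(f"student accuracy without explanation: {acc_plain:.2f}")
print(f"student accuracy with explanation:    {acc_explained:.2f}")
```

Because the score is the student's held-out accuracy rather than a human judgment, this kind of evaluation is hard to game and can be run automatically at scale.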
arXiv Detail & Related papers (2020-12-01T23:40:21Z)
- Towards Interpretable Reasoning over Paragraph Effects in Situation [126.65672196760345]
We focus on the task of reasoning over paragraph effects in a situation, which requires a model to understand cause and effect.
We propose a sequential approach for this task which explicitly models each step of the reasoning process with neural network modules.
In particular, five reasoning modules are designed and learned in an end-to-end manner, which leads to a more interpretable model.
arXiv Detail & Related papers (2020-10-03T04:03:52Z)
- A Diagnostic Study of Explainability Techniques for Text Classification [52.879658637466605]
We develop a list of diagnostic properties for evaluating existing explainability techniques.
We compare the saliency scores assigned by the explainability techniques with human annotations of salient input regions to find relations between a model's performance and the agreement of its rationales with human ones.
arXiv Detail & Related papers (2020-09-25T12:01:53Z)
- Evaluations and Methods for Explanation through Robustness Analysis [117.7235152610957]
We establish a novel set of evaluation criteria for such feature-based explanations via robustness analysis.
We obtain new explanations that are loosely necessary and sufficient for a prediction.
We extend the explanation to extract the set of features that would move the current prediction to a target class.
arXiv Detail & Related papers (2020-05-31T05:52:05Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this list (including all information) and is not responsible for any consequences of its use.