Auditing Local Explanations is Hard
- URL: http://arxiv.org/abs/2407.13281v1
- Date: Thu, 18 Jul 2024 08:34:05 GMT
- Title: Auditing Local Explanations is Hard
- Authors: Robi Bhattacharjee, Ulrike von Luxburg
- Abstract summary: We investigate an auditing framework in which a third-party auditor or a collective of users attempts to sanity-check explanations.
We prove upper and lower bounds on the number of queries needed for an auditor to succeed within this framework.
Our results suggest that for complex high-dimensional settings, merely providing a pointwise prediction and explanation could be insufficient.
- Score: 14.172657936593582
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In sensitive contexts, providers of machine learning algorithms are increasingly required to give explanations for their algorithms' decisions. However, explanation receivers might not trust the provider, who could potentially output misleading or manipulated explanations. In this work, we investigate an auditing framework in which a third-party auditor or a collective of users attempts to sanity-check explanations: they can query model decisions and the corresponding local explanations, pool all the information received, and then check for basic consistency properties. We prove upper and lower bounds on the number of queries needed for an auditor to succeed within this framework. Our results show that successful auditing requires a potentially exorbitant number of queries, particularly in high-dimensional cases. Our analysis also reveals that a key property is the "locality" of the provided explanations, a quantity that has so far received little attention in the explainability literature. Looking forward, our results suggest that for complex high-dimensional settings, merely providing a pointwise prediction and explanation could be insufficient, as there is no way for the users to verify that the provided explanations are not completely made up.
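The framework described in the abstract (query decisions and local explanations, pool the responses, test consistency) can be made concrete with a small illustration. Below is a minimal Python sketch of one plausible consistency check, assuming the provider answers each query with a binary decision and a local linear explanation; the query_provider interface, the locality_radius parameter, and the toy providers in the demo are hypothetical stand-ins for illustration, not the paper's actual protocol or bounds.

```python
import numpy as np

# Illustrative provider interface (hypothetical): the auditor only sees the
# returned decision and local linear explanation, not the model internals.
def query_provider(x, model, explainer):
    label = model(x)         # pointwise prediction reported by the provider
    w, b = explainer(x)      # local linear explanation (weights, bias) at x
    return label, (w, b)

def audit(points, model, explainer, locality_radius=0.5):
    """Pool query responses and test a basic consistency property: whenever
    two queried points lie within `locality_radius` of each other, the
    explanation reported at one point should agree with the decision
    reported at the other. Returns the number of detected violations."""
    responses = [query_provider(x, model, explainer) for x in points]
    violations = 0
    for i, x_i in enumerate(points):
        _, (w_i, b_i) = responses[i]
        for j, x_j in enumerate(points):
            if i == j or np.linalg.norm(x_i - x_j) > locality_radius:
                continue
            label_j, _ = responses[j]
            # Apply the local explanation reported at x_i to the nearby x_j.
            if int(np.dot(w_i, x_j) + b_i > 0) != label_j:
                violations += 1
    return violations

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    w_true = rng.normal(size=2)
    model = lambda x: int(np.dot(w_true, x) > 0)
    honest = lambda x: (w_true, 0.0)                # explanation matches the model
    made_up = lambda x: (rng.normal(size=2), 0.0)   # explanations unrelated to the model

    queries = [rng.normal(size=2) for _ in range(200)]
    print("honest provider violations: ", audit(queries, model, honest))
    print("made-up explanations caught:", audit(queries, model, made_up))
```

The check only has teeth when the pooled queries contain pairs that fall within each other's locality radius; such pairs become rare in high dimensions, which matches the abstract's point that auditing may require an exorbitant number of queries when explanations are only valid very locally.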
Related papers
- Building Interpretable and Reliable Open Information Retriever for New Domains Overnight [67.03842581848299]
Information retrieval is a critical component for many downstream tasks such as open-domain question answering (QA).
We propose an information retrieval pipeline that uses an entity/event linking model and a query decomposition model to focus more accurately on different information units of the query.
We show that, while being more interpretable and reliable, our proposed pipeline significantly improves passage coverages and denotation accuracies across five IR and QA benchmarks.
arXiv Detail & Related papers (2023-08-09T07:47:17Z) - Rethinking Complex Queries on Knowledge Graphs with Neural Link Predictors [58.340159346749964]
We propose a new neural-symbolic method to support end-to-end learning using complex queries with provable reasoning capability.
We develop a new dataset containing ten new types of queries with features that have never been considered.
Our method outperforms previous methods significantly in the new dataset and also surpasses previous methods in the existing dataset at the same time.
arXiv Detail & Related papers (2023-04-14T11:35:35Z) - Explanation Selection Using Unlabeled Data for Chain-of-Thought Prompting [80.9896041501715]
Explanations that have not been "tuned" for a task, such as off-the-shelf explanations written by nonexperts, may lead to mediocre performance.
This paper tackles the problem of how to optimize explanation-infused prompts in a black-box fashion.
arXiv Detail & Related papers (2023-02-09T18:02:34Z) - Interpretable by Design: Learning Predictors by Composing Interpretable Queries [8.054701719767293]
We argue that machine learning algorithms should be interpretable by design.
We minimize the expected number of queries needed for accurate prediction.
Experiments on vision and NLP tasks demonstrate the efficacy of our approach.
arXiv Detail & Related papers (2022-07-03T02:40:34Z) - XAudit : A Theoretical Look at Auditing with Explanations [29.55309950026882]
This work formalizes the role of explanations in auditing and investigates if and how model explanations can help audits.
Specifically, we propose explanation-based algorithms for auditing linear classifiers and decision trees for feature sensitivity.
Our results illustrate that counterfactual explanations are extremely helpful for auditing.
arXiv Detail & Related papers (2022-06-09T19:19:58Z) - Re-Examining Human Annotations for Interpretable NLP [80.81532239566992]
We conduct controlled experiments on crowdsourcing websites using two widely used datasets in Interpretable NLP.
We compare the annotation results obtained from recruiting workers satisfying different levels of qualification.
Our results reveal that annotation quality depends heavily on the workers' qualifications, and that workers can be guided by the instructions to provide certain annotations.
arXiv Detail & Related papers (2022-04-10T02:27:30Z) - Generating Fluent Fact Checking Explanations with Unsupervised Post-Editing [22.5444107755288]
We present an iterative edit-based algorithm that uses only phrase-level edits to perform unsupervised post-editing of ruling comments.
We show that our model generates explanations that are fluent, readable, non-redundant, and cover important information for the fact check.
arXiv Detail & Related papers (2021-12-13T15:31:07Z) - Counterfactual Explanations Can Be Manipulated [40.78019510022835]
We introduce the first framework that describes the vulnerabilities of counterfactual explanations and shows how they can be manipulated.
We show that counterfactual explanations may converge to drastically different counterfactuals under small perturbations, indicating that they are not robust.
We describe how these models can unfairly provide low-cost recourse for specific subgroups in the data while appearing fair to auditors.
arXiv Detail & Related papers (2021-06-04T18:56:15Z) - Human Evaluation of Spoken vs. Visual Explanations for Open-Domain QA [22.76153284711981]
We study whether explanations help users correctly decide when to accept or reject an ODQA system's answer.
Our results show that explanations derived from retrieved evidence passages can outperform strong baselines (calibrated confidence) across modalities.
We show common failure cases of current explanations, emphasize end-to-end evaluation of explanations, and caution against evaluating them in proxy modalities that are different from deployment.
arXiv Detail & Related papers (2020-12-30T08:19:02Z) - Brain-inspired Search Engine Assistant based on Knowledge Graph [53.89429854626489]
DeveloperBot is a brain-inspired search engine assistant based on a knowledge graph.
It constructs a multi-layer query graph by splitting a complex multi-constraint query into several ordered constraints.
It then models the constraint reasoning process as a subgraph search process inspired by the spreading activation model from cognitive science.
arXiv Detail & Related papers (2020-12-25T06:36:11Z) - Generating Fact Checking Explanations [52.879658637466605]
A crucial piece of the puzzle that is still missing is how to automate the most elaborate part of the fact-checking process.
This paper provides the first study of how these explanations can be generated automatically based on available claim context.
Our results indicate that optimising both objectives at the same time, rather than training them separately, improves the performance of a fact checking system.
arXiv Detail & Related papers (2020-04-13T05:23:25Z)
This list is automatically generated from the titles and abstracts of the papers on this site.