Post-Hoc Explanations Fail to Achieve their Purpose in Adversarial Contexts
- URL: http://arxiv.org/abs/2201.10295v1
- Date: Tue, 25 Jan 2022 13:12:02 GMT
- Title: Post-Hoc Explanations Fail to Achieve their Purpose in Adversarial Contexts
- Authors: Sebastian Bordt, Michèle Finck, Eric Raidl, Ulrike von Luxburg
- Abstract summary: Existing and planned legislation stipulates various obligations to provide information about machine learning algorithms.
Many researchers suggest using post-hoc explanation algorithms for this purpose.
We show that post-hoc explanation algorithms are unsuitable to achieve the law's objectives.
- Score: 12.552080951754963
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Existing and planned legislation stipulates various obligations to provide
information about machine learning algorithms and their functioning, often
interpreted as obligations to "explain". Many researchers suggest using
post-hoc explanation algorithms for this purpose. In this paper, we combine
legal, philosophical and technical arguments to show that post-hoc explanation
algorithms are unsuitable to achieve the law's objectives. Indeed, most
situations where explanations are requested are adversarial, meaning that the
explanation provider and receiver have opposing interests and incentives, so
that the provider might manipulate the explanation for her own ends. We show
that this fundamental conflict cannot be resolved because of the high degree of
ambiguity of post-hoc explanations in realistic application scenarios. As a
consequence, post-hoc explanation algorithms are unsuitable to achieve the
transparency objectives inherent to the legal norms. Instead, there is a need
to more explicitly discuss the objectives underlying "explainability"
obligations as these can often be better achieved through other mechanisms.
There is an urgent need for a more open and honest discussion regarding the
potential and limitations of post-hoc explanations in adversarial contexts, in
particular in light of the current negotiations about the European Union's
draft Artificial Intelligence Act.
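To make the ambiguity argument concrete, here is a minimal sketch (not taken from the paper; the toy loan rule, feature names, and baseline points are hypothetical). Shapley-style feature attributions are defined relative to a baseline point, and the attributions change with that choice, so a provider who must explain a denial can select the baseline under which a sensitive feature appears to play no role:

    # Illustrative sketch: exact Shapley-value attributions for a toy loan rule,
    # computed under two different baselines ("reference points").
    from itertools import permutations

    def shapley_values(model, x, baseline):
        """Average marginal contribution of each feature over all orderings,
        where features not yet revealed keep their baseline value."""
        features = list(x)
        phi = {f: 0.0 for f in features}
        orderings = list(permutations(features))
        for order in orderings:
            point = dict(baseline)            # start from the chosen baseline
            prev = model(point)
            for f in order:
                point[f] = x[f]               # reveal the applicant's value for f
                cur = model(point)
                phi[f] += (cur - prev) / len(orderings)
                prev = cur
        return phi

    # Hypothetical rule that relies on a sensitive proxy feature:
    # approve (1.0) only if the zip code is "clean" AND the credit history is good.
    def loan_model(v):
        return 1.0 if v["zip_ok"] == 1 and v["credit_ok"] == 1 else 0.0

    applicant = {"zip_ok": 0, "credit_ok": 1}   # denied because of the zip code

    # Baseline 1: a typical approved applicant -> the explanation blames zip_ok.
    print(shapley_values(loan_model, applicant, {"zip_ok": 1, "credit_ok": 1}))
    # {'zip_ok': -1.0, 'credit_ok': 0.0}

    # Baseline 2: a reference that is itself denied -> all attributions are zero,
    # and the sensitive feature disappears from the explanation.
    print(shapley_values(loan_model, applicant, {"zip_ok": 0, "credit_ok": 0}))
    # {'zip_ok': 0.0, 'credit_ok': 0.0}

Both runs produce mathematically valid attributions for the same model and the same applicant, and nothing in the output reveals which baseline was chosen; this kind of unresolved degree of freedom is what lets an explanation provider with opposing interests steer the explanation.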
Related papers
- DELTA: Pre-train a Discriminative Encoder for Legal Case Retrieval via Structural Word Alignment [55.91429725404988]
We introduce DELTA, a discriminative model designed for legal case retrieval.
We leverage shallow decoders to create information bottlenecks, aiming to enhance representation ability.
Our approach can outperform existing state-of-the-art methods in legal case retrieval.
arXiv Detail & Related papers (2024-03-27T10:40:14Z)
- Explanation Hacking: The perils of algorithmic recourse [2.967024581564439]
We argue that recourse explanations face several conceptual pitfalls and can lead to problematic explanation hacking.
As an alternative, we advocate that explanations of AI decisions should aim at understanding.
arXiv Detail & Related papers (2024-03-22T12:49:28Z)
- FaithLM: Towards Faithful Explanations for Large Language Models [67.29893340289779]
Large Language Models (LLMs) have become proficient in addressing complex tasks by leveraging their internal knowledge and reasoning capabilities.
The black-box nature of these models complicates the task of explaining their decision-making processes.
We introduce FaithLM to explain the decisions of LLMs with natural language (NL) explanations.
arXiv Detail & Related papers (2024-02-07T09:09:14Z)
- Clash of the Explainers: Argumentation for Context-Appropriate Explanations [6.8285745209093145]
No single explanation approach is best suited to every context.
For AI explainability to be effective, explanations, and how they are presented, need to be oriented towards the stakeholder receiving the explanation.
We propose a modular reasoning system consisting of a given mental model of the relevant stakeholder, a reasoner component that solves the argumentation problem generated by a multi-explainer component, and an AI model that is to be explained suitably to the stakeholder of interest.
arXiv Detail & Related papers (2023-12-12T09:52:30Z)
- HOP, UNION, GENERATE: Explainable Multi-hop Reasoning without Rationale Supervision [118.0818807474809]
This work proposes a principled, probabilistic approach for training explainable multi-hop QA systems without rationale supervision.
Our approach performs multi-hop reasoning by explicitly modeling rationales as sets, enabling the model to capture interactions between documents and sentences within a document.
arXiv Detail & Related papers (2023-05-23T16:53:49Z)
- Disagreement amongst counterfactual explanations: How transparency can be deceptive [0.0]
Counterfactual explanations are increasingly used as an Explainable Artificial Intelligence (XAI) technique.
Different algorithms can produce different counterfactual explanations for the same instance.
Ethical issues arise when malicious agents use this diversity to fairwash an unfair machine learning model, as the sketch after this entry illustrates.
arXiv Detail & Related papers (2023-04-25T09:15:37Z)
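A minimal sketch of that diversity (not from the paper; the loan rule and feature values are hypothetical): a single denied applicant admits several counterfactuals that each flip the decision, and which one counts as "nearest" depends on an arbitrary choice of feature scaling, so a strategic provider can disclose only the most innocuous one.

    # Hypothetical rule: approve if income >= 50 or the applicant is an existing customer.
    def approve(income, existing_customer):
        return income >= 50 or existing_customer == 1

    applicant = {"income": 45, "existing_customer": 0}
    print(approve(**applicant))                          # False: denied

    # Two different counterfactuals both flip the decision for the same applicant.
    cf_a = dict(applicant, income=50)                    # "raise your income by 5"
    cf_b = dict(applicant, existing_customer=1)          # "become an existing customer"
    print(approve(**cf_a), approve(**cf_b))              # True True

    # Which of the two is reported as "the" explanation depends on how the algorithm
    # weighs feature changes, leaving room to present whichever suits the provider.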
- Human Interpretation of Saliency-based Explanation Over Text [65.29015910991261]
We study saliency-based explanations over textual data.
We find that people often misinterpret the explanations.
We propose a method to adjust saliencies based on model estimates of over- and under-perception.
arXiv Detail & Related papers (2022-01-27T15:20:32Z)
- Making Things Explainable vs Explaining: Requirements and Challenges under the GDPR [2.578242050187029]
ExplanatorY AI (YAI) builds on XAI with the goal of collecting and organizing explainable information.
We frame the problem of generating explanations for Automated Decision-Making systems (ADMs) as the identification of an appropriate path over an explanatory space.
arXiv Detail & Related papers (2021-10-02T08:48:47Z)
- Prompting Contrastive Explanations for Commonsense Reasoning Tasks [74.7346558082693]
Large pretrained language models (PLMs) can achieve near-human performance on commonsense reasoning tasks.
We show how to use these same models to generate human-interpretable evidence.
arXiv Detail & Related papers (2021-06-12T17:06:13Z)
- Aligning Faithful Interpretations with their Social Attribution [58.13152510843004]
We find that the requirement that model interpretations be faithful is vague and incomplete.
We identify the problem as a misalignment between the causal chain of decisions (causal attribution) and the attribution of human behavior to the interpretation (social attribution).
arXiv Detail & Related papers (2020-06-01T16:45:38Z)
- Algorithmic Recourse: from Counterfactual Explanations to Interventions [16.9979815165902]
We argue that counterfactual explanations inform an individual where they need to get to, but not how to get there.
Instead, we propose a paradigm shift from recourse via nearest counterfactual explanations to recourse through minimal interventions (see the sketch after this entry).
arXiv Detail & Related papers (2020-02-14T22:49:42Z)
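As a rough illustration of that distinction (not from the paper; the structural equation, variable names, and numbers are invented), the sketch below contrasts a counterfactual that directly edits the feature vector with recourse obtained by intervening on an upstream, actionable cause and letting the causal model propagate the change:

    # Hypothetical structural causal model: education -> income, and income decides the loan.
    def income_eq(education, noise):
        return 10 * education + noise        # assumed structural equation

    def approve(income):
        return income >= 60

    applicant = {"education": 4, "income": 45}                    # denied
    noise = applicant["income"] - 10 * applicant["education"]     # implied noise term = 5

    # Nearest counterfactual: edit the feature directly ("have income 60").
    # It says where the individual needs to get to, but not how to get there.
    print(approve(60))                                            # True, yet not actionable

    # Recourse through a minimal intervention: act on the upstream variable,
    # do(education := 6), and let the structural equation propagate downstream.
    new_income = income_eq(education=6, noise=noise)
    print(new_income, approve(new_income))                        # 65 True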
This list is automatically generated from the titles and abstracts of the papers on this site.