Post-Hoc Explanations Fail to Achieve their Purpose in Adversarial Contexts
- URL: http://arxiv.org/abs/2201.10295v1
- Date: Tue, 25 Jan 2022 13:12:02 GMT
- Title: Post-Hoc Explanations Fail to Achieve their Purpose in Adversarial Contexts
- Authors: Sebastian Bordt, Michèle Finck, Eric Raidl, Ulrike von Luxburg
- Abstract summary: Existing and planned legislation stipulates various obligations to provide information about machine learning algorithms.
Many researchers suggest using post-hoc explanation algorithms for this purpose.
We show that post-hoc explanation algorithms are unsuitable to achieve the law's objectives.
- Score: 12.552080951754963
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Existing and planned legislation stipulates various obligations to provide
information about machine learning algorithms and their functioning, often
interpreted as obligations to "explain". Many researchers suggest using
post-hoc explanation algorithms for this purpose. In this paper, we combine
legal, philosophical and technical arguments to show that post-hoc explanation
algorithms are unsuitable to achieve the law's objectives. Indeed, most
situations where explanations are requested are adversarial, meaning that the
explanation provider and receiver have opposing interests and incentives, so
that the provider might manipulate the explanation for her own ends. We show
that this fundamental conflict cannot be resolved because of the high degree of
ambiguity of post-hoc explanations in realistic application scenarios. As a
consequence, post-hoc explanation algorithms are unsuitable to achieve the
transparency objectives inherent to the legal norms. Instead, there is a need
to more explicitly discuss the objectives underlying "explainability"
obligations as these can often be better achieved through other mechanisms.
There is an urgent need for a more open and honest discussion regarding the
potential and limitations of post-hoc explanations in adversarial contexts, in
particular in light of the current negotiations about the European Union's
draft Artificial Intelligence Act.
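To make the ambiguity argument concrete, here is a minimal sketch (not taken from the paper; the toy loan rule, feature names, and baseline points are hypothetical). Shapley-style feature attributions are defined relative to a baseline point, and the attributions change with that choice, so a provider who must explain a denial can select the baseline under which a sensitive feature appears to play no role:

    # Illustrative sketch: exact Shapley-value attributions for a toy loan rule,
    # computed under two different baselines ("reference points").
    from itertools import permutations

    def shapley_values(model, x, baseline):
        """Average marginal contribution of each feature over all orderings,
        where features not yet revealed keep their baseline value."""
        features = list(x)
        phi = {f: 0.0 for f in features}
        orderings = list(permutations(features))
        for order in orderings:
            point = dict(baseline)            # start from the chosen baseline
            prev = model(point)
            for f in order:
                point[f] = x[f]               # reveal the applicant's value for f
                cur = model(point)
                phi[f] += (cur - prev) / len(orderings)
                prev = cur
        return phi

    # Hypothetical rule that relies on a sensitive proxy feature:
    # approve (1.0) only if the zip code is "clean" AND the credit history is good.
    def loan_model(v):
        return 1.0 if v["zip_ok"] == 1 and v["credit_ok"] == 1 else 0.0

    applicant = {"zip_ok": 0, "credit_ok": 1}   # denied because of the zip code

    # Baseline 1: a typical approved applicant -> the explanation blames zip_ok.
    print(shapley_values(loan_model, applicant, {"zip_ok": 1, "credit_ok": 1}))
    # {'zip_ok': -1.0, 'credit_ok': 0.0}

    # Baseline 2: a reference that is itself denied -> all attributions are zero,
    # and the sensitive feature disappears from the explanation.
    print(shapley_values(loan_model, applicant, {"zip_ok": 0, "credit_ok": 0}))
    # {'zip_ok': 0.0, 'credit_ok': 0.0}

Both runs produce mathematically valid attributions for the same model and the same applicant, and nothing in the output reveals which baseline was chosen; this kind of unresolved degree of freedom is what lets an explanation provider with opposing interests steer the explanation.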
Related papers
- DELTA: Pre-train a Discriminative Encoder for Legal Case Retrieval via Structural Word Alignment [55.91429725404988]
We introduce DELTA, a discriminative model designed for legal case retrieval.
We leverage shallow decoders to create information bottlenecks, aiming to enhance representation ability.
Our approach can outperform existing state-of-the-art methods in legal case retrieval.
arXiv Detail & Related papers (2024-03-27T10:40:14Z)
- Explanation Hacking: The perils of algorithmic recourse [2.967024581564439]
We argue that recourse explanations face several conceptual pitfalls and can lead to problematic explanation hacking.
As an alternative, we advocate that explanations of AI decisions should aim at understanding.
arXiv Detail & Related papers (2024-03-22T12:49:28Z)
- FaithLM: Towards Faithful Explanations for Large Language Models [67.29893340289779]
Large Language Models (LLMs) have become proficient in addressing complex tasks by leveraging their internal knowledge and reasoning capabilities.
The black-box nature of these models complicates the task of explaining their decision-making processes.
We introduce FaithLM to explain the decisions of LLMs with natural language (NL) explanations.
arXiv Detail & Related papers (2024-02-07T09:09:14Z)
- Clash of the Explainers: Argumentation for Context-Appropriate Explanations [6.8285745209093145]
No single explanation approach is best suited to every context.
For AI explainability to be effective, explanations, and how they are presented, need to be oriented towards the stakeholder receiving the explanation.
We propose a modular reasoning system consisting of a given mental model of the relevant stakeholder, a reasoner component that solves the argumentation problem generated by a multi-explainer component, and an AI model that is to be explained suitably to the stakeholder of interest.
arXiv Detail & Related papers (2023-12-12T09:52:30Z)
- HOP, UNION, GENERATE: Explainable Multi-hop Reasoning without Rationale Supervision [118.0818807474809]
This work proposes a principled, probabilistic approach for training explainable multi-hop QA systems without rationale supervision.
Our approach performs multi-hop reasoning by explicitly modeling rationales as sets, enabling the model to capture interactions between documents and sentences within a document.
arXiv Detail & Related papers (2023-05-23T16:53:49Z)
- Disagreement amongst counterfactual explanations: How transparency can be deceptive [0.0]
Counterfactual explanations are increasingly used as an Explainable Artificial Intelligence (XAI) technique.
Different algorithms can produce different counterfactual explanations for the same instance.
Ethical issues arise when malicious agents use this diversity to fairwash an unfair machine learning model, as the sketch after this entry illustrates.
arXiv Detail & Related papers (2023-04-25T09:15:37Z)
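A minimal sketch of that diversity (not from the paper; the loan rule and feature values are hypothetical): a single denied applicant admits several counterfactuals that each flip the decision, and which one counts as "nearest" depends on an arbitrary choice of feature scaling, so a strategic provider can disclose only the most innocuous one.

    # Hypothetical rule: approve if income >= 50 or the applicant is an existing customer.
    def approve(income, existing_customer):
        return income >= 50 or existing_customer == 1

    applicant = {"income": 45, "existing_customer": 0}
    print(approve(**applicant))                          # False: denied

    # Two different counterfactuals both flip the decision for the same applicant.
    cf_a = dict(applicant, income=50)                    # "raise your income by 5"
    cf_b = dict(applicant, existing_customer=1)          # "become an existing customer"
    print(approve(**cf_a), approve(**cf_b))              # True True

    # Which of the two is reported as "the" explanation depends on how the algorithm
    # weighs feature changes, leaving room to present whichever suits the provider.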
- Human Interpretation of Saliency-based Explanation Over Text [65.29015910991261]
We study saliency-based explanations over textual data.
We find that people often misinterpret the explanations.
We propose a method to adjust saliencies based on model estimates of over- and under-perception.
arXiv Detail & Related papers (2022-01-27T15:20:32Z)
- Making Things Explainable vs Explaining: Requirements and Challenges under the GDPR [2.578242050187029]
ExplanatorY AI (YAI) builds on XAI with the goal of collecting and organizing explainable information.
We frame the problem of generating explanations for Automated Decision-Making systems (ADMs) as the identification of an appropriate path over an explanatory space.
arXiv Detail & Related papers (2021-10-02T08:48:47Z)
- Prompting Contrastive Explanations for Commonsense Reasoning Tasks [74.7346558082693]
Large pretrained language models (PLMs) can achieve near-human performance on commonsense reasoning tasks.
We show how to use these same models to generate human-interpretable evidence.
arXiv Detail & Related papers (2021-06-12T17:06:13Z)
- Aligning Faithful Interpretations with their Social Attribution [58.13152510843004]
We find that the requirement that model interpretations be faithful is vague and incomplete.
We identify the problem as a misalignment between the causal chain of decisions (causal attribution) and the attribution of human behavior to the interpretation (social attribution).
arXiv Detail & Related papers (2020-06-01T16:45:38Z)
- Algorithmic Recourse: from Counterfactual Explanations to Interventions [16.9979815165902]
We argue that counterfactual explanations inform an individual where they need to get to, but not how to get there.
Instead, we propose a paradigm shift from recourse via nearest counterfactual explanations to recourse through minimal interventions (see the sketch after this entry).
arXiv Detail & Related papers (2020-02-14T22:49:42Z)
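As a rough illustration of that distinction (not from the paper; the structural equation, variable names, and numbers are invented), the sketch below contrasts a counterfactual that directly edits the feature vector with recourse obtained by intervening on an upstream, actionable cause and letting the causal model propagate the change:

    # Hypothetical structural causal model: education -> income, and income decides the loan.
    def income_eq(education, noise):
        return 10 * education + noise        # assumed structural equation

    def approve(income):
        return income >= 60

    applicant = {"education": 4, "income": 45}                    # denied
    noise = applicant["income"] - 10 * applicant["education"]     # implied noise term = 5

    # Nearest counterfactual: edit the feature directly ("have income 60").
    # It says where the individual needs to get to, but not how to get there.
    print(approve(60))                                            # True, yet not actionable

    # Recourse through a minimal intervention: act on the upstream variable,
    # do(education := 6), and let the structural equation propagate downstream.
    new_income = income_eq(education=6, noise=noise)
    print(new_income, approve(new_income))                        # 65 True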
This list is automatically generated from the titles and abstracts of the papers on this site.