Manipulation Risks in Explainable AI: The Implications of the
Disagreement Problem
- URL: http://arxiv.org/abs/2306.13885v2
- Date: Tue, 27 Jun 2023 08:42:34 GMT
- Title: Manipulation Risks in Explainable AI: The Implications of the
Disagreement Problem
- Authors: Sofie Goethals and David Martens and Theodoros Evgeniou
- Abstract summary: We provide an overview of the different strategies explanation providers could deploy to adapt the returned explanation to their benefit.
We analyse several objectives and concrete scenarios that could lead providers to engage in this behavior.
We argue that it is crucial to investigate this issue now, before these methods are widely implemented, and we propose some mitigation strategies.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Artificial Intelligence (AI) systems are increasingly used in high-stakes
domains of our lives, increasing the need to explain their decisions and to make
sure that they are aligned with how we want those decisions to be made. The field
of Explainable AI (XAI) has emerged in response. However, it faces a
significant challenge known as the disagreement problem, where multiple
explanations are possible for the same AI decision or prediction. While the
existence of the disagreement problem is acknowledged, the potential
implications associated with this problem have not yet been widely studied.
First, we provide an overview of the different strategies explanation providers
could deploy to adapt the returned explanation to their benefit. We make a
distinction between strategies that attack the machine learning model or
underlying data to influence the explanations, and strategies that leverage the
explanation phase directly. Next, we analyse several objectives and concrete
scenarios that could lead providers to engage in this behavior, and the
potentially dangerous consequences this manipulative behavior could have for
society. We emphasize that it is crucial to investigate this issue now, before
these methods are widely implemented, and propose some mitigation strategies.
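To make the disagreement problem concrete, consider the following minimal sketch (our illustration, not code or an experiment from the paper; the dataset, model, and the two attribution methods are assumptions chosen for convenience). It shows how two standard ways of explaining the same trained model can rank features differently, leaving a provider free to report whichever ranking serves its interests:

    # Illustrative sketch of the disagreement problem: one model,
    # two attribution methods, potentially two feature rankings.
    import numpy as np
    from sklearn.datasets import make_classification
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.inspection import permutation_importance

    X, y = make_classification(n_samples=500, n_features=6,
                               n_informative=4, random_state=0)
    names = [f"x{i}" for i in range(X.shape[1])]

    model = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)

    # Explanation A: impurity-based importances built into the forest.
    imp_a = model.feature_importances_
    # Explanation B: permutation importances for the same model and data.
    imp_b = permutation_importance(model, X, y, n_repeats=10,
                                   random_state=0).importances_mean

    print("impurity ranking:   ", [names[i] for i in np.argsort(imp_a)[::-1]])
    print("permutation ranking:", [names[i] for i in np.argsort(imp_b)[::-1]])
    # Whenever the rankings differ, a self-interested provider could simply
    # return the one that, say, downplays reliance on a sensitive feature.

Both rankings are defensible explanations of the same model; that ambiguity is precisely the degree of freedom the abstract warns can be exploited in the explanation phase.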
Related papers
- Combining AI Control Systems and Human Decision Support via Robustness and Criticality [53.10194953873209]
We extend a methodology for adversarial explanations (AE) to state-of-the-art reinforcement learning frameworks.
We show that the learned AI control system demonstrates robustness against adversarial tampering.
In a training/learning framework, this technology can improve both the AI's decisions and explanations through human interaction.
arXiv Detail & Related papers (2024-07-03T15:38:57Z) - In Search of Verifiability: Explanations Rarely Enable Complementary
Performance in AI-Advised Decision Making [25.18203172421461]
We argue explanations are only useful to the extent that they allow a human decision maker to verify the correctness of an AI's prediction.
We also compare the objective of complementary performance with that of appropriate reliance, decomposing the latter into the notions of outcome-graded and strategy-graded reliance.
arXiv Detail & Related papers (2023-05-12T18:28:04Z) - Understanding the Role of Human Intuition on Reliance in Human-AI
Decision-Making with Explanations [44.01143305912054]
We study how decision-makers' intuition affects their use of AI predictions and explanations.
Our results identify three types of intuition involved in reasoning about AI predictions and explanations.
We use these pathways to explain why feature-based explanations did not improve participants' decision outcomes and instead increased their overreliance on the AI.
arXiv Detail & Related papers (2023-01-18T01:33:50Z) - Inverse Online Learning: Understanding Non-Stationary and Reactionary
Policies [79.60322329952453]
We show how to develop interpretable representations of how agents make decisions.
By understanding the decision-making processes underlying a set of observed trajectories, we cast the policy inference problem as the inverse of an online learning problem.
We introduce a practical algorithm for retrospectively estimating such perceived effects, alongside the process through which agents update them.
Through application to the analysis of UNOS organ donation acceptance decisions, we demonstrate that our approach can bring valuable insights into the factors that govern decision processes and how they change over time.
arXiv Detail & Related papers (2022-03-14T17:40:42Z) - Making Things Explainable vs Explaining: Requirements and Challenges
under the GDPR [2.578242050187029]
ExplanatorY AI (YAI) builds on XAI with the goal of collecting and organizing explainable information.
We recast the problem of generating explanations for Automated Decision-Making systems (ADMs) as the identification of an appropriate path over an explanatory space.
arXiv Detail & Related papers (2021-10-02T08:48:47Z) - On the Importance of Domain-specific Explanations in AI-based
Cybersecurity Systems (Technical Report) [7.316266670238795]
A lack of understanding of the decisions made by AI-based systems can be a major drawback in critical domains such as cybersecurity.
In this paper we make three contributions: (i) proposal and discussion of desiderata for the explanation of outputs generated by AI-based cybersecurity systems; (ii) a comparative analysis of approaches in the literature on Explainable Artificial Intelligence (XAI) under the lens of both our desiderata and further dimensions that are typically used for examining XAI approaches; and (iii) a general architecture that can serve as a roadmap for guiding research efforts towards the development of explainable AI-based cybersecurity systems.
arXiv Detail & Related papers (2021-08-02T22:55:13Z) - The Who in XAI: How AI Background Shapes Perceptions of AI Explanations [61.49776160925216]
We conduct a mixed-methods study of how two different groups, people with and without an AI background, perceive different types of AI explanations.
We find that (1) both groups showed unwarranted faith in numbers for different reasons and (2) each group found value in different explanations beyond their intended design.
arXiv Detail & Related papers (2021-07-28T17:32:04Z) - Individual Explanations in Machine Learning Models: A Case Study on
Poverty Estimation [63.18666008322476]
Machine learning methods are increasingly being applied in sensitive societal contexts.
The present case study has two main objectives: first, to expose the challenges that arise in such contexts and how they affect the use of relevant and novel explanation methods;
and second, to present a set of strategies that mitigate these challenges, as faced when implementing explanation methods in a relevant application domain.
arXiv Detail & Related papers (2021-04-09T01:54:58Z) - Explainable AI and Adoption of Algorithmic Advisors: an Experimental
Study [0.6875312133832077]
We develop an experimental methodology where participants play a web-based game, during which they receive advice from either a human or an algorithmic advisor.
We evaluate whether the different types of explanations affect the readiness to adopt, willingness to pay and trust a financial AI consultant.
We find that the types of explanations that promote adoption during first encounter differ from those that are most successful following failure or when cost is involved.
arXiv Detail & Related papers (2021-01-05T09:34:38Z) - A general framework for scientifically inspired explanations in AI [76.48625630211943]
We instantiate the concept of the structure of scientific explanations as the theoretical underpinning for a general framework in which explanations for AI systems can be implemented.
This framework aims to provide the tools to build a "mental model" of any AI system, so that interaction with the user can provide information on demand and be closer to the nature of human-made explanations.
arXiv Detail & Related papers (2020-03-02T10:32:21Z) - Learning from Learning Machines: Optimisation, Rules, and Social Norms [91.3755431537592]
It appears that the area of AI that is most analogous to the behaviour of economic entities is that of morally good decision-making.
Recent successes of deep learning for AI suggest that more implicit specifications work better than explicit ones for solving such problems.
arXiv Detail & Related papers (2019-12-29T17:42:06Z)