Best-Response Bayesian Reinforcement Learning with Bayes-adaptive POMDPs for Centaurs
- URL: http://arxiv.org/abs/2204.01160v1
- Date: Sun, 3 Apr 2022 21:00:51 GMT
- Title: Best-Response Bayesian Reinforcement Learning with Bayes-adaptive POMDPs for Centaurs
- Authors: Mustafa Mert Çelikok, Frans A. Oliehoek, Samuel Kaski
- Abstract summary: We present a novel formulation of the interaction between the human and the AI as a sequential game.
We show that in this case the AI's problem of helping bounded-rational humans make better decisions reduces to a Bayes-adaptive POMDP.
We discuss ways in which the machine can learn to improve upon its own limitations as well with the help of the human.
- Score: 22.52332536886295
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Centaurs are half-human, half-AI decision-makers where the AI's goal is to
complement the human. To do so, the AI must be able to recognize the goals and
constraints of the human and have the means to help them. We present a novel
formulation of the interaction between the human and the AI as a sequential
game where the agents are modelled using Bayesian best-response models. We show
that in this case the AI's problem of helping bounded-rational humans make
better decisions reduces to a Bayes-adaptive POMDP. In our simulated
experiments, we consider an instantiation of our framework for humans who are
subjectively optimistic about the AI's future behaviour. Our results show that
when equipped with a model of the human, the AI can infer the human's bounds
and nudge them towards better decisions. We discuss ways in which the machine
can learn to improve upon its own limitations as well with the help of the
human. We identify a novel trade-off for centaurs in partially observable
tasks: for the AI's actions to be acceptable to the human, the machine must
make sure their beliefs are sufficiently aligned, but aligning beliefs might be
costly. We present a preliminary theoretical analysis of this trade-off and its
dependence on task structure.
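To make the formulation concrete, here is a minimal sketch (not the authors' code) of the best-response idea: the AI maintains a Bayesian belief over a hidden bound of the human, here a hypothetical softmax temperature, updates that belief from observed human actions, and intervenes ("nudges") only when the inferred bound makes a mistake likely. The toy task, parameter values, and nudge threshold are all illustrative assumptions.

```python
# Hedged sketch of the Bayes-adaptive view in miniature: the AI keeps a
# discrete Bayesian belief over the human's rationality parameter and
# best-responds under that belief. All numbers are illustrative.
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy task: two actions with per-action rewards known to the AI.
rewards = np.array([1.0, 0.2])

# Candidate values of the human's bound (softmax inverse temperature).
betas = np.array([0.5, 2.0, 8.0])
belief = np.ones(len(betas)) / len(betas)      # uniform prior over bounds

def human_policy(beta, reward_estimates):
    """Boundedly rational (softmax) choice probabilities."""
    z = beta * reward_estimates
    p = np.exp(z - z.max())
    return p / p.sum()

def update_belief(belief, observed_action, reward_estimates):
    """Bayes rule: weight each bound hypothesis by how well it explains the action."""
    lik = np.array([human_policy(b, reward_estimates)[observed_action] for b in betas])
    post = belief * lik
    return post / post.sum()

# Simulate a few rounds: the human acts, the AI infers the bound,
# then "nudges" only when the inferred bound makes a mistake likely.
true_beta = 0.5
for t in range(5):
    a_h = rng.choice(2, p=human_policy(true_beta, rewards))
    belief = update_belief(belief, a_h, rewards)
    expected_mistake = sum(belief[i] * (1 - human_policy(b, rewards)[0])
                           for i, b in enumerate(betas))
    nudge = expected_mistake > 0.3              # illustrative threshold
    print(f"t={t} human_action={a_h} belief={np.round(belief, 2)} nudge={nudge}")
```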
Related papers
- Rolling in the deep of cognitive and AI biases [1.556153237434314]
We argue that there is urgent need to understand AI as a sociotechnical system, inseparable from the conditions in which it is designed, developed and deployed.
We address this critical issue by following a radical new methodology under which human cognitive biases become core entities in our AI fairness overview.
We introduce a new mapping that relates human biases to AI biases, and we detect relevant fairness intensities and inter-dependencies.
arXiv Detail & Related papers (2024-07-30T21:34:04Z)
- On the Utility of Accounting for Human Beliefs about AI Behavior in Human-AI Collaboration [9.371527955300323]
We develop a model of human beliefs that accounts for how humans reason about the behavior of their AI partners.
We then develop an AI agent that considers both human behavior and human beliefs when devising its strategy for working with humans. A minimal sketch of that idea follows, under assumed toy payoffs and a hand-specified human belief (neither taken from the paper).
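```python
# Hedged sketch, not the paper's implementation: the AI simulates (i) what the
# human believes the AI will do and (ii) how the human then behaves, and picks
# the AI action with the best team payoff against that predicted response.
import numpy as np

# Hypothetical shared team payoff for (ai_action, human_action).
payoff = np.array([[1.0, 0.0],
                   [0.3, 0.8]])

def human_belief_about_ai():
    """Illustrative human belief: the human thinks the AI plays action 0 w.p. 0.7."""
    return np.array([0.7, 0.3])

def human_response(belief_over_ai):
    """Human best-responds to what they *believe* the AI will do."""
    expected = belief_over_ai @ payoff          # expected payoff of each human action
    return int(np.argmax(expected))

def choose_ai_action():
    h = human_response(human_belief_about_ai())
    # The AI anticipates the human's belief-driven choice and best-responds to it.
    return int(np.argmax(payoff[:, h])), h

ai_a, human_a = choose_ai_action()
print(f"predicted human action={human_a}, AI best response={ai_a}")
```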
arXiv Detail & Related papers (2024-06-10T06:39:37Z)
- Explainable Human-AI Interaction: A Planning Perspective [32.477369282996385]
AI systems need to be explainable to the humans in the loop.
We will discuss how the AI agent can use mental models to either conform to human expectations, or change those expectations through explanatory communication.
While the main focus of the book is on cooperative scenarios, we will point out how the same mental models can be used for obfuscation and deception.
arXiv Detail & Related papers (2024-05-19T22:22:21Z)
- Towards Human-AI Deliberation: Design and Evaluation of LLM-Empowered Deliberative AI for AI-Assisted Decision-Making [47.33241893184721]
In AI-assisted decision-making, humans often passively review the AI's suggestion and decide whether to accept or reject it as a whole.
We propose Human-AI Deliberation, a novel framework to promote human reflection and discussion on conflicting human-AI opinions in decision-making.
Based on theories in human deliberation, this framework engages humans and AI in dimension-level opinion elicitation, deliberative discussion, and decision updates.
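A rough sketch of such a deliberation loop, with hypothetical dimension names and a trivial stand-in for the LLM-driven discussion step:

```python
# Hedged sketch of the three stages named above: dimension-level opinion
# elicitation, discussion on conflicting dimensions, and a decision update.
# Dimension names, values, and the "discussion" rule are illustrative.
from dataclasses import dataclass, field

@dataclass
class Dimension:
    name: str
    human_opinion: float   # stance in [0, 1]
    ai_opinion: float

@dataclass
class Deliberation:
    dimensions: list = field(default_factory=list)

    def conflicting(self, tol: float = 0.3):
        """Step 1: elicit dimension-level opinions and flag conflicts."""
        return [d for d in self.dimensions if abs(d.human_opinion - d.ai_opinion) > tol]

    def discuss(self, d: Dimension):
        """Step 2: placeholder discussion; here the human simply moves halfway."""
        d.human_opinion = 0.5 * (d.human_opinion + d.ai_opinion)

    def decide(self):
        """Step 3: update the decision from the (possibly revised) opinions."""
        score = sum(d.human_opinion for d in self.dimensions) / len(self.dimensions)
        return "accept" if score > 0.5 else "reject"

session = Deliberation([Dimension("risk", 0.2, 0.9), Dimension("cost", 0.6, 0.7)])
for d in session.conflicting():
    session.discuss(d)
print(session.decide())
```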
arXiv Detail & Related papers (2024-03-25T14:34:06Z)
- Fairness in AI and Its Long-Term Implications on Society [68.8204255655161]
We take a closer look at AI fairness and analyze how lack of AI fairness can lead to deepening of biases over time.
We discuss how biased models can lead to more negative real-world outcomes for certain groups.
If the issues persist, they could be reinforced by interactions with other risks and have severe implications on society in the form of social unrest.
arXiv Detail & Related papers (2023-04-16T11:22:59Z)
- The Response Shift Paradigm to Quantify Human Trust in AI Recommendations [6.652641137999891]
Explainability, interpretability and how much they affect human trust in AI systems are ultimately problems of human cognition as much as machine learning.
We developed and validated a general purpose Human-AI interaction paradigm which quantifies the impact of AI recommendations on human decisions.
Our proof-of-principle paradigm allows one to quantitatively compare the rapidly growing set of XAI/IAI approaches in terms of their effect on the end-user.
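One simple way to read the response-shift idea, sketched with made-up numbers (the paradigm's actual tasks and measures are in the paper): the AI's influence is the fraction of the gap between the human's initial answer and the recommendation that the human closes after seeing it.

```python
# Hedged sketch of the pre/post "response shift" reading: how far does the
# human's answer move toward the AI recommendation? Values are illustrative.
import numpy as np

pre  = np.array([0.40, 0.55, 0.70, 0.30])   # human estimates before seeing the AI
rec  = np.array([0.80, 0.50, 0.90, 0.10])   # AI recommendations
post = np.array([0.60, 0.54, 0.85, 0.20])   # human estimates after seeing the AI

# Fraction of the gap to the recommendation that the human closed, per trial.
shift = (post - pre) / (rec - pre)
print("per-trial shift:", np.round(shift, 2))
print("mean trust-like weight on the AI:", round(float(shift.mean()), 2))
```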
arXiv Detail & Related papers (2022-02-16T22:02:09Z)
- Uncalibrated Models Can Improve Human-AI Collaboration [10.106324182884068]
We show that presenting AI models as more confident than they actually are can improve human-AI performance.
We first learn a model for how humans incorporate AI advice using data from thousands of human interactions.
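A toy illustration (not the paper's learned advice-taking model) of why this can happen: if the human adopts the AI's answer with probability equal to its displayed confidence, an under-used but accurate model can gain from inflated confidence. Accuracies and confidences below are assumed values.

```python
# Hedged sketch: joint accuracy when the human follows the AI with probability
# equal to the *displayed* confidence. Inflating that confidence can help when
# the AI is under-used. All numbers are toy values.
import numpy as np

rng = np.random.default_rng(1)
n = 10_000
truth = rng.integers(0, 2, n)

ai_acc, human_acc = 0.8, 0.65
ai_pred    = np.where(rng.random(n) < ai_acc, truth, 1 - truth)
human_pred = np.where(rng.random(n) < human_acc, truth, 1 - truth)

def joint_accuracy(displayed_conf):
    """Human adopts the AI's answer with probability equal to displayed confidence."""
    adopt = rng.random(n) < displayed_conf
    final = np.where(adopt, ai_pred, human_pred)
    return (final == truth).mean()

print("calibrated confidence (0.80):", round(joint_accuracy(0.80), 3))
print("inflated confidence   (0.95):", round(joint_accuracy(0.95), 3))
```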
arXiv Detail & Related papers (2022-02-12T04:51:00Z)
- Cybertrust: From Explainable to Actionable and Interpretable AI (AI2) [58.981120701284816]
Actionable and Interpretable AI (AI2) will incorporate explicit quantifications and visualizations of user confidence in AI recommendations.
It will allow examining and testing of AI system predictions to establish a basis for trust in the systems' decision making.
arXiv Detail & Related papers (2022-01-26T18:53:09Z)
- Trustworthy AI: A Computational Perspective [54.80482955088197]
We focus on six of the most crucial dimensions in achieving trustworthy AI: (i) Safety & Robustness, (ii) Non-discrimination & Fairness, (iii) Explainability, (iv) Privacy, (v) Accountability & Auditability, and (vi) Environmental Well-Being.
For each dimension, we review the recent related technologies according to a taxonomy and summarize their applications in real-world systems.
arXiv Detail & Related papers (2021-07-12T14:21:46Z)
- Is the Most Accurate AI the Best Teammate? Optimizing AI for Teamwork [54.309495231017344]
We argue that AI systems should be trained in a human-centered manner, directly optimized for team performance.
We study this proposal for a specific type of human-AI teaming, where the human overseer chooses to either accept the AI recommendation or solve the task themselves.
Our experiments with linear and non-linear models on real-world, high-stakes datasets show that the most accurate AI may not lead to the highest team performance.
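A toy illustration of the effect, with assumed accuracies and a simple acceptance rule rather than the paper's models: a less accurate AI whose confidence is informative can beat a more accurate AI on team performance.

```python
# Hedged sketch of the team-performance objective: the human accepts the AI's
# recommendation only when its confidence clears a threshold, otherwise they
# solve the task themselves. All accuracies and thresholds are toy values.
import numpy as np

rng = np.random.default_rng(2)
n = 20_000
human_acc = 0.70
threshold = 0.75                      # human defers when AI confidence >= threshold

def team_accuracy(conf, ai_correct):
    human_correct = rng.random(n) < human_acc
    accept = conf >= threshold
    return np.where(accept, ai_correct, human_correct).mean()

# AI "A": higher overall accuracy, but confidence is uninformative.
conf_a = rng.uniform(0.5, 1.0, n)
correct_a = rng.random(n) < 0.80

# AI "B": lower overall accuracy, but nearly always right when confident.
conf_b = rng.uniform(0.5, 1.0, n)
correct_b = np.where(conf_b >= threshold, rng.random(n) < 0.95, rng.random(n) < 0.55)

print("overall acc A:", round(correct_a.mean(), 3),
      " team acc A:", round(team_accuracy(conf_a, correct_a), 3))
print("overall acc B:", round(correct_b.mean(), 3),
      " team acc B:", round(team_accuracy(conf_b, correct_b), 3))
```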
arXiv Detail & Related papers (2020-04-27T19:06:28Z)
- Effect of Confidence and Explanation on Accuracy and Trust Calibration in AI-Assisted Decision Making [53.62514158534574]
We study whether features that reveal case-specific model information can calibrate trust and improve the joint performance of the human and AI.
We show that confidence scores can help calibrate people's trust in an AI model, but trust calibration alone is not sufficient to improve AI-assisted decision making.
arXiv Detail & Related papers (2020-01-07T15:33:48Z)
This list is automatically generated from the titles and abstracts of the papers in this site.