Zero-Shot Assistance in Novel Decision Problems
- URL: http://arxiv.org/abs/2202.07364v1
- Date: Tue, 15 Feb 2022 12:45:42 GMT
- Title: Zero-Shot Assistance in Novel Decision Problems
- Authors: Sebastiaan De Peuter, Samuel Kaski
- Abstract summary: We consider the problem of creating assistants that can help agents - often humans - solve novel sequential decision problems.
Instead of aiming to automate, and act in place of the agent as in current approaches, we give the assistant an advisory role and keep the agent in the loop as the main decision maker.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We consider the problem of creating assistants that can help agents - often
humans - solve novel sequential decision problems, assuming the agent is not
able to specify the reward function explicitly to the assistant. Instead of
aiming to automate, and act in place of the agent as in current approaches, we
give the assistant an advisory role and keep the agent in the loop as the main
decision maker. The difficulty is that we must account for potential biases
induced by limitations or constraints of the agent which may cause it to
seemingly irrationally reject advice. To do this we introduce a novel
formalization of assistance that models these biases, allowing the assistant to
infer and adapt to them. We then introduce a new method for planning the
assistant's advice which can scale to large decision making problems. Finally,
we show experimentally that our approach adapts to these agent biases, and
results in higher cumulative reward for the agent than automation-based
alternatives.
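The abstract describes an assistant that infers an agent's biases from how the agent responds to advice. As a hedged illustration only (not the paper's actual model), the core idea of inferring a bias parameter from accept/reject decisions can be sketched with a simple Bayesian update; the sigmoid acceptance model, the bias threshold, and all numbers below are assumptions for the sketch.

```python
import math
import random

random.seed(0)

# Hypothetical agent model: advice is accepted with probability rising in
# its quality, shifted by a private bias threshold the assistant must infer.
TRUE_BIAS = 0.6

def agent_accepts(advice_quality, bias=TRUE_BIAS):
    """Simulated biased agent: accept probability is a sigmoid in quality."""
    p_accept = 1.0 / (1.0 + math.exp(-10 * (advice_quality - bias)))
    return random.random() < p_accept

# Discrete grid of candidate bias values, uniform prior.
grid = [i / 20 for i in range(21)]
posterior = [1.0 / len(grid)] * len(grid)

for _ in range(200):
    q = random.random()              # quality of the next piece of advice
    accepted = agent_accepts(q)
    # Bayesian update: likelihood of the observed response under each bias.
    for i, b in enumerate(grid):
        p = 1.0 / (1.0 + math.exp(-10 * (q - b)))
        posterior[i] *= p if accepted else (1.0 - p)
    z = sum(posterior)
    posterior = [w / z for w in posterior]

estimate = max(zip(grid, posterior), key=lambda t: t[1])[0]
print(f"MAP estimate of agent bias: {estimate:.2f}")
```

With enough observed responses the posterior concentrates near the true bias, which is what lets an assistant of this general kind adapt its advice rather than treat rejections as noise.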
Related papers
- Getting By Goal Misgeneralization With a Little Help From a Mentor [5.012314384895538]
This paper explores whether allowing an agent to ask for help from a supervisor in unfamiliar situations can mitigate goal misgeneralization.
We focus on agents trained with PPO in the CoinRun environment, a setting known to exhibit goal misgeneralization.
We find that methods based on the agent's internal state fail to proactively request help, instead waiting until mistakes have already occurred.
arXiv Detail & Related papers (2024-10-28T14:07:41Z)
- Agent-Oriented Planning in Multi-Agent Systems [54.429028104022066]
We propose a novel framework for agent-oriented planning in multi-agent systems, leveraging a fast task decomposition and allocation process.
We integrate a feedback loop into the proposed framework to further enhance the effectiveness and robustness of such a problem-solving process.
arXiv Detail & Related papers (2024-10-03T04:07:51Z)
- Agent-Aware Training for Agent-Agnostic Action Advising in Deep Reinforcement Learning [37.70609910232786]
Action advising endeavors to leverage supplementary guidance from expert teachers to alleviate the issue of sampling inefficiency in Deep Reinforcement Learning (DRL).
Previous agent-specific action advising methods are hindered by imperfections in the agent itself, while agent-agnostic approaches exhibit limited adaptability to the learning agent.
We propose a novel framework called Agent-Aware trAining yet Agent-Agnostic Action Advising (A7) to strike a balance between the two.
arXiv Detail & Related papers (2023-11-28T14:09:43Z)
- Online Decision Mediation [72.80902932543474]
Consider learning a decision support assistant to serve as an intermediary between (oracle) expert behavior and (imperfect) human behavior.
In clinical diagnosis, fully-autonomous machine behavior is often beyond ethical affordances.
arXiv Detail & Related papers (2023-10-28T05:59:43Z)
- Decision Making for Human-in-the-loop Robotic Agents via Uncertainty-Aware Reinforcement Learning [13.184897303302971]
In a Human-in-the-Loop paradigm, a robotic agent is able to act mostly autonomously in solving a task, but can request help from an external expert when needed.
We present a Reinforcement Learning based approach to this problem, where a semi-autonomous agent asks for external assistance when it has low confidence in the eventual success of the task.
We show that our method makes effective use of a limited budget of expert calls at run-time, despite having no access to the expert at training time.
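The ask-for-help pattern this summary describes can be sketched in a few lines: act autonomously while a confidence estimate stays high, and spend a limited budget of expert calls only when it drops. This is an illustrative sketch, not the paper's method; the confidence function, threshold, and budget below are all made up for the example.

```python
# Illustrative sketch: a semi-autonomous agent with a fixed budget of
# expert calls, queried only when estimated success probability is low.
BUDGET = 5
THRESHOLD = 0.35

def estimated_success(state):
    """Stand-in for a learned value/uncertainty estimate."""
    return 1.0 - state / 10.0  # confidence decays as the state index grows

calls_used = 0
log = []
for state in range(10):
    confidence = estimated_success(state)
    if confidence < THRESHOLD and calls_used < BUDGET:
        calls_used += 1
        log.append((state, "ask_expert"))
    else:
        log.append((state, "act_autonomously"))

print(f"expert calls used: {calls_used}/{BUDGET}")
```

The design choice being illustrated is that help requests are gated on the agent's own uncertainty, so the expert budget is concentrated on the states where autonomous failure is most likely.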
arXiv Detail & Related papers (2023-03-12T17:22:54Z)
- When to Ask for Help: Proactive Interventions in Autonomous Reinforcement Learning [57.53138994155612]
A long-term goal of reinforcement learning is to design agents that can autonomously interact and learn in the world.
A critical challenge is the presence of irreversible states which require external assistance to recover from, such as when a robot arm has pushed an object off of a table.
We propose an algorithm that efficiently learns to detect and avoid states that are irreversible, and proactively asks for help in case the agent does enter them.
arXiv Detail & Related papers (2022-10-19T17:57:24Z)
- Formalizing the Problem of Side Effect Regularization [81.97441214404247]
We propose a formal criterion for side effect regularization via the assistance game framework.
In these games, the agent solves a partially observable Markov decision process.
We show that this POMDP is solved by trading off the proxy reward with the agent's ability to achieve a range of future tasks.
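The tradeoff this summary describes, proxy reward versus the agent's ability to achieve future tasks, can be illustrated with a toy scoring rule. This is not the paper's formalization; the action list, rewards, state counts, and tradeoff weight below are all hypothetical.

```python
# Toy illustration: score candidate actions by a proxy reward minus a
# penalty for shrinking the set of states reachable afterwards, i.e. for
# destroying the agent's ability to do a range of future tasks.
LAMBDA = 0.5  # hypothetical tradeoff weight
TOTAL_STATES = 10

# Each hypothetical action: (name, proxy_reward, states_reachable_after).
actions = [
    ("push_object_off_table", 1.0, 4),   # higher proxy reward, irreversible
    ("move_object_aside",     0.8, 10),  # lower proxy reward, preserves options
]

def regularized_value(proxy_reward, reachable):
    lost_optionality = (TOTAL_STATES - reachable) / TOTAL_STATES
    return proxy_reward - LAMBDA * lost_optionality

best = max(actions, key=lambda a: regularized_value(a[1], a[2]))
print("chosen action:", best[0])
```

Under this toy rule the option-preserving action wins despite its lower proxy reward, which is the qualitative behavior side effect regularization is meant to produce.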
arXiv Detail & Related papers (2022-06-23T16:36:13Z)
- Inverse Online Learning: Understanding Non-Stationary and Reactionary Policies [79.60322329952453]
We show how to develop interpretable representations of how agents make decisions.
By understanding the decision-making processes underlying a set of observed trajectories, we cast the policy inference problem as the inverse to this online learning problem.
We introduce a practical algorithm for retrospectively estimating such perceived effects, alongside the process through which agents update them.
Through application to the analysis of UNOS organ donation acceptance decisions, we demonstrate that our approach can bring valuable insights into the factors that govern decision processes and how they change over time.
arXiv Detail & Related papers (2022-03-14T17:40:42Z)
- Teaching Humans When To Defer to a Classifier via Examplars [9.851033166756274]
We aim to ensure that human decision makers learn a valid mental model of the agent's strengths and weaknesses.
We propose an exemplar-based teaching strategy where humans solve the task with the help of the agent.
We present a novel parameterization of the human's mental model of the AI that applies a nearest neighbor rule in local regions.
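A nearest-neighbor mental model of the kind this summary mentions can be sketched simply: the human predicts whether the AI will succeed on a new case from the most similar teaching exemplar, and defers accordingly. This is a hypothetical illustration, not the paper's parameterization; the exemplars and feature space are invented for the example.

```python
# Hypothetical sketch of an exemplar-based mental model: each exemplar is
# (feature value, did the AI get that case right?).
exemplars = [(0.1, True), (0.3, True), (0.7, False), (0.9, False)]

def predicted_ai_correct(x):
    """Nearest-neighbor rule: predict from the closest exemplar."""
    nearest = min(exemplars, key=lambda e: abs(e[0] - x))
    return nearest[1]

def decision(x):
    # Defer to the AI only in regions where its nearest exemplar succeeded.
    return "defer_to_ai" if predicted_ai_correct(x) else "decide_yourself"

print(decision(0.2), decision(0.8))
```

The point of such a teaching strategy is that a handful of well-chosen exemplars lets the human carve the input space into regions of AI strength and weakness.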
arXiv Detail & Related papers (2021-11-22T15:52:15Z)
- Extending the Hint Factory for the assistance dilemma: A novel, data-driven HelpNeed Predictor for proactive problem-solving help [6.188683567894372]
We present a set of data-driven methods to classify, predict, and prevent unproductive problem-solving steps.
We present a HelpNeed classification, that uses prior student data to determine when students are likely to be unproductive.
We conclude with suggestions on how these HelpNeed methods could be applied in other well-structured open-ended domains.
arXiv Detail & Related papers (2020-10-08T17:04:03Z)
- A Case for Humans-in-the-Loop: Decisions in the Presence of Erroneous Algorithmic Scores [85.12096045419686]
We study the adoption of an algorithmic tool used to assist child maltreatment hotline screening decisions.
We first show that humans do alter their behavior when the tool is deployed.
We show that humans are less likely to adhere to the machine's recommendation when the score displayed is an incorrect estimate of risk.
arXiv Detail & Related papers (2020-02-19T07:27:32Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information listed here and is not responsible for any consequences of its use.