Zero-Shot Assistance in Novel Decision Problems
- URL: http://arxiv.org/abs/2202.07364v1
- Date: Tue, 15 Feb 2022 12:45:42 GMT
- Title: Zero-Shot Assistance in Novel Decision Problems
- Authors: Sebastiaan De Peuter, Samuel Kaski
- Abstract summary: We consider the problem of creating assistants that can help agents - often humans - solve novel sequential decision problems.
Instead of aiming to automate, and act in place of the agent as in current approaches, we give the assistant an advisory role and keep the agent in the loop as the main decision maker.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We consider the problem of creating assistants that can help agents - often
humans - solve novel sequential decision problems, assuming the agent is not
able to specify the reward function explicitly to the assistant. Instead of
aiming to automate, and act in place of the agent as in current approaches, we
give the assistant an advisory role and keep the agent in the loop as the main
decision maker. The difficulty is that we must account for potential biases
induced by limitations or constraints of the agent which may cause it to
seemingly irrationally reject advice. To do this we introduce a novel
formalization of assistance that models these biases, allowing the assistant to
infer and adapt to them. We then introduce a new method for planning the
assistant's advice which can scale to large decision making problems. Finally,
we show experimentally that our approach adapts to these agent biases, and
results in higher cumulative reward for the agent than automation-based
alternatives.
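The abstract describes an assistant that infers an agent's biases from how the agent responds to advice. As a hedged illustration only (not the paper's actual model), the core idea of inferring a bias parameter from accept/reject decisions can be sketched with a simple Bayesian update; the sigmoid acceptance model, the bias threshold, and all numbers below are assumptions for the sketch.

```python
import math
import random

random.seed(0)

# Hypothetical agent model: advice is accepted with probability rising in
# its quality, shifted by a private bias threshold the assistant must infer.
TRUE_BIAS = 0.6

def agent_accepts(advice_quality, bias=TRUE_BIAS):
    """Simulated biased agent: accept probability is a sigmoid in quality."""
    p_accept = 1.0 / (1.0 + math.exp(-10 * (advice_quality - bias)))
    return random.random() < p_accept

# Discrete grid of candidate bias values, uniform prior.
grid = [i / 20 for i in range(21)]
posterior = [1.0 / len(grid)] * len(grid)

for _ in range(200):
    q = random.random()              # quality of the next piece of advice
    accepted = agent_accepts(q)
    # Bayesian update: likelihood of the observed response under each bias.
    for i, b in enumerate(grid):
        p = 1.0 / (1.0 + math.exp(-10 * (q - b)))
        posterior[i] *= p if accepted else (1.0 - p)
    z = sum(posterior)
    posterior = [w / z for w in posterior]

estimate = max(zip(grid, posterior), key=lambda t: t[1])[0]
print(f"MAP estimate of agent bias: {estimate:.2f}")
```

With enough observed responses the posterior concentrates near the true bias, which is what lets an assistant of this general kind adapt its advice rather than treat rejections as noise.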
Related papers
- Getting By Goal Misgeneralization With a Little Help From a Mentor [5.012314384895538]
This paper explores whether allowing an agent to ask for help from a supervisor in unfamiliar situations can mitigate goal misgeneralization.
We focus on agents trained with PPO in the CoinRun environment, a setting known to exhibit goal misgeneralization.
We find that methods based on the agent's internal state fail to proactively request help, instead waiting until mistakes have already occurred.
arXiv Detail & Related papers (2024-10-28T14:07:41Z)
- Agent-Oriented Planning in Multi-Agent Systems [54.429028104022066]
We propose a novel framework for agent-oriented planning in multi-agent systems, leveraging a fast task decomposition and allocation process.
We integrate a feedback loop into the proposed framework to further enhance the effectiveness and robustness of such a problem-solving process.
arXiv Detail & Related papers (2024-10-03T04:07:51Z)
- Agent-Aware Training for Agent-Agnostic Action Advising in Deep Reinforcement Learning [37.70609910232786]
Action advising endeavors to leverage supplementary guidance from expert teachers to alleviate the issue of sampling inefficiency in Deep Reinforcement Learning (DRL).
Previous agent-specific action advising methods are hindered by imperfections in the agent itself, while agent-agnostic approaches exhibit limited adaptability to the learning agent.
We propose a novel framework called Agent-Aware trAining yet Agent-Agnostic Action Advising (A7) to strike a balance between the two.
arXiv Detail & Related papers (2023-11-28T14:09:43Z)
- Online Decision Mediation [72.80902932543474]
Consider learning a decision support assistant to serve as an intermediary between (oracle) expert behavior and (imperfect) human behavior.
In clinical diagnosis, fully-autonomous machine behavior is often beyond ethical affordances.
arXiv Detail & Related papers (2023-10-28T05:59:43Z)
- Decision Making for Human-in-the-loop Robotic Agents via Uncertainty-Aware Reinforcement Learning [13.184897303302971]
In a Human-in-the-Loop paradigm, a robotic agent is able to act mostly autonomously in solving a task, but can request help from an external expert when needed.
We present a Reinforcement Learning based approach to this problem, where a semi-autonomous agent asks for external assistance when it has low confidence in the eventual success of the task.
We show that our method makes effective use of a limited budget of expert calls at run-time, despite having no access to the expert at training time.
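The ask-for-help pattern this summary describes can be sketched in a few lines: act autonomously while a confidence estimate stays high, and spend a limited budget of expert calls only when it drops. This is an illustrative sketch, not the paper's method; the confidence function, threshold, and budget below are all made up for the example.

```python
# Illustrative sketch: a semi-autonomous agent with a fixed budget of
# expert calls, queried only when estimated success probability is low.
BUDGET = 5
THRESHOLD = 0.35

def estimated_success(state):
    """Stand-in for a learned value/uncertainty estimate."""
    return 1.0 - state / 10.0  # confidence decays as the state index grows

calls_used = 0
log = []
for state in range(10):
    confidence = estimated_success(state)
    if confidence < THRESHOLD and calls_used < BUDGET:
        calls_used += 1
        log.append((state, "ask_expert"))
    else:
        log.append((state, "act_autonomously"))

print(f"expert calls used: {calls_used}/{BUDGET}")
```

The design choice being illustrated is that help requests are gated on the agent's own uncertainty, so the expert budget is concentrated on the states where autonomous failure is most likely.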
arXiv Detail & Related papers (2023-03-12T17:22:54Z)
- When to Ask for Help: Proactive Interventions in Autonomous Reinforcement Learning [57.53138994155612]
A long-term goal of reinforcement learning is to design agents that can autonomously interact and learn in the world.
A critical challenge is the presence of irreversible states which require external assistance to recover from, such as when a robot arm has pushed an object off of a table.
We propose an algorithm that efficiently learns to detect and avoid states that are irreversible, and proactively asks for help in case the agent does enter them.
arXiv Detail & Related papers (2022-10-19T17:57:24Z)
- Formalizing the Problem of Side Effect Regularization [81.97441214404247]
We propose a formal criterion for side effect regularization via the assistance game framework.
In these games, the agent solves a partially observable Markov decision process.
We show that this POMDP is solved by trading off the proxy reward with the agent's ability to achieve a range of future tasks.
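The tradeoff this summary describes, proxy reward versus the agent's ability to achieve future tasks, can be illustrated with a toy scoring rule. This is not the paper's formalization; the action list, rewards, state counts, and tradeoff weight below are all hypothetical.

```python
# Toy illustration: score candidate actions by a proxy reward minus a
# penalty for shrinking the set of states reachable afterwards, i.e. for
# destroying the agent's ability to do a range of future tasks.
LAMBDA = 0.5  # hypothetical tradeoff weight
TOTAL_STATES = 10

# Each hypothetical action: (name, proxy_reward, states_reachable_after).
actions = [
    ("push_object_off_table", 1.0, 4),   # higher proxy reward, irreversible
    ("move_object_aside",     0.8, 10),  # lower proxy reward, preserves options
]

def regularized_value(proxy_reward, reachable):
    lost_optionality = (TOTAL_STATES - reachable) / TOTAL_STATES
    return proxy_reward - LAMBDA * lost_optionality

best = max(actions, key=lambda a: regularized_value(a[1], a[2]))
print("chosen action:", best[0])
```

Under this toy rule the option-preserving action wins despite its lower proxy reward, which is the qualitative behavior side effect regularization is meant to produce.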
arXiv Detail & Related papers (2022-06-23T16:36:13Z)
- Inverse Online Learning: Understanding Non-Stationary and Reactionary Policies [79.60322329952453]
We show how to develop interpretable representations of how agents make decisions.
By understanding the decision-making processes underlying a set of observed trajectories, we cast the policy inference problem as the inverse to this online learning problem.
We introduce a practical algorithm for retrospectively estimating such perceived effects, alongside the process through which agents update them.
Through application to the analysis of UNOS organ donation acceptance decisions, we demonstrate that our approach can bring valuable insights into the factors that govern decision processes and how they change over time.
arXiv Detail & Related papers (2022-03-14T17:40:42Z)
- Teaching Humans When To Defer to a Classifier via Examplars [9.851033166756274]
We aim to ensure that human decision makers learn a valid mental model of the agent's strengths and weaknesses.
We propose an exemplar-based teaching strategy where humans solve the task with the help of the agent.
We present a novel parameterization of the human's mental model of the AI that applies a nearest neighbor rule in local regions.
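A nearest-neighbor mental model of the kind this summary mentions can be sketched simply: the human predicts whether the AI will succeed on a new case from the most similar teaching exemplar, and defers accordingly. This is a hypothetical illustration, not the paper's parameterization; the exemplars and feature space are invented for the example.

```python
# Hypothetical sketch of an exemplar-based mental model: each exemplar is
# (feature value, did the AI get that case right?).
exemplars = [(0.1, True), (0.3, True), (0.7, False), (0.9, False)]

def predicted_ai_correct(x):
    """Nearest-neighbor rule: predict from the closest exemplar."""
    nearest = min(exemplars, key=lambda e: abs(e[0] - x))
    return nearest[1]

def decision(x):
    # Defer to the AI only in regions where its nearest exemplar succeeded.
    return "defer_to_ai" if predicted_ai_correct(x) else "decide_yourself"

print(decision(0.2), decision(0.8))
```

The point of such a teaching strategy is that a handful of well-chosen exemplars lets the human carve the input space into regions of AI strength and weakness.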
arXiv Detail & Related papers (2021-11-22T15:52:15Z)
- Extending the Hint Factory for the assistance dilemma: A novel, data-driven HelpNeed Predictor for proactive problem-solving help [6.188683567894372]
We present a set of data-driven methods to classify, predict, and prevent unproductive problem-solving steps.
We present a HelpNeed classification, that uses prior student data to determine when students are likely to be unproductive.
We conclude with suggestions on how these HelpNeed methods could be applied in other well-structured open-ended domains.
arXiv Detail & Related papers (2020-10-08T17:04:03Z)
- A Case for Humans-in-the-Loop: Decisions in the Presence of Erroneous Algorithmic Scores [85.12096045419686]
We study the adoption of an algorithmic tool used to assist child maltreatment hotline screening decisions.
We first show that humans do alter their behavior when the tool is deployed.
We show that humans are less likely to adhere to the machine's recommendation when the score displayed is an incorrect estimate of risk.
arXiv Detail & Related papers (2020-02-19T07:27:32Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information listed here and is not responsible for any consequences of its use.