Decision Making for Human-in-the-loop Robotic Agents via
Uncertainty-Aware Reinforcement Learning
- URL: http://arxiv.org/abs/2303.06710v2
- Date: Tue, 14 Mar 2023 16:16:58 GMT
- Title: Decision Making for Human-in-the-loop Robotic Agents via
Uncertainty-Aware Reinforcement Learning
- Authors: Siddharth Singi, Zhanpeng He, Alvin Pan, Sandip Patel, Gunnar A.
Sigurdsson, Robinson Piramuthu, Shuran Song, Matei Ciocarlie
- Abstract summary: In a Human-in-the-Loop paradigm, a robotic agent is able to act mostly autonomously in solving a task, but can request help from an external expert when needed.
We present a Reinforcement Learning based approach to this problem, where a semi-autonomous agent asks for external assistance when it has low confidence in the eventual success of the task.
We show that our method makes effective use of a limited budget of expert calls at run-time, despite having no access to the expert at training time.
- Score: 13.184897303302971
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: In a Human-in-the-Loop paradigm, a robotic agent is able to act mostly
autonomously in solving a task, but can request help from an external expert
when needed. However, knowing when to request such assistance is critical: too
few requests can lead to the robot making mistakes, but too many requests can
overload the expert. In this paper, we present a Reinforcement Learning based
approach to this problem, where a semi-autonomous agent asks for external
assistance when it has low confidence in the eventual success of the task. The
confidence level is computed by estimating the variance of the return from the
current state. We show that this estimate can be iteratively improved during
training using a Bellman-like recursion. On discrete navigation problems with
both fully- and partially-observable state information, we show that our method
makes effective use of a limited budget of expert calls at run-time, despite
having no access to the expert at training time.
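The abstract sketches the core mechanism: confidence is derived from the variance of the return from the current state, maintained during training by a Bellman-like recursion. Below is a minimal tabular sketch of one such recursion, using the standard second-moment identity M(s,a) = E[r^2 + 2*gamma*r*Q(s',a') + gamma^2*M(s',a')] with Var = M - Q^2; this is a plausible instantiation, not necessarily the authors' exact update, and all names and hyperparameters are illustrative.

```python
import numpy as np

# Minimal tabular sketch: learn the value Q and the second moment M of the
# return with TD updates, then read off a variance estimate Var = M - Q^2.
# The second-moment target follows the identity
#   M(s,a) = E[ r^2 + 2*gamma*r*Q(s',a') + gamma^2 * M(s',a') ],
# one "Bellman-like recursion" for return variance.

def td_update(Q, M, s, a, r, s_next, a_next, alpha=0.1, gamma=0.99):
    """One TD step for both the value estimate Q and the second moment M."""
    q_target = r + gamma * Q[s_next, a_next]
    m_target = r**2 + 2 * gamma * r * Q[s_next, a_next] + gamma**2 * M[s_next, a_next]
    Q[s, a] += alpha * (q_target - Q[s, a])
    M[s, a] += alpha * (m_target - M[s, a])

def return_variance(Q, M, s, a):
    """Variance of the return from (s, a); clipped since estimates can cross."""
    return max(M[s, a] - Q[s, a] ** 2, 0.0)

def should_ask_for_help(Q, M, s, a, threshold, budget_left):
    """Request the external expert when predicted-return variance is high
    (low confidence in eventual success) and the call budget allows it."""
    return budget_left > 0 and return_variance(Q, M, s, a) > threshold
```

In this sketch the help threshold is a free parameter; in practice it would be tuned so that the expected number of expert calls stays within the limited run-time budget the abstract mentions.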
Related papers
- Getting By Goal Misgeneralization With a Little Help From a Mentor [5.012314384895538]
This paper explores whether allowing an agent to ask for help from a supervisor in unfamiliar situations can mitigate goal misgeneralization.
We focus on agents trained with PPO in the CoinRun environment, a setting known to exhibit goal misgeneralization.
We find that methods based on the agent's internal state fail to proactively request help, instead waiting until mistakes have already occurred.
arXiv Detail & Related papers (2024-10-28T14:07:41Z)
- Automatic Evaluation of Excavator Operators using Learned Reward Functions [5.372817906484557]
We propose a novel strategy for the automatic evaluation of excavator operators.
We take into account the internal dynamics of the excavator and the safety criterion at every time step to evaluate the performance.
Our results demonstrate that policies learned using these external reward prediction models yield safer solutions.
arXiv Detail & Related papers (2022-11-15T06:58:00Z)
- When to Ask for Help: Proactive Interventions in Autonomous Reinforcement Learning [57.53138994155612]
A long-term goal of reinforcement learning is to design agents that can autonomously interact and learn in the world.
A critical challenge is the presence of irreversible states that require external assistance to recover from, such as when a robot arm has pushed an object off a table.
We propose an algorithm that efficiently learns to detect and avoid irreversible states, and proactively asks for help in case the agent does enter one (see the sketch below).
arXiv Detail & Related papers (2022-10-19T17:57:24Z)
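As referenced above, a toy version of the proactive-intervention trigger: here `p_irrev` is a hypothetical learned detector of irreversible states and `ask_expert` stands in for the external helper. Both names, the gym-style `env` interface, and the threshold are assumptions for illustration, not the paper's actual algorithm.

```python
# Hypothetical sketch of a proactive-intervention trigger: p_irrev(s) estimates
# the probability that state s is irreversible (e.g. an object pushed off the
# table).  The paper learns such a detector from data; here it is simply a
# placeholder function supplied by the caller.

def step_with_interventions(env, policy, p_irrev, ask_expert, tau=0.9):
    """Run one episode, proactively requesting help near irreversible states.

    p_irrev: state -> estimated probability the state is irreversible (assumed).
    ask_expert: state -> recovery action from the external helper (assumed).
    tau: confidence threshold above which the agent asks for help.
    """
    s = env.reset()
    done = False
    while not done:
        if p_irrev(s) > tau:
            a = ask_expert(s)          # proactive intervention
        else:
            a = policy(s)              # act autonomously
        s, r, done, info = env.step(a)
    return s
```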
- Learning to Guide Multiple Heterogeneous Actors from a Single Human Demonstration via Automatic Curriculum Learning in StarCraft II [0.5911087507716211]
In this work, we aim to train deep reinforcement learning agents that can command multiple heterogeneous actors.
Our results show that an agent trained via automated curriculum learning can outperform state-of-the-art deep reinforcement learning baselines.
arXiv Detail & Related papers (2022-05-11T21:53:11Z)
- Teachable Reinforcement Learning via Advice Distillation [161.43457947665073]
We propose a new supervision paradigm for interactive learning based on "teachable" decision-making systems that learn from structured advice provided by an external teacher.
We show that agents that learn from advice can acquire new skills with significantly less human supervision than standard reinforcement learning algorithms.
arXiv Detail & Related papers (2022-03-19T03:22:57Z)
- Inverse Online Learning: Understanding Non-Stationary and Reactionary Policies [79.60322329952453]
We show how to develop interpretable representations of how agents make decisions.
By modeling the decision-making processes underlying a set of observed trajectories as online learning, we cast policy inference as the inverse of that online learning problem.
We introduce a practical algorithm for retrospectively estimating such perceived effects, alongside the process through which agents update them.
Through application to the analysis of UNOS organ donation acceptance decisions, we demonstrate that our approach can bring valuable insights into the factors that govern decision processes and how they change over time.
arXiv Detail & Related papers (2022-03-14T17:40:42Z)
- MURAL: Meta-Learning Uncertainty-Aware Rewards for Outcome-Driven Reinforcement Learning [65.52675802289775]
We show that an uncertainty-aware classifier can solve challenging reinforcement learning problems.
We propose a novel method for computing the normalized maximum likelihood (NML) distribution (a toy version is sketched below).
We show that the resulting algorithm has a number of intriguing connections to both count-based exploration methods and prior algorithms for learning reward functions.
arXiv Detail & Related papers (2021-07-15T08:19:57Z)
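A toy version of the NML computation mentioned above, via brute-force conditional NML (CNML): refit the classifier once per candidate label for the query point, score each refit model on its hypothesized label, and normalize. scikit-learn's `LogisticRegression` stands in for the paper's classifier; MURAL's actual contribution is meta-learning to amortize exactly this expensive refitting, so the loop here is only illustrative.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Brute-force CNML for a binary success classifier.  Assumes the labeled
# data (X, y) already contains examples of both classes 0 and 1.

def cnml_success_probability(X, y, x_query):
    """p_CNML(success | x_query) with labels {0, 1}, success = label 1."""
    scores = []
    for label in (0, 1):
        # augment the dataset with the query point under the hypothesized label
        X_aug = np.vstack([X, x_query[None, :]])
        y_aug = np.append(y, label)
        model = LogisticRegression().fit(X_aug, y_aug)
        # likelihood the refit model assigns to its own hypothesized label
        scores.append(model.predict_proba(x_query[None, :])[0, label])
    return scores[1] / (scores[0] + scores[1])
```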
- PEBBLE: Feedback-Efficient Interactive Reinforcement Learning via Relabeling Experience and Unsupervised Pre-training [94.87393610927812]
We present an off-policy, interactive reinforcement learning algorithm that capitalizes on the strengths of both feedback and off-policy learning.
We demonstrate that our approach is capable of learning tasks of higher complexity than previously considered by human-in-the-loop methods (a minimal sketch of the preference-learning step follows below).
arXiv Detail & Related papers (2021-06-09T14:10:50Z)
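A minimal sketch of the preference-based reward-learning step that feedback-driven methods in this line (PEBBLE among them) build on: a reward model trained so the segment the human preferred has higher summed predicted reward, under a Bradley-Terry model. The network shape, observation dimension, and optimizer settings are illustrative assumptions, not the paper's.

```python
import torch
import torch.nn as nn

# Reward model r_psi over single observations; segment return is the sum of
# per-step predicted rewards.  obs_dim=4 is an arbitrary illustrative choice.
reward_model = nn.Sequential(nn.Linear(4, 64), nn.ReLU(), nn.Linear(64, 1))
opt = torch.optim.Adam(reward_model.parameters(), lr=3e-4)

def preference_loss(seg_a, seg_b, pref):
    """seg_a, seg_b: (T, obs_dim) tensors; pref: 1.0 if the human preferred
    segment A, 0.0 if segment B."""
    ret_a = reward_model(seg_a).sum()
    ret_b = reward_model(seg_b).sum()
    # probability that A is preferred under the Bradley-Terry model
    p_a = torch.sigmoid(ret_a - ret_b)
    return -(pref * torch.log(p_a) + (1 - pref) * torch.log(1 - p_a))

# one gradient step on a labeled preference pair (random data for illustration)
seg_a, seg_b = torch.randn(50, 4), torch.randn(50, 4)
loss = preference_loss(seg_a, seg_b, pref=1.0)
opt.zero_grad(); loss.backward(); opt.step()
```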
- AvE: Assistance via Empowerment [77.08882807208461]
We propose a new paradigm for assistance: rather than inferring the human's goal, the agent increases the human's ability to control their environment.
This task-agnostic objective preserves the person's autonomy and ability to achieve any eventual state.
arXiv Detail & Related papers (2020-06-26T04:40:11Z)
- Should artificial agents ask for help in human-robot collaborative problem-solving? [0.7251305766151019]
We propose to start from hypotheses derived from an empirical study of human-robot interaction.
We test whether receiving help from an expert while solving a simple closed-ended task accelerates the learning of that task.
Our experiments lead us to conclude that, whether the help is requested or not, a Q-learning algorithm benefits from expert help in the same way children do (a toy version is sketched below).
arXiv Detail & Related papers (2020-05-25T09:15:30Z)
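A toy version of the comparison described above: tabular Q-learning in which an expert occasionally supplies the action in place of the epsilon-greedy learner. The fixed help probability and the gym-style `env`/`expert` interfaces are assumptions for illustration, not the paper's protocol.

```python
import numpy as np

# Tabular Q-learning with occasional expert help: on a fraction help_prob of
# steps the expert chooses the action; otherwise the agent acts epsilon-greedily.

def q_learning_with_expert(env, expert, n_states, n_actions,
                           episodes=500, help_prob=0.1,
                           alpha=0.1, gamma=0.99, eps=0.1):
    Q = np.zeros((n_states, n_actions))
    rng = np.random.default_rng(0)
    for _ in range(episodes):
        s, done = env.reset(), False
        while not done:
            if rng.random() < help_prob:
                a = expert(s)                      # expert-provided action
            elif rng.random() < eps:
                a = int(rng.integers(n_actions))   # explore
            else:
                a = int(np.argmax(Q[s]))           # exploit
            s2, r, done, _ = env.step(a)
            Q[s, a] += alpha * (r + gamma * Q[s2].max() * (not done) - Q[s, a])
            s = s2
    return Q
```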