Learning to Assist Agents by Observing Them
- URL: http://arxiv.org/abs/2110.01311v1
- Date: Mon, 4 Oct 2021 10:38:59 GMT
- Title: Learning to Assist Agents by Observing Them
- Authors: Antti Keurulainen (1 and 3), Isak Westerlund (3), Samuel Kaski (1 and 2), and Alexander Ilin (1)
- Affiliations: (1) Helsinki Institute for Information Technology HIIT, Department of Computer Science, Aalto University; (2) Department of Computer Science, University of Manchester; (3) Bitville Oy, Espoo, Finland
- Abstract summary: We introduce methods where the capability to create a representation of the behavior is first pre-trained with offline data.
We test the setting in a gridworld where the helper agent has the capability to manipulate the environment of the assisted artificial agents.
- Score: 41.74498230885008
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The ability of an AI agent to assist other agents, such as humans, is an
important and challenging goal, which requires the assisting agent to reason
about the behavior and infer the goals of the assisted agent. Training such an
ability by using reinforcement learning usually requires large amounts of
online training, which is difficult and costly. On the other hand, offline data
about the behavior of the assisted agent might be available, but is non-trivial
to take advantage of by methods such as offline reinforcement learning. We
introduce methods where the capability to create a representation of the
behavior is first pre-trained with offline data, after which only a small
amount of interaction data is needed to learn an assisting policy. We test the
setting in a gridworld where the helper agent has the capability to manipulate
the environment of the assisted artificial agents, and introduce three
different scenarios where the assistance considerably improves the performance
of the assisted agents.
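The two-stage recipe in the abstract (pre-train a behavior representation on offline data, then learn the assisting policy from a small amount of interaction) can be sketched as follows. This is a minimal illustration assuming a PyTorch setup; the module names, dimensions, and the next-action-prediction pre-training objective are our assumptions for illustration, not the paper's actual architecture or loss.

```python
# Minimal sketch of the two-stage approach described in the abstract.
# All names (BehaviorEncoder, AssistPolicy) and dims are illustrative.
import torch
import torch.nn as nn

OBS_DIM, N_ACTIONS, EMB_DIM = 16, 5, 32

class BehaviorEncoder(nn.Module):
    """Summarizes an observed (obs, action) trajectory into an embedding."""
    def __init__(self):
        super().__init__()
        self.rnn = nn.GRU(OBS_DIM + N_ACTIONS, EMB_DIM, batch_first=True)
        self.head = nn.Linear(EMB_DIM, N_ACTIONS)  # next-action prediction

    def forward(self, traj):
        out, _ = self.rnn(traj)         # (B, T, EMB_DIM)
        return out, self.head(out)      # embeddings and action logits

# --- Stage 1: pre-train the representation on offline behavior data. ---
enc = BehaviorEncoder()
opt = torch.optim.Adam(enc.parameters(), lr=1e-3)
for _ in range(100):                                   # toy epoch count
    traj = torch.randn(32, 20, OBS_DIM + N_ACTIONS)    # stand-in for logged data
    target = torch.randint(0, N_ACTIONS, (32, 20))     # stand-in next actions
    _, logits = enc(traj)
    loss = nn.functional.cross_entropy(
        logits.reshape(-1, N_ACTIONS), target.reshape(-1))
    opt.zero_grad()
    loss.backward()
    opt.step()

# --- Stage 2: learn an assisting policy conditioned on the frozen
# embedding, using only a little online interaction (RL loop elided). ---
for p in enc.parameters():
    p.requires_grad_(False)

class AssistPolicy(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(OBS_DIM + EMB_DIM, 64), nn.ReLU(),
            nn.Linear(64, N_ACTIONS))

    def forward(self, helper_obs, behavior_emb):
        return self.net(torch.cat([helper_obs, behavior_emb], dim=-1))

policy = AssistPolicy()
emb, _ = enc(torch.randn(1, 20, OBS_DIM + N_ACTIONS))
logits = policy(torch.randn(1, OBS_DIM), emb[:, -1])   # last-step embedding
action = torch.distributions.Categorical(logits=logits).sample()
```

Freezing the pre-trained encoder is the point of the recipe: stage 2 gets by with little interaction because the assisting policy only has to map an already-informative behavior embedding to helpful actions.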
Related papers
- Learning to Assist Humans without Inferring Rewards [65.28156318196397]
We build upon prior work that studies assistance through the lens of empowerment.
An assistive agent aims to maximize the influence of the human's actions.
We prove that these representations estimate a similar notion of empowerment to that studied by prior work.
arXiv Detail & Related papers (2024-11-04T21:31:04Z)
- Smart Help: Strategic Opponent Modeling for Proactive and Adaptive Robot Assistance in Households [30.33911147366425]
Smart Help aims to provide proactive yet adaptive support to human agents with diverse disabilities.
We introduce an innovative opponent modeling module that provides a nuanced understanding of the main agent's capabilities and goals.
Our findings illustrate the potential of AI-imbued assistive robots in improving the well-being of vulnerable groups.
arXiv Detail & Related papers (2024-04-13T13:03:59Z)
- NOPA: Neurally-guided Online Probabilistic Assistance for Building Socially Intelligent Home Assistants [79.27554831580309]
We study how to build socially intelligent robots to assist people in their homes.
We focus on assistance with online goal inference, where robots must infer humans' goals while simultaneously helping to achieve them.
arXiv Detail & Related papers (2023-01-12T18:59:34Z)
- Learning to Guide Multiple Heterogeneous Actors from a Single Human Demonstration via Automatic Curriculum Learning in StarCraft II [0.5911087507716211]
In this work, we aim to train deep reinforcement learning agents that can command multiple heterogeneous actors.
Our results show that an agent trained via automated curriculum learning can outperform state-of-the-art deep reinforcement learning baselines.
arXiv Detail & Related papers (2022-05-11T21:53:11Z)
- Behaviour-conditioned policies for cooperative reinforcement learning tasks [41.74498230885008]
In various real-world tasks, an agent needs to cooperate with partner agents of unknown types.
Deep reinforcement learning models can be trained to deliver the required functionality but are known to suffer from sample inefficiency and slow learning.
We suggest a method in which we synthetically produce populations of agents with different behavioural patterns, together with ground-truth data about their behaviour.
We additionally suggest an agent architecture that can efficiently use the generated data and gain meta-learning capability.
arXiv Detail & Related papers (2021-10-04T09:16:41Z)
- PEBBLE: Feedback-Efficient Interactive Reinforcement Learning via Relabeling Experience and Unsupervised Pre-training [94.87393610927812]
We present an off-policy, interactive reinforcement learning algorithm that capitalizes on the strengths of both feedback and off-policy learning.
We demonstrate that our approach is capable of learning tasks of higher complexity than previously considered by human-in-the-loop methods.
arXiv Detail & Related papers (2021-06-09T14:10:50Z)
- Coverage as a Principle for Discovering Transferable Behavior in Reinforcement Learning [16.12658895065585]
We argue that representation alone is not enough for efficient transfer in challenging domains and explore how to transfer knowledge through behavior.
The behavior of pre-trained policies may be used for solving the task at hand (exploitation) or for collecting useful data to solve the problem (exploration).
arXiv Detail & Related papers (2021-02-24T16:51:02Z)
- Parrot: Data-Driven Behavioral Priors for Reinforcement Learning [79.32403825036792]
We propose a method for pre-training behavioral priors that can capture complex input-output relationships observed in successful trials.
We show how this learned prior can be used for rapidly learning new tasks without impeding the RL agent's ability to try out novel behaviors.
arXiv Detail & Related papers (2020-11-19T18:47:40Z)
- AvE: Assistance via Empowerment [77.08882807208461]
We propose a new paradigm for assistance by instead increasing the human's ability to control their environment.
This task-agnostic objective preserves the person's autonomy and ability to achieve any eventual state (see the formal sketch of empowerment after this list).
arXiv Detail & Related papers (2020-06-26T04:40:11Z)
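Two entries above (AvE: Assistance via Empowerment, and Learning to Assist Humans without Inferring Rewards) build on the notion of empowerment. For reference, the standard information-theoretic definition from the empowerment literature (Klyubin et al.'s formulation, not a formula quoted from these abstracts) is the channel capacity between an agent's actions and its subsequent state:

```latex
% n-step empowerment of a state s: the channel capacity from an n-step
% action sequence A^n to the resulting state S', maximized over the
% distribution of action sequences.
\mathcal{E}_n(s) \;=\; \max_{p(a^n)}\; I\big(A^n;\, S' \,\big|\, s\big)
\;=\; \max_{p(a^n)} \Big[ H\big(S' \,\big|\, s\big) - H\big(S' \,\big|\, A^n, s\big) \Big]
```

Under this reading, an empowerment-based assistant acts to increase the human's ability to influence future states rather than to optimize an inferred reward, which is what makes the objective task-agnostic.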
This list is automatically generated from the titles and abstracts of the papers on this site.