Related papers: Learning to Assist Agents by Observing Them

URL: http://arxiv.org/abs/2110.01311v1
Date: Mon, 4 Oct 2021 10:38:59 GMT
Title: Learning to Assist Agents by Observing Them
Authors: Antti Keurulainen (1 and 3), Isak Westerlund (3), Samuel Kaski (1 and 2), and Alexander Ilin (1) ((1) Helsinki Institute for Information Technology HIIT, Department of Computer Science, Aalto University, (2) Department of Computer Science, University of Manchester, (3) Bitville Oy, Espoo, Finland)
Abstract summary: We introduce methods where the capability to create a representation of the behavior is first pre-trained with offline data. We test the setting in a gridworld where the helper agent has the capability to manipulate the environment of the assisted artificial agents.
Score: 41.74498230885008
License: http://creativecommons.org/licenses/by/4.0/
Abstract: The ability of an AI agent to assist other agents, such as humans, is an important and challenging goal, which requires the assisting agent to reason about the behavior and infer the goals of the assisted agent. Training such an ability by using reinforcement learning usually requires large amounts of online training, which is difficult and costly. On the other hand, offline data about the behavior of the assisted agent might be available, but is non-trivial to take advantage of by methods such as offline reinforcement learning. We introduce methods where the capability to create a representation of the behavior is first pre-trained with offline data, after which only a small amount of interaction data is needed to learn an assisting policy. We test the setting in a gridworld where the helper agent has the capability to manipulate the environment of the assisted artificial agents, and introduce three different scenarios where the assistance considerably improves the performance of the assisted agents.

Related papers

Self-Regulation and Requesting Interventions [63.5863047447313]
We propose an offline framework that trains a "helper" policy to request interventions. We score optimal intervention timing with PRMs and train the helper model on these labeled trajectories. This offline approach significantly reduces costly intervention calls during training.
arXiv Detail & Related papers (2025-02-07T00:06:17Z)
Strategy Masking: A Method for Guardrails in Value-based Reinforcement Learning Agents [0.27309692684728604]
We study methods for constructing guardrails for AI agents that use reward functions to learn decision making. We introduce a novel approach, which we call strategy masking, to explicitly learn and then suppress undesirable AI agent behavior.
arXiv Detail & Related papers (2025-01-09T18:43:05Z)
Efficient Training in Multi-Agent Reinforcement Learning: A Communication-Free Framework for the Box-Pushing Problem [0.5524804393257919]
This paper proposes a model called Shared Pool of Information (SPI) for self-organizing systems. SPI enables information to be accessible to all agents and facilitates coordination, reducing force conflicts among agents and enhancing exploration efficiency.
arXiv Detail & Related papers (2024-11-19T05:51:10Z)
Learning to Assist Humans without Inferring Rewards [65.28156318196397]
We build upon prior work that studies assistance through the lens of empowerment. An assistive agent aims to maximize the influence of the human's actions. We prove that these representations estimate a similar notion of empowerment to that studied by prior work.
arXiv Detail & Related papers (2024-11-04T21:31:04Z)
Smart Help: Strategic Opponent Modeling for Proactive and Adaptive Robot Assistance in Households [30.33911147366425]
Smart Help aims to provide proactive yet adaptive support to human agents with diverse disabilities. We introduce an innovative opponent modeling module that provides a nuanced understanding of the main agent's capabilities and goals. Our findings illustrate the potential of AI-imbued assistive robots in improving the well-being of vulnerable groups.
arXiv Detail & Related papers (2024-04-13T13:03:59Z)
NOPA: Neurally-guided Online Probabilistic Assistance for Building Socially Intelligent Home Assistants [79.27554831580309]
We study how to build socially intelligent robots to assist people in their homes. We focus on assistance with online goal inference, where robots must simultaneously infer humans' goals.
arXiv Detail & Related papers (2023-01-12T18:59:34Z)
Learning to Guide Multiple Heterogeneous Actors from a Single Human Demonstration via Automatic Curriculum Learning in StarCraft II [0.5911087507716211]
In this work, we aim to train deep reinforcement learning agents that can command multiple heterogeneous actors. Our results show that an agent trained via automated curriculum learning can outperform state-of-the-art deep reinforcement learning baselines.
arXiv Detail & Related papers (2022-05-11T21:53:11Z)
Behaviour-conditioned policies for cooperative reinforcement learning tasks [41.74498230885008]
In various real-world tasks, an agent needs to cooperate with unknown partner agent types. Deep reinforcement learning models can be trained to deliver the required functionality but are known to suffer from sample inefficiency and slow learning. We suggest a method, where we synthetically produce populations of agents with different behavioural patterns together with ground truth data of their behaviour. We additionally suggest an agent architecture, which can efficiently use the generated data and gain the meta-learning capability.
arXiv Detail & Related papers (2021-10-04T09:16:41Z)
PEBBLE: Feedback-Efficient Interactive Reinforcement Learning via Relabeling Experience and Unsupervised Pre-training [94.87393610927812]
We present an off-policy, interactive reinforcement learning algorithm that capitalizes on the strengths of both feedback and off-policy learning. We demonstrate that our approach is capable of learning tasks of higher complexity than previously considered by human-in-the-loop methods.
arXiv Detail & Related papers (2021-06-09T14:10:50Z)
Coverage as a Principle for Discovering Transferable Behavior in Reinforcement Learning [16.12658895065585]
We argue that representation alone is not enough for efficient transfer in challenging domains and explore how to transfer knowledge through behavior. The behavior of pre-trained policies may be used for solving the task at hand (exploitation) or for collecting useful data to solve the problem (exploration).
arXiv Detail & Related papers (2021-02-24T16:51:02Z)
Parrot: Data-Driven Behavioral Priors for Reinforcement Learning [79.32403825036792]
We propose a method for pre-training behavioral priors that can capture complex input-output relationships observed in successful trials. We show how this learned prior can be used for rapidly learning new tasks without impeding the RL agent's ability to try out novel behaviors.
arXiv Detail & Related papers (2020-11-19T18:47:40Z)
AvE: Assistance via Empowerment [77.08882807208461]
We propose a new paradigm for assistance by instead increasing the human's ability to control their environment. This task-agnostic objective preserves the person's autonomy and ability to achieve any eventual state.
arXiv Detail & Related papers (2020-06-26T04:40:11Z)

This list is automatically generated from the titles and abstracts of the papers on this site.