Learning from Multiple Independent Advisors in Multi-agent Reinforcement Learning
- URL: http://arxiv.org/abs/2301.11153v1
- Date: Thu, 26 Jan 2023 15:00:23 GMT
- Title: Learning from Multiple Independent Advisors in Multi-agent Reinforcement Learning
- Authors: Sriram Ganapathi Subramanian, Matthew E. Taylor, Kate Larson and Mark Crowley
- Abstract summary: This paper considers the problem of simultaneously learning from multiple independent advisors in multi-agent reinforcement learning.
We provide principled algorithms that incorporate a set of advisors by both evaluating the advisors at each state and subsequently using the advisors to guide action selection.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Multi-agent reinforcement learning typically suffers from the problem of
sample inefficiency, where learning suitable policies involves the use of many
data samples. Learning from external demonstrators is a possible solution that
mitigates this problem. However, most prior approaches in this area assume the
presence of a single demonstrator. Leveraging multiple knowledge sources (i.e.,
advisors) with expertise in distinct aspects of the environment could
substantially speed up learning in complex environments. This paper considers
the problem of simultaneously learning from multiple independent advisors in
multi-agent reinforcement learning. The approach leverages a two-level
Q-learning architecture, and extends this framework from single-agent to
multi-agent settings. We provide principled algorithms that incorporate a set
of advisors by both evaluating the advisors at each state and subsequently
using the advisors to guide action selection. We also provide theoretical
convergence and sample complexity guarantees. Experimentally, we validate our
approach in three different test-beds and show that our algorithms achieve better
performance than baselines, can effectively integrate the combined expertise
of different advisors, and learn to ignore bad advice.
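For intuition, the following is a minimal tabular sketch in Python of the two-level idea described above: one level estimates how useful each advisor is in a given state, the other level is an ordinary action-value function, and advice is followed only while it looks at least as good as acting greedily. This is an illustrative assumption-laden sketch, not the paper's actual algorithm or code; the class name TwoLevelQLearner, its methods, and the specific arbitration rule between advice and the agent's own greedy action are hypothetical.

```python
import random
from collections import defaultdict


class TwoLevelQLearner:
    """Minimal tabular sketch of two-level Q-learning with multiple advisors.

    Level 1 ("evaluation") learns a value per (state, advisor) estimating how
    useful that advisor's advice is in that state.  Level 2 ("decision") learns
    an ordinary action-value function.  Illustrative only; the paper's formal
    updates and multi-agent treatment differ.
    """

    def __init__(self, actions, advisors, alpha=0.1, gamma=0.95, epsilon=0.1):
        self.actions = actions            # list of available actions
        self.advisors = advisors          # list of callables: advisor(state) -> action
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon
        self.q_action = defaultdict(float)   # (state, action) -> value
        self.q_advisor = defaultdict(float)  # (state, advisor_idx) -> value

    def select_action(self, state):
        if random.random() < self.epsilon:
            return random.choice(self.actions)
        # Highest-rated advisor's suggestion vs. the agent's own greedy action.
        best_adv = max(range(len(self.advisors)),
                       key=lambda i: self.q_advisor[(state, i)])
        advised = self.advisors[best_adv](state)
        greedy = max(self.actions, key=lambda a: self.q_action[(state, a)])
        # Follow advice only while it looks at least as good as acting greedily;
        # this is what lets the learner ignore consistently bad advisors.
        if self.q_advisor[(state, best_adv)] >= self.q_action[(state, greedy)]:
            return advised
        return greedy

    def update(self, state, action, reward, next_state):
        target = reward + self.gamma * max(
            self.q_action[(next_state, a)] for a in self.actions)
        # Level 2: standard Q-learning update for the executed action.
        self.q_action[(state, action)] += self.alpha * (
            target - self.q_action[(state, action)])
        # Level 1: credit every advisor that would have recommended this action.
        for i, advisor in enumerate(self.advisors):
            if advisor(state) == action:
                self.q_advisor[(state, i)] += self.alpha * (
                    target - self.q_advisor[(state, i)])
```

In this sketch the advisor-level values let the learner rank advisors per state, which is the mechanism that allows it to exploit advisors with expertise in different parts of the environment while discounting those whose advice is consistently poor.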
Related papers
- From Novice to Expert: LLM Agent Policy Optimization via Step-wise Reinforcement Learning
We introduce StepAgent, which utilizes step-wise reward to optimize the agent's reinforcement learning process.
We propose implicit-reward and inverse reinforcement learning techniques to facilitate agent reflection and policy adjustment.
arXiv Detail & Related papers (2024-11-06T10:35:11Z)
- Two-stage Learning-to-Defer for Multi-Task Learning
We introduce a Learning-to-Defer approach for multi-task learning that encompasses both classification and regression tasks.
Our two-stage approach utilizes a rejector that defers decisions to the most accurate agent among a pre-trained joint-regressor model and one or more external experts.
arXiv Detail & Related papers (2024-10-21T07:44:57Z)
- MADDM: Multi-Advisor Dynamic Binary Decision-Making by Maximizing the Utility
We propose a novel strategy for optimally selecting a set of advisors in a sequential binary decision-making setting.
We assume no access to ground truth and no prior knowledge about the reliability of advisors.
arXiv Detail & Related papers (2023-05-15T14:13:47Z)
- On the Complexity of Multi-Agent Decision Making: From Learning in Games to Partial Monitoring
A central problem in the theory of multi-agent reinforcement learning (MARL) is to understand what structural conditions and algorithmic principles lead to sample-efficient learning guarantees.
We study this question in a general framework for interactive decision making with multiple agents.
We show that characterizing the statistical complexity of multi-agent decision making is equivalent to characterizing the statistical complexity of single-agent decision making.
arXiv Detail & Related papers (2023-05-01T06:46:22Z)
- Variational Distillation for Multi-View Learning
We design several variational information bottlenecks to exploit two key characteristics for multi-view representation learning.
Under a rigorous theoretical guarantee, our approach enables IB to grasp the intrinsic correlation between observations and semantic labels.
arXiv Detail & Related papers (2022-06-20T03:09:46Z)
- Investigation of Independent Reinforcement Learning Algorithms in Multi-Agent Environments
We show that independent algorithms can perform on par with multi-agent algorithms in cooperative and competitive settings.
We also show that adding recurrence improves the learning of independent algorithms in cooperative partially observable environments.
arXiv Detail & Related papers (2021-11-01T17:14:38Z)
- Multi-Agent Advisor Q-Learning
We provide a principled framework for incorporating action recommendations from online sub-optimal advisors in multi-agent settings.
We present two novel Q-learning based algorithms: ADMIRAL - Decision Making (ADMIRAL-DM) and ADMIRAL - Advisor Evaluation (ADMIRAL-AE).
We analyze the algorithms theoretically and provide fixed-point guarantees regarding their learning in general-sum games.
arXiv Detail & Related papers (2021-10-26T00:21:15Z)
- Q-Mixing Network for Multi-Agent Pathfinding in Partially Observable Grid Environments
We consider the problem of multi-agent navigation in partially observable grid environments.
We suggest utilizing the reinforcement learning approach, where the agents first learn policies that map observations to actions and then follow these policies to reach their goals.
arXiv Detail & Related papers (2021-08-13T09:44:47Z)
- Learning with Instance Bundles for Reading Comprehension
We introduce new supervision techniques that compare question-answer scores across multiple related instances.
Specifically, we normalize these scores across various neighborhoods of closely contrasting questions and/or answers.
We empirically demonstrate the effectiveness of training with instance bundles on two datasets.
arXiv Detail & Related papers (2021-04-18T06:17:54Z)
- Shared Experience Actor-Critic for Multi-Agent Reinforcement Learning
Our proposed algorithm, called Shared Experience Actor-Critic (SEAC), applies experience sharing in an actor-critic framework.
We evaluate SEAC in a collection of sparse-reward multi-agent environments and find that it consistently outperforms two baselines and two state-of-the-art algorithms.
arXiv Detail & Related papers (2020-06-12T13:24:50Z)
- Provable Representation Learning for Imitation Learning via Bi-level Optimization
A common strategy in modern learning systems is to learn a representation that is useful for many tasks.
We study this strategy in the imitation learning setting for Markov decision processes (MDPs) where multiple experts' trajectories are available.
We instantiate this framework for the imitation learning settings of behavior cloning and observation-alone.
arXiv Detail & Related papers (2020-02-24T21:03:52Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the information and is not responsible for any consequences of its use.