Learning Multiple Coordinated Agents under Directed Acyclic Graph Constraints
- URL: http://arxiv.org/abs/2307.07529v1
- Date: Thu, 13 Jul 2023 13:41:24 GMT
- Title: Learning Multiple Coordinated Agents under Directed Acyclic Graph Constraints
- Authors: Jaeyeon Jang, Diego Klabjan, Han Liu, Nital S. Patel, Xiuqi Li, Balakrishnan Ananthanarayanan, Husam Dauod, Tzung-Han Juang
- Abstract summary: This paper proposes a novel multi-agent reinforcement learning (MARL) method to learn multiple coordinated agents under directed acyclic graph (DAG) constraints.
Unlike existing MARL approaches, our method explicitly exploits the DAG structure between agents to achieve more effective learning performance.
- Score: 20.45657219304883
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper proposes a novel multi-agent reinforcement learning (MARL) method
to learn multiple coordinated agents under directed acyclic graph (DAG)
constraints. Unlike existing MARL approaches, our method explicitly exploits
the DAG structure between agents to achieve more effective learning
performance. Theoretically, we propose a novel surrogate value function based
on a MARL model with synthetic rewards (MARLM-SR) and prove that it serves as a
lower bound of the optimal value function. Computationally, we propose a
practical training algorithm that exploits the new notions of a leader agent
and a reward generator and distributor agent to guide the decomposed follower
agents to better explore the parameter space in environments with DAG
constraints. Empirically, we evaluate our method on four DAG environments,
including a real-world scheduling environment for one of Intel's high-volume
packaging and test factories, and show that it outperforms non-DAG approaches.
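To make the coordination pattern described above concrete, here is a minimal, illustrative Python sketch: follower agents act in topological order of a DAG, a leader broadcasts a goal, and a reward generator and distributor splits the episodic team reward into synthetic per-agent rewards. Everything in it (the example DAG, the toy objective, and all update rules) is an assumption made for exposition, not the paper's actual algorithm.

```python
# Hypothetical sketch only: all classes, updates, and the toy objective
# below are assumptions for exposition, not the paper's actual method.
import random
from typing import Dict, List

# Each agent may act only after all of its parent agents have acted.
DAG: Dict[str, List[str]] = {
    "A": [],
    "B": ["A"],
    "C": ["A"],
    "D": ["B", "C"],  # sink agent; its output determines the team reward
}

def topological_order(dag: Dict[str, List[str]]) -> List[str]:
    """Order agents so every agent appears after all of its parents."""
    order: List[str] = []
    seen = set()

    def visit(node: str) -> None:
        if node in seen:
            return
        seen.add(node)
        for parent in dag[node]:
            visit(parent)
        order.append(node)

    for node in dag:
        visit(node)
    return order

class Follower:
    """Hypothetical follower policy with one learnable scalar."""

    def __init__(self) -> None:
        self.bias = 0.0

    def act(self, parent_outputs: List[float], goal: float) -> float:
        return sum(parent_outputs) + random.gauss(self.bias, 1.0) + goal

    def update(self, synthetic_reward: float) -> None:
        self.bias += 0.01 * synthetic_reward  # toy local update

class Leader:
    """Hypothetical leader broadcasting a scalar goal each episode."""

    def __init__(self) -> None:
        self.goal = 0.0

    def propose_goal(self) -> float:
        return self.goal + random.gauss(0.0, 0.5)

    def update(self, team_reward: float) -> None:
        self.goal += 0.01 * team_reward  # toy global update

def distribute(team_reward: float, outputs: Dict[str, float]) -> Dict[str, float]:
    """Stand-in reward generator and distributor: splits the episodic
    team reward into synthetic per-agent rewards (MARLM-SR, loosely)."""
    share = team_reward / len(outputs)
    return {name: share for name in outputs}

def run_episode(followers: Dict[str, Follower], leader: Leader) -> float:
    goal = leader.propose_goal()
    outputs: Dict[str, float] = {}
    for name in topological_order(DAG):
        parents = [outputs[p] for p in DAG[name]]
        outputs[name] = followers[name].act(parents, goal)
    team_reward = -abs(outputs["D"] - 10.0)  # toy objective at the sink
    for name, reward in distribute(team_reward, outputs).items():
        followers[name].update(reward)
    leader.update(team_reward)
    return team_reward

followers = {name: Follower() for name in DAG}
leader = Leader()
for _ in range(200):
    episode_reward = run_episode(followers, leader)
print(f"last episode reward: {episode_reward:.2f}")
```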
Related papers
- From Novice to Expert: LLM Agent Policy Optimization via Step-wise Reinforcement Learning [62.54484062185869]
We introduce StepAgent, which utilizes step-wise reward to optimize the agent's reinforcement learning process.
We propose implicit-reward and inverse reinforcement learning techniques to facilitate agent reflection and policy adjustment.
arXiv Detail & Related papers (2024-11-06T10:35:11Z)
- Learning and Calibrating Heterogeneous Bounded Rational Market Behaviour with Multi-Agent Reinforcement Learning [4.40301653518681]
Agent-based models (ABMs) have shown promise for modelling various real world phenomena incompatible with traditional equilibrium analysis.
Recent developments in multi-agent reinforcement learning (MARL) offer a way to address this issue from a rationality perspective.
We propose a novel technique for representing heterogeneous processing-constrained agents within a MARL framework.
arXiv Detail & Related papers (2024-02-01T17:21:45Z)
- Causal Coordinated Concurrent Reinforcement Learning [8.654978787096807]
We propose a novel algorithmic framework for data sharing and coordinated exploration for the purpose of learning more data-efficient and better performing policies under a concurrent reinforcement learning setting.
Our algorithm leverages a causal inference algorithm in the form of Additive Noise Model - Mixture Model (ANM-MM) in extracting model parameters governing individual differentials via independence enforcement.
We propose a new data sharing scheme based on a similarity measure of the extracted model parameters and demonstrate superior learning speeds on a set of autoregressive, pendulum and cart-pole swing-up tasks.
arXiv Detail & Related papers (2024-01-31T17:20:28Z)
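The data-sharing idea in the entry above can be rendered as a toy sketch: pool experience only among agents whose extracted model parameters are close. The ANM-MM extraction itself is omitted; the parameters, buffers, and threshold below are invented placeholders, not the paper's implementation.

```python
# Hypothetical sketch: a plain float stands in for the model parameter
# ANM-MM would extract; buffers, values, and threshold are made up.
from typing import Dict, List, Tuple

agent_params: Dict[str, float] = {"a1": 0.10, "a2": 0.12, "a3": 0.95}
buffers: Dict[str, List[Tuple[str, str, float]]] = {
    "a1": [("s0", "left", 1.0)],
    "a2": [("s0", "left", 0.9)],
    "a3": [("s0", "right", -1.0)],
}

def shared_buffer(agent: str, threshold: float = 0.1) -> List[Tuple[str, str, float]]:
    """Pool an agent's own transitions with those of similar agents."""
    theta = agent_params[agent]
    pooled: List[Tuple[str, str, float]] = []
    for other, data in buffers.items():
        if abs(agent_params[other] - theta) <= threshold:
            pooled.extend(data)
    return pooled

print(shared_buffer("a1"))  # a1 pools with a2; a3 is too dissimilar
```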
- Let's reward step by step: Step-Level reward model as the Navigators for Reasoning [64.27898739929734]
Process-Supervised Reward Model (PRM) furnishes LLMs with step-by-step feedback during the training phase.
We propose a greedy search algorithm that employs the step-level feedback from PRM to optimize the reasoning pathways explored by LLMs.
To explore the versatility of our approach, we develop a novel method to automatically generate a step-level reward dataset for coding tasks and observe similarly improved performance on code generation tasks.
arXiv Detail & Related papers (2023-10-16T05:21:50Z)
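A minimal sketch of the greedy step-level search described in the entry above follows; `propose_steps` and `score_step` are hypothetical stand-ins for an LLM sampler and a trained PRM, so nothing here reflects the paper's exact procedure.

```python
# Hypothetical sketch: `propose_steps` and `score_step` stand in for an
# LLM sampler and a trained process-supervised reward model (PRM).
from typing import Callable, List

def greedy_reasoning_search(
    prompt: str,
    propose_steps: Callable[[str], List[str]],  # candidate next steps
    score_step: Callable[[str, str], float],    # PRM score for (prefix, step)
    max_steps: int = 8,
) -> List[str]:
    """Keep only the highest-scoring candidate at each depth (greedy)."""
    prefix, path = prompt, []
    for _ in range(max_steps):
        candidates = propose_steps(prefix)
        if not candidates:
            break
        best = max(candidates, key=lambda step: score_step(prefix, step))
        path.append(best)
        prefix = prefix + "\n" + best
    return path

# Toy usage with a stub sampler and scorer:
print(greedy_reasoning_search(
    "Solve 2+2.",
    propose_steps=lambda p: [] if "= 4" in p else ["2+2 = 4", "2+2 = 5"],
    score_step=lambda p, s: 1.0 if "4" in s else 0.0,
))  # ['2+2 = 4']
```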
- Deep Multi-Agent Reinforcement Learning for Decentralized Active Hypothesis Testing [11.639503711252663]
We tackle the multi-agent active hypothesis testing (AHT) problem by introducing a novel algorithm rooted in the framework of deep multi-agent reinforcement learning.
We present a comprehensive set of experimental results that effectively showcase the agents' ability to learn collaborative strategies and enhance performance.
arXiv Detail & Related papers (2023-09-14T01:18:04Z)
- Context-Aware Bayesian Network Actor-Critic Methods for Cooperative Multi-Agent Reinforcement Learning [7.784991832712813]
We introduce a Bayesian network to model correlations between agents' action selections in their joint policy.
We develop practical algorithms to learn the context-aware Bayesian network policies.
Empirical results on a range of MARL benchmarks show the benefits of our approach.
arXiv Detail & Related papers (2023-06-02T21:22:27Z)
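For intuition on the entry above: a correlated joint policy can be sampled by ordering agents along the network and conditioning each agent's action distribution on its parents' sampled actions. The two-agent network and probabilities below are invented for illustration and are not from the paper.

```python
# Hypothetical sketch: a two-agent Bayesian network over actions, with
# made-up conditional probabilities; agent2's action depends on agent1's.
import random

def sample_joint_action() -> dict:
    actions = {}
    # Root of the network: agent1 acts unconditionally.
    actions["agent1"] = random.choice(["push", "wait"])
    # Child node: agent2's distribution shifts with agent1's choice.
    if actions["agent1"] == "push":
        actions["agent2"] = "pull" if random.random() < 0.9 else "wait"
    else:
        actions["agent2"] = "wait" if random.random() < 0.9 else "pull"
    return actions

print(sample_joint_action())  # e.g. {'agent1': 'push', 'agent2': 'pull'}
```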
- Learning Reward Machines in Cooperative Multi-Agent Tasks [75.79805204646428]
This paper presents a novel approach to Multi-Agent Reinforcement Learning (MARL).
It combines cooperative task decomposition with the learning of reward machines (RMs) encoding the structure of the sub-tasks.
The proposed method helps deal with the non-Markovian nature of the rewards in partially observable environments.
arXiv Detail & Related papers (2023-03-24T15:12:28Z)
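Reward machines, as in the entry above, are commonly formalized as finite automata whose transitions fire on high-level events and emit rewards. The toy machine below (events and task invented here, not taken from the paper) shows how a sequencing constraint that is non-Markovian in raw observations becomes Markovian once the machine state is tracked.

```python
# Toy reward machine; the events and task are invented for illustration.
# (machine_state, event) -> (next_state, reward); unseen pairs self-loop.
REWARD_MACHINE = {
    ("u0", "got_key"):   ("u1", 0.0),
    ("u1", "door_open"): ("u2", 1.0),  # rewarded only after key THEN door
}

def rm_step(state: str, event: str):
    """Advance the machine on one event; unknown events leave it in place."""
    return REWARD_MACHINE.get((state, event), (state, 0.0))

state, total = "u0", 0.0
for event in ["door_open", "got_key", "door_open"]:  # order matters
    state, reward = rm_step(state, event)
    total += reward
print(state, total)  # u2 1.0 -- reward issued only on the correct sequence
```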
- Efficient Model-based Multi-agent Reinforcement Learning via Optimistic Equilibrium Computation [93.52573037053449]
H-MARL (Hallucinated Multi-Agent Reinforcement Learning) learns successful equilibrium policies after a few interactions with the environment.
We demonstrate our approach experimentally on an autonomous driving simulation benchmark.
arXiv Detail & Related papers (2022-03-14T17:24:03Z)
- Model-based Multi-agent Policy Optimization with Adaptive Opponent-wise Rollouts [52.844741540236285]
This paper investigates model-based methods in multi-agent reinforcement learning (MARL).
We propose a novel decentralized model-based MARL method, named Adaptive Opponent-wise Rollout Policy (AORPO).
arXiv Detail & Related papers (2021-05-07T16:20:22Z)
- Meta Reinforcement Learning with Autonomous Inference of Subtask Dependencies [57.27944046925876]
We propose and address a novel few-shot RL problem, where a task is characterized by a subtask graph.
Instead of directly learning a meta-policy, we develop a Meta-learner with Subtask Graph Inference.
Our experiment results on two grid-world domains and StarCraft II environments show that the proposed method is able to accurately infer the latent task parameter.
arXiv Detail & Related papers (2020-01-01T17:34:00Z)
This list is automatically generated from the titles and abstracts of the papers on this site.