Coagent Networks Revisited
- URL: http://arxiv.org/abs/2001.10474v3
- Date: Wed, 30 Aug 2023 00:10:47 GMT
- Title: Coagent Networks Revisited
- Authors: Modjtaba Shokrian Zini, Mohammad Pedramfar, Matthew Riemer, Ahmadreza Moradipari, Miao Liu
- Abstract summary: Coagent networks formalize the concept of arbitrary networks of agents that collaborate to take actions in a reinforcement learning environment.
We first provide a unifying perspective on the many diverse examples that fall under coagent networks.
We do so by formalizing the rules of execution in a coagent network, enabled by the novel and intuitive idea of execution paths.
- Score: 10.45819881530349
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Coagent networks formalize the concept of arbitrary networks of stochastic
agents that collaborate to take actions in a reinforcement learning
environment. Prominent examples of coagent networks in action include
approaches to hierarchical reinforcement learning (HRL), such as those using
options, which attempt to address the exploration-exploitation trade-off by
introducing abstract actions at different levels, realized by sequencing multiple
stochastic networks within the HRL agents. We first provide a unifying
perspective on the many diverse examples that fall under coagent networks. We
do so by formalizing the rules of execution in a coagent network, enabled by
the novel and intuitive idea of execution paths. Motivated
by parameter sharing in the hierarchical option-critic architecture, we revisit
the coagent network theory and achieve a much shorter proof of the policy
gradient theorem using our idea of execution paths, without any assumption on
how parameters are shared among coagents. We then generalize our setting and
proof to include the scenario where coagents act asynchronously. This new
perspective and theorem also lead to more mathematically accurate and
performant algorithms than those in the existing literature. Lastly, by running
nonstationary RL experiments, we survey the performance and properties of
different generalizations of option-critic models.
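
To make the execution-path idea concrete, here is a minimal, hypothetical sketch (the architecture and names are ours, not the paper's): a two-level coagent network in which a step's execution path is the set of coagents that actually fired, and each coagent on the path applies its own local REINFORCE update with the shared return, with no assumption about parameter sharing and no backpropagation across coagents.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

class Coagent:
    """A tabular stochastic policy: maps a discrete input to a distribution
    over outputs. Hypothetical interface, not the paper's exact formalism."""
    def __init__(self, n_inputs, n_outputs, lr=0.1):
        self.theta = np.zeros((n_inputs, n_outputs))
        self.lr = lr

    def act(self, x):
        p = softmax(self.theta[x])
        return rng.choice(len(p), p=p), p

    def reinforce(self, x, a, p, ret):
        # Local REINFORCE step with the shared return:
        # grad log pi(a|x) for a softmax policy is (one_hot(a) - p).
        grad = -p
        grad[a] += 1.0
        self.theta[x] += self.lr * ret * grad

# A two-level network: a "manager" coagent picks which worker fires,
# and only that worker emits the primitive action. The executed pair
# (manager, chosen worker) is the execution path for this step.
manager = Coagent(n_inputs=1, n_outputs=2)
workers = [Coagent(n_inputs=1, n_outputs=3) for _ in range(2)]

def step_and_update(ret_fn):
    path = []  # records (coagent, input, output, probs) along the path
    w_idx, p_m = manager.act(0)
    path.append((manager, 0, w_idx, p_m))
    a, p_w = workers[w_idx].act(0)
    path.append((workers[w_idx], 0, a, p_w))
    ret = ret_fn(a)  # shared return for the primitive action
    # Every coagent ON the execution path updates locally; coagents
    # off the path receive no update this step.
    for coagent, x, out, p in path:
        coagent.reinforce(x, out, p, ret)
    return a, ret
```

With a bandit-style return such as `lambda a: float(a == 2)`, repeated calls concentrate probability on the manager-worker paths that reach the rewarded action, illustrating how local updates along execution paths drive the whole network.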
Related papers
- Scalable spectral representations for network multiagent control [53.631272539560435]
Network Markov Decision Processes (MDPs), a popular model for multi-agent control, pose a significant challenge to efficient learning.
We first derive scalable spectral local representations for network MDPs, which induce a network linear subspace for the local $Q$-function of each agent.
We design a scalable algorithmic framework for continuous state-action network MDPs, and provide end-to-end guarantees for the convergence of our algorithm.
arXiv Detail & Related papers (2024-10-22T17:45:45Z)
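
As an illustration of the entry above: a local $Q$-function constrained to a linear subspace can be sketched as below. The random cosine features are only a stand-in for the paper's learned spectral representations; the names and dimensions are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)

D = 16                       # number of basis functions (assumed)
W = rng.normal(size=(D, 4))  # random projection standing in for the spectral basis

def features(local_sa):
    """Map an agent's local state-action vector to basis coefficients.
    Random cosine features are a stand-in for learned spectral features."""
    return np.cos(local_sa @ W.T)

class LocalLinearQ:
    """Q_i(s, a) = w_i . phi(s, a): the agent's Q-function is constrained
    to the span of the basis, so only the coefficient vector w_i is learned."""
    def __init__(self, lr=0.05):
        self.w = np.zeros(D)
        self.lr = lr

    def value(self, local_sa):
        return features(local_sa) @ self.w

    def td_update(self, local_sa, target):
        # Semi-gradient TD step entirely in the low-dimensional subspace.
        phi = features(local_sa)
        self.w += self.lr * (target - phi @ self.w) * phi

q_i = LocalLinearQ()
sa = rng.normal(size=4)  # agent i's local state-action input
q_i.td_update(sa, target=1.0)
print(q_i.value(sa))
```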
- CoPS: Empowering LLM Agents with Provable Cross-Task Experience Sharing [70.25689961697523]
We propose a generalizable algorithm that enhances sequential reasoning by cross-task experience sharing and selection.
Our work bridges the gap between existing sequential reasoning paradigms and validates the effectiveness of leveraging cross-task experiences.
arXiv Detail & Related papers (2024-10-22T03:59:53Z)
- Context-Aware Bayesian Network Actor-Critic Methods for Cooperative Multi-Agent Reinforcement Learning [7.784991832712813]
We introduce a Bayesian network to model correlations between agents' action selections in their joint policy.
We develop practical algorithms to learn the context-aware Bayesian network policies.
Empirical results on a range of MARL benchmarks show the benefits of our approach.
arXiv Detail & Related papers (2023-06-02T21:22:27Z)
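
To illustrate the Bayesian-network actor-critic idea above: sampling a correlated joint action amounts to visiting agents in a topological order of the network and conditioning each agent's policy on its parents' chosen actions. A hypothetical three-agent sketch, not the paper's architecture:

```python
import numpy as np

rng = np.random.default_rng(2)

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

# Hypothetical 3-agent DAG: agent 0 has no parents, agent 1 conditions on
# agent 0's action, agent 2 conditions on agents 0 and 1.
parents = {0: [], 1: [0], 2: [0, 1]}
N_ACTIONS = 4

def sample_joint_action(obs, policies):
    """Sample a correlated joint action by visiting agents in topological
    order; each conditional policy sees its own observation plus the
    actions already chosen by its parents."""
    joint = {}
    for i in sorted(parents):  # 0, 1, 2 is a topological order here
        parent_acts = [joint[j] for j in parents[i]]
        logits = policies[i](obs[i], parent_acts)
        joint[i] = rng.choice(N_ACTIONS, p=softmax(logits))
    return joint

# Toy conditional policies: linear logits over the observation plus a bias
# per parent action, just to make the sketch executable.
Ws = [rng.normal(size=(N_ACTIONS, 5)) for _ in range(3)]
def make_policy(i):
    def policy(ob, parent_acts):
        logits = Ws[i] @ ob
        for a in parent_acts:
            logits[a] += 1.0  # parents' choices shift the conditional
        return logits
    return policy

obs = [rng.normal(size=5) for _ in range(3)]
print(sample_joint_action(obs, [make_policy(i) for i in range(3)]))
```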
- Coagent Networks: Generalized and Scaled [44.06183176712763]
Coagent networks for reinforcement learning (RL) provide a powerful and flexible framework for deriving principled learning rules.
This work generalizes the coagent theory and learning rules provided by previous works.
We show that a coagent algorithm with a policy network that does not use backpropagation can scale to a challenging RL domain.
arXiv Detail & Related papers (2023-05-16T22:41:56Z)
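
A minimal sketch of the backprop-free scaling idea from "Coagent Networks: Generalized and Scaled" above: treat each stochastic layer as a set of coagents and replace backpropagation through sampled activations with a per-layer REINFORCE update driven by the shared reward. The architecture and hyperparameters here are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(3)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

class BernoulliLayer:
    """A layer of stochastic Bernoulli coagents. Instead of backpropagating
    through the sample, each unit applies its own REINFORCE update with the
    shared reward, so no gradient crosses layer boundaries."""
    def __init__(self, n_in, n_out, lr=0.05):
        self.W = rng.normal(scale=0.1, size=(n_out, n_in))
        self.lr = lr

    def forward(self, x):
        p = sigmoid(self.W @ x)
        h = (rng.random(p.shape) < p).astype(float)
        self._cache = (x, p, h)
        return h

    def local_update(self, reward):
        x, p, h = self._cache
        # grad_W log P(h) for Bernoulli(p), p = sigmoid(Wx), is (h - p) x^T
        self.W += self.lr * reward * np.outer(h - p, x)

# Two stochastic layers trained purely by local updates (a sketch, not the
# paper's exact architecture or scale).
l1, l2 = BernoulliLayer(8, 16), BernoulliLayer(16, 1)

def train_step(x, reward_fn):
    out = l2.forward(l1.forward(x))
    r = reward_fn(out)  # e.g. lambda h: float(h[0])
    for layer in (l1, l2):
        layer.local_update(r)  # same shared reward, no backprop
    return r
```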
- Revisiting Some Common Practices in Cooperative Multi-Agent Reinforcement Learning [11.91425153754564]
We show that in environments with a highly multi-modal reward landscape, value decomposition and parameter sharing can be problematic and lead to undesired outcomes.
In contrast, policy gradient (PG) methods with individual policies provably converge to an optimal solution in these cases.
We present practical suggestions on implementing multi-agent PG algorithms for either high rewards or diverse emergent behaviors.
arXiv Detail & Related papers (2022-06-15T13:03:05Z)
- RACA: Relation-Aware Credit Assignment for Ad-Hoc Cooperation in Multi-Agent Deep Reinforcement Learning [55.55009081609396]
We propose a novel method, called Relation-Aware Credit Assignment (RACA), which achieves zero-shot generalization in ad-hoc cooperation scenarios.
RACA takes advantage of a graph-based relation encoder to encode the topological structure between agents.
Our method outperforms baseline methods on the StarCraftII micromanagement benchmark and ad-hoc cooperation scenarios.
arXiv Detail & Related papers (2022-06-02T03:39:27Z)
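
The "graph-based relation encoder" in RACA above can be pictured as message passing over the agent graph; below is a generic one-round sketch (the weights, feature sizes, and mean aggregation are our assumptions, not RACA's exact design).

```python
import numpy as np

rng = np.random.default_rng(4)

def relu(z):
    return np.maximum(z, 0.0)

def relation_encoder(agent_feats, adjacency, W_self, W_msg):
    """One round of message passing over the agent graph: each agent's
    embedding combines its own features with the mean of its neighbors'.
    A generic graph encoder sketch; RACA's architecture may differ."""
    deg = adjacency.sum(axis=1, keepdims=True).clip(min=1.0)
    neighbor_mean = (adjacency @ agent_feats) / deg
    return relu(agent_feats @ W_self.T + neighbor_mean @ W_msg.T)

n_agents, d_in, d_out = 5, 8, 16
feats = rng.normal(size=(n_agents, d_in))
adj = (rng.random((n_agents, n_agents)) < 0.4).astype(float)
np.fill_diagonal(adj, 0.0)
W_self = rng.normal(scale=0.1, size=(d_out, d_in))
W_msg = rng.normal(scale=0.1, size=(d_out, d_in))

emb = relation_encoder(feats, adj, W_self, W_msg)
print(emb.shape)  # (5, 16): one relation-aware embedding per agent
```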
- Graph Convolutional Reinforcement Learning for Collaborative Queuing Agents [6.3120870639037285]
We propose a novel graph-convolution-based multi-agent reinforcement learning approach known as DGN.
We show that our DGN-based approach meets stringent throughput and delay requirements across all scenarios.
arXiv Detail & Related papers (2022-05-24T11:53:20Z)
- Manifold Regularized Dynamic Network Pruning [102.24146031250034]
This paper proposes a new paradigm that dynamically removes redundant filters by embedding the manifold information of all instances into the space of pruned networks.
The effectiveness of the proposed method is verified on several benchmarks, which shows better performance in terms of both accuracy and computational cost.
arXiv Detail & Related papers (2021-03-10T03:59:03Z)
- Implicit Distributional Reinforcement Learning [61.166030238490634]
We propose the implicit distributional actor-critic (IDAC), built on two deep generator networks (DGNs) and a semi-implicit actor (SIA) powered by a flexible policy distribution.
We observe that IDAC outperforms state-of-the-art algorithms on representative OpenAI Gym environments.
arXiv Detail & Related papers (2020-07-13T02:52:18Z)
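
For the IDAC entry above, the core object is an implicit return distribution: a generator network maps (state-action, noise) to return samples, and every distributional statistic is estimated from those samples. Below is a toy linear stand-in for the deep generator networks; it illustrates the implicit-distribution idea only, not IDAC's training losses.

```python
import numpy as np

rng = np.random.default_rng(5)

d = 6
theta = rng.normal(scale=0.1, size=d + 1)

def return_samples(sa, n=256):
    """Implicit return distribution: push random noise through a function
    of (state-action, noise). A linear map here; IDAC uses deep generator
    networks in its place."""
    eps = rng.normal(size=(n, 1))
    inp = np.hstack([np.tile(sa, (n, 1)), eps])
    return inp @ theta

sa = rng.normal(size=d)
z = return_samples(sa)
# Any distributional statistic comes from samples: the mean for the actor's
# objective, quantiles for risk-sensitive variants.
print(z.mean(), np.quantile(z, [0.1, 0.5, 0.9]))
```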
- Multi-Agent Determinantal Q-Learning [39.79718674655209]
We propose multi-agent determinantal Q-learning (Q-DPP), which encourages agents to acquire diverse behavioral models.
We demonstrate that Q-DPP generalizes major solutions including VDN, QMIX, and QTRAN on decentralizable cooperative tasks.
arXiv Detail & Related papers (2020-06-02T09:32:48Z)
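
For Q-DPP above, diversity comes from a determinantal point process: a kernel L = diag(q) S diag(q) scores sets of behaviors by per-item quality q while penalizing similarity S, so determinants reward high-quality, mutually diverse selections. A generic greedy MAP sketch with a hypothetical kernel, not the paper's learned construction:

```python
import numpy as np

rng = np.random.default_rng(6)

def greedy_dpp_map(L, k):
    """Greedy MAP inference for a DPP with kernel L: repeatedly add the item
    that most increases det(L_S), favoring quality and mutual diversity."""
    selected, remaining = [], list(range(L.shape[0]))
    for _ in range(k):
        gains = [np.linalg.det(L[np.ix_(selected + [j], selected + [j])])
                 for j in remaining]
        best = remaining[int(np.argmax(gains))]
        selected.append(best)
        remaining.remove(best)
    return selected

# Hypothetical kernel: q could come from exponentiated Q-values, S from
# cosine similarity of behavior features.
n = 10
feats = rng.normal(size=(n, 4))
feats /= np.linalg.norm(feats, axis=1, keepdims=True)
S = feats @ feats.T
q = np.exp(rng.normal(size=n))
L = np.diag(q) @ S @ np.diag(q)
print(greedy_dpp_map(L, 3))
```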
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this list (including all information) and is not responsible for any consequences of its use.