Non-local Policy Optimization via Diversity-regularized Collaborative
Exploration
- URL: http://arxiv.org/abs/2006.07781v1
- Date: Sun, 14 Jun 2020 03:31:11 GMT
- Title: Non-local Policy Optimization via Diversity-regularized Collaborative
Exploration
- Authors: Zhenghao Peng, Hao Sun, Bolei Zhou
- Abstract summary: We propose a novel non-local policy optimization framework called Diversity-regularized Collaborative Exploration (DiCE).
DiCE utilizes a group of heterogeneous agents to explore the environment simultaneously and share the collected experiences.
We implement the framework in both on-policy and off-policy settings and the experimental results show that DiCE can achieve substantial improvement over the baselines.
- Score: 45.997521480637836
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Conventional Reinforcement Learning (RL) algorithms usually have one single
agent learning to solve the task independently. As a result, the agent can only
explore a limited part of the state-action space while the learned behavior is
highly correlated to the agent's previous experience, making the training prone
to a local minimum. In this work, we empower RL with the capability of teamwork
and propose a novel non-local policy optimization framework called
Diversity-regularized Collaborative Exploration (DiCE). DiCE utilizes a group
of heterogeneous agents to explore the environment simultaneously and share the
collected experiences. A regularization mechanism is further designed to
maintain the diversity of the team and modulate the exploration. We implement
the framework in both on-policy and off-policy settings and the experimental
results show that DiCE can achieve substantial improvement over the baselines
in the MuJoCo locomotion tasks.
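The two ideas named in the abstract, a team of agents pooling their experience and a regularizer that keeps the team diverse, can be illustrated with a toy sketch. Everything below is a hypothetical illustration and not the paper's implementation: the abstract does not specify the diversity measure, so the mean squared action difference between agent pairs is an assumption, as are the names `LinearAgent`, `collect_shared_buffer`, `team_diversity`, `regularized_objective`, and the coefficient `coeff`.

```python
import random
from itertools import combinations


class LinearAgent:
    """Toy deterministic policy a = w * s (hypothetical, for illustration only)."""

    def __init__(self, w):
        self.w = w

    def act(self, state):
        return self.w * state


def collect_shared_buffer(agents, states_per_agent, rng):
    """Each agent acts on its own sampled states; all transitions go into one buffer."""
    buffer = []
    for agent_id, agent in enumerate(agents):
        for _ in range(states_per_agent):
            s = rng.uniform(-1.0, 1.0)
            buffer.append((agent_id, s, agent.act(s)))
    return buffer


def team_diversity(agents, states):
    """Assumed diversity measure: mean squared action difference over all agent pairs."""
    pairs = list(combinations(agents, 2))
    total = 0.0
    for a, b in pairs:
        total += sum((a.act(s) - b.act(s)) ** 2 for s in states) / len(states)
    return total / len(pairs)


def regularized_objective(task_return, diversity, coeff=0.1):
    """Task return plus a weighted diversity bonus (coeff is an assumed hyperparameter)."""
    return task_return + coeff * diversity


rng = random.Random(0)
team = [LinearAgent(w) for w in (-1.0, 0.0, 1.0)]

# Every agent contributes to, and could learn from, the same pool of experience.
shared = collect_shared_buffer(team, states_per_agent=4, rng=rng)
shared_states = [s for _, s, _ in shared]

# Diversity is evaluated on the states the whole team has visited.
diversity = team_diversity(team, shared_states)
```

In this sketch a team whose policies have collapsed to the same behavior has zero diversity, so the bonus term vanishes and only the task return remains; how DiCE actually balances the two terms is described in the paper itself.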
Related papers
- From Novice to Expert: LLM Agent Policy Optimization via Step-wise Reinforcement Learning [62.54484062185869]
We introduce StepAgent, which utilizes step-wise reward to optimize the agent's reinforcement learning process.
We propose implicit-reward and inverse reinforcement learning techniques to facilitate agent reflection and policy adjustment.
arXiv Detail & Related papers (2024-11-06T10:35:11Z)
- CoPS: Empowering LLM Agents with Provable Cross-Task Experience Sharing [70.25689961697523]
We propose a generalizable algorithm that enhances sequential reasoning by cross-task experience sharing and selection.
Our work bridges the gap between existing sequential reasoning paradigms and validates the effectiveness of leveraging cross-task experiences.
arXiv Detail & Related papers (2024-10-22T03:59:53Z)
- Learning Reward Machines in Cooperative Multi-Agent Tasks [75.79805204646428]
This paper presents a novel approach to Multi-Agent Reinforcement Learning (MARL).
It combines cooperative task decomposition with the learning of reward machines (RMs) encoding the structure of the sub-tasks.
The proposed method helps deal with the non-Markovian nature of the rewards in partially observable environments.
arXiv Detail & Related papers (2023-03-24T15:12:28Z)
- Hierarchical Reinforcement Learning with Opponent Modeling for Distributed Multi-agent Cooperation [13.670618752160594]
Deep reinforcement learning (DRL) provides a promising approach for multi-agent cooperation through the interaction of the agents and environments.
Traditional DRL solutions suffer from the high dimensions of multiple agents with continuous action space during policy search.
We propose a hierarchical reinforcement learning approach with high-level decision-making and low-level individual control for efficient policy search.
arXiv Detail & Related papers (2022-06-25T19:09:29Z)
- Group-Agent Reinforcement Learning [12.915860504511523]
It can largely benefit the reinforcement learning process of each agent if multiple geographically distributed agents perform their separate RL tasks cooperatively.
We propose a distributed RL framework called DDAL (Decentralised Distributed Asynchronous Learning) designed for group-agent reinforcement learning (GARL).
arXiv Detail & Related papers (2022-02-10T16:40:59Z)
- Celebrating Diversity in Shared Multi-Agent Reinforcement Learning [20.901606233349177]
Deep multi-agent reinforcement learning has shown the promise to solve complex cooperative tasks.
In this paper, we aim to introduce diversity in both optimization and representation of shared multi-agent reinforcement learning.
Our method achieves state-of-the-art performance on Google Research Football and super hard StarCraft II micromanagement tasks.
arXiv Detail & Related papers (2021-06-04T00:55:03Z)
- Cooperative Heterogeneous Deep Reinforcement Learning [47.97582814287474]
We present a Cooperative Heterogeneous Deep Reinforcement Learning framework that can learn a policy by integrating the advantages of heterogeneous agents.
Global agents are off-policy agents that can utilize experiences from the other agents.
Local agents are either on-policy agents or population-based evolutionary (EAs) agents that can explore the local area effectively.
arXiv Detail & Related papers (2020-11-02T07:39:09Z)
- UneVEn: Universal Value Exploration for Multi-Agent Reinforcement Learning [53.73686229912562]
We propose a novel MARL approach called Universal Value Exploration (UneVEn).
UneVEn learns a set of related tasks simultaneously with a linear decomposition of universal successor features.
Empirical results on a set of exploration games, challenging cooperative predator-prey tasks requiring significant coordination among agents, and StarCraft II micromanagement benchmarks show that UneVEn can solve tasks where other state-of-the-art MARL methods fail.
arXiv Detail & Related papers (2020-10-06T19:08:47Z)
- Individual specialization in multi-task environments with multiagent reinforcement learners [0.0]
There is a growing interest in Multi-Agent Reinforcement Learning (MARL) as the first steps towards building general intelligent agents.
Previous results point us towards increased conditions for coordination, efficiency/fairness, and common-pool resource sharing.
We further study coordination in multi-task environments where several rewarding tasks can be performed and thus agents don't necessarily need to perform well in all tasks, but under certain conditions may specialize.
arXiv Detail & Related papers (2019-12-29T15:20:24Z)
This list is automatically generated from the titles and abstracts of the papers in this site.