Related papers: Non-local Policy Optimization via Diversity-regularized Collaborative Exploration

URL: http://arxiv.org/abs/2006.07781v1
Date: Sun, 14 Jun 2020 03:31:11 GMT
Title: Non-local Policy Optimization via Diversity-regularized Collaborative Exploration
Authors: Zhenghao Peng, Hao Sun, Bolei Zhou
Abstract summary: We propose a novel non-local policy optimization framework called Diversity-regularized Collaborative Exploration (DiCE). DiCE utilizes a group of heterogeneous agents to explore the environment simultaneously and share the collected experiences. We implement the framework in both on-policy and off-policy settings, and the experimental results show that DiCE can achieve substantial improvement over the baselines.
Score: 45.997521480637836
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Conventional Reinforcement Learning (RL) algorithms usually have one single agent learning to solve the task independently. As a result, the agent can only explore a limited part of the state-action space while the learned behavior is highly correlated to the agent's previous experience, making the training prone to a local minimum. In this work, we empower RL with the capability of teamwork and propose a novel non-local policy optimization framework called Diversity-regularized Collaborative Exploration (DiCE). DiCE utilizes a group of heterogeneous agents to explore the environment simultaneously and share the collected experiences. A regularization mechanism is further designed to maintain the diversity of the team and modulate the exploration. We implement the framework in both on-policy and off-policy settings and the experimental results show that DiCE can achieve substantial improvement over the baselines in the MuJoCo locomotion tasks.
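Illustration: the abstract names three ingredients: a team of agents, a shared pool of experience, and a regularizer that keeps the team diverse. The Python sketch below is only a toy rendering of those ideas under stated assumptions, not the authors' implementation. The names SharedBuffer, team_diversity, and regularized_objective are hypothetical, as are the scalar states and actions, the mean absolute action difference used as the diversity measure, and the additive weighting of the diversity bonus.

```python
import random
from typing import Callable, List, Sequence, Tuple

# Toy types (assumption): scalar state, scalar action, scalar reward.
Transition = Tuple[float, float, float]
Policy = Callable[[float], float]


class SharedBuffer:
    """Experience pool that every agent in the team writes to and samples from."""

    def __init__(self) -> None:
        self.data: List[Transition] = []

    def add(self, transition: Transition) -> None:
        self.data.append(transition)

    def sample(self, k: int, rng: random.Random) -> List[Transition]:
        # Draw without replacement; return fewer items if the pool is small.
        return rng.sample(self.data, min(k, len(self.data)))


def team_diversity(policies: Sequence[Policy], states: Sequence[float]) -> float:
    """Mean absolute action difference over all agent pairs and states.

    This measure is an assumption made for illustration only.
    """
    total, count = 0.0, 0
    for i in range(len(policies)):
        for j in range(i + 1, len(policies)):
            for s in states:
                total += abs(policies[i](s) - policies[j](s))
                count += 1
    return total / count if count else 0.0


def regularized_objective(task_return: float, diversity: float, weight: float) -> float:
    """Task return plus a weighted diversity bonus (hypothetical combination rule)."""
    return task_return + weight * diversity


# Usage: two toy agents that act in opposite directions on the same states.
agents: List[Policy] = [lambda s: 0.5 * s, lambda s: -0.5 * s]
buffer = SharedBuffer()
for agent in agents:
    for state in (1.0, 2.0):
        # Both agents write into the same pool; the reward is a placeholder.
        buffer.add((state, agent(state), 0.0))

shared_states = [t[0] for t in buffer.sample(4, random.Random(0))]
diversity = team_diversity(agents, [1.0, 2.0])
objective = regularized_objective(task_return=10.0, diversity=diversity, weight=0.1)
```

In this sketch every agent can sample transitions that its teammates collected, and the diversity term rewards the team for disagreeing on the same states. How DiCE actually measures diversity and folds it into the policy update is described in the paper itself, not here.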

Related papers

Trajectory First: A Curriculum for Discovering Diverse Policies [17.315583101484147]
Being able to solve a task in diverse ways makes agents more robust to task variations and less prone to local optima. Constrained diversity optimization has emerged as a powerful reinforcement learning framework to train a diverse set of agents in parallel. We propose a curriculum that first explores at the trajectory level before learning step-based policies.
arXiv Detail & Related papers (2025-06-02T11:47:51Z)
CCL: Collaborative Curriculum Learning for Sparse-Reward Multi-Agent Reinforcement Learning via Co-evolutionary Task Evolution [4.0873807995771]
Sparse reward environments pose significant challenges in reinforcement learning, especially within multi-agent systems. We propose Collaborative Multi-dimensional Course Learning (CCL), a novel curriculum learning framework that addresses this by (1) refining intermediate tasks for individual agents, (2) using a variational evolutionary algorithm to generate informative subtasks, and (3) co-evolving agents with their environment to enhance training stability.
arXiv Detail & Related papers (2025-05-08T04:23:47Z)
On the Linear Speedup of Personalized Federated Reinforcement Learning with Shared Representations [15.549340968605234]
Federated reinforcement learning (FedRL) enables multiple agents to collaboratively learn a policy without sharing their local trajectories collected during agent-environment interactions. We introduce a personalized FedRL framework (PFedRL) by taking advantage of possibly shared common structure among agents in heterogeneous environments.
arXiv Detail & Related papers (2024-11-22T15:42:43Z)
From Novice to Expert: LLM Agent Policy Optimization via Step-wise Reinforcement Learning [62.54484062185869]
We introduce StepAgent, which utilizes step-wise reward to optimize the agent's reinforcement learning process. We propose implicit-reward and inverse reinforcement learning techniques to facilitate agent reflection and policy adjustment.
arXiv Detail & Related papers (2024-11-06T10:35:11Z)
CoPS: Empowering LLM Agents with Provable Cross-Task Experience Sharing [70.25689961697523]
We propose a generalizable algorithm that enhances sequential reasoning by cross-task experience sharing and selection. Our work bridges the gap between existing sequential reasoning paradigms and validates the effectiveness of leveraging cross-task experiences.
arXiv Detail & Related papers (2024-10-22T03:59:53Z)
Learning Reward Machines in Cooperative Multi-Agent Tasks [75.79805204646428]
This paper presents a novel approach to Multi-Agent Reinforcement Learning (MARL). It combines cooperative task decomposition with the learning of reward machines (RMs) encoding the structure of the sub-tasks. The proposed method helps deal with the non-Markovian nature of the rewards in partially observable environments.
arXiv Detail & Related papers (2023-03-24T15:12:28Z)
Hierarchical Reinforcement Learning with Opponent Modeling for Distributed Multi-agent Cooperation [13.670618752160594]
Deep reinforcement learning (DRL) provides a promising approach for multi-agent cooperation through the interaction of the agents and environments. Traditional DRL solutions suffer from the high dimensions of multiple agents with continuous action space during policy search. We propose a hierarchical reinforcement learning approach with high-level decision-making and low-level individual control for efficient policy search.
arXiv Detail & Related papers (2022-06-25T19:09:29Z)
Celebrating Diversity in Shared Multi-Agent Reinforcement Learning [20.901606233349177]
Deep multi-agent reinforcement learning has shown the promise to solve complex cooperative tasks. In this paper, we aim to introduce diversity in both optimization and representation of shared multi-agent reinforcement learning. Our method achieves state-of-the-art performance on Google Research Football and super hard StarCraft II micromanagement tasks.
arXiv Detail & Related papers (2021-06-04T00:55:03Z)
Cooperative Heterogeneous Deep Reinforcement Learning [47.97582814287474]
We present a Cooperative Heterogeneous Deep Reinforcement Learning framework that can learn a policy by integrating the advantages of heterogeneous agents. Global agents are off-policy agents that can utilize experiences from the other agents. Local agents are either on-policy agents or population-based evolutionary (EAs) agents that can explore the local area effectively.
arXiv Detail & Related papers (2020-11-02T07:39:09Z)
UneVEn: Universal Value Exploration for Multi-Agent Reinforcement Learning [53.73686229912562]
We propose a novel MARL approach called Universal Value Exploration (UneVEn). UneVEn learns a set of related tasks simultaneously with a linear decomposition of universal successor features. Empirical results on a set of exploration games, challenging cooperative predator-prey tasks requiring significant coordination among agents, and StarCraft II micromanagement benchmarks show that UneVEn can solve tasks where other state-of-the-art MARL methods fail.
arXiv Detail & Related papers (2020-10-06T19:08:47Z)
Individual specialization in multi-task environments with multiagent reinforcement learners [0.0]
There is a growing interest in Multi-Agent Reinforcement Learning (MARL) as the first steps towards building general intelligent agents. Previous results point us towards increased conditions for coordination, efficiency/fairness, and common-pool resource sharing. We further study coordination in multi-task environments where several rewarding tasks can be performed and thus agents don't necessarily need to perform well in all tasks, but under certain conditions may specialize.
arXiv Detail & Related papers (2019-12-29T15:20:24Z)

This list is automatically generated from the titles and abstracts of the papers on this site.