Choices are More Important than Efforts: LLM Enables Efficient Multi-Agent Exploration
- URL: http://arxiv.org/abs/2410.02511v1
- Date: Thu, 3 Oct 2024 14:21:23 GMT
- Title: Choices are More Important than Efforts: LLM Enables Efficient Multi-Agent Exploration
- Authors: Yun Qu, Boyuan Wang, Yuhang Jiang, Jianzhun Shao, Yixiu Mao, Cheems Wang, Chang Liu, Xiangyang Ji
- Abstract summary: This paper introduces LEMAE, choosing to channel informative task-relevant guidance from a knowledgeable Large Language Model (LLM) for Efficient Multi-Agent Exploration.
Specifically, we ground linguistic knowledge from the LLM into symbolic key states that are critical for task fulfillment, in a discriminative manner and at low inference cost.
By reducing redundant exploration, LEMAE outperforms existing SOTA approaches on challenging benchmarks (e.g., SMAC and MPE) by a large margin, achieving a 10x acceleration in certain scenarios.
- Score: 46.938186139700804
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: With expansive state-action spaces, efficient multi-agent exploration remains a longstanding challenge in reinforcement learning. Although pursuing novelty, diversity, or uncertainty attracts increasing attention, the redundant effort caused by exploration without proper guidance remains a practical issue for the community. This paper introduces a systematic approach, termed LEMAE, that channels informative task-relevant guidance from a knowledgeable Large Language Model (LLM) into Efficient Multi-Agent Exploration. Specifically, we ground linguistic knowledge from the LLM into symbolic key states that are critical for task fulfillment, in a discriminative manner and at low LLM inference cost. To unleash the power of key states, we design a Subspace-based Hindsight Intrinsic Reward (SHIR) that guides agents toward key states by increasing reward density. Additionally, we build a Key State Memory Tree (KSMT) that tracks transitions between key states within a specific task for organized exploration. By reducing redundant exploration, LEMAE outperforms existing SOTA approaches on challenging benchmarks (e.g., SMAC and MPE) by a large margin, achieving a 10x acceleration in certain scenarios.
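To make the two mechanisms named in the abstract more concrete, here is a minimal Python sketch (not the authors' implementation): the key-state predicates, the bonus form, and all class and function names are illustrative assumptions standing in for the LLM-grounded key states, the hindsight-style intrinsic reward (SHIR), and the key state memory tree (KSMT).

```python
# Minimal sketch, not the authors' code: a hindsight-style intrinsic reward that
# densifies reward around LLM-suggested key states, plus a memory tree recording
# which key states have been reached from which. All names are assumptions.
from collections import defaultdict

# Hypothetical key states an LLM might ground from a task description,
# expressed here as simple predicates over an environment state dict.
KEY_STATES = {
    "reached_door": lambda s: s.get("at_door", False),
    "picked_key":   lambda s: s.get("has_key", False),
    "opened_chest": lambda s: s.get("chest_open", False),
}

class KeyStateMemoryTree:
    """Toy stand-in for KSMT: stores observed key-state -> key-state transitions."""
    def __init__(self):
        self.children = defaultdict(set)

    def add_transition(self, parent, child):
        self.children[parent].add(child)

    def unexplored(self, parent, all_keys):
        # Key states not yet reached from `parent`; candidates for further exploration.
        return set(all_keys) - self.children[parent]

def hindsight_bonus(trajectory, reached_before, tree, bonus=1.0):
    """After an episode, add an intrinsic bonus at the first step where a
    previously unreached key state is satisfied, densifying the sparse reward."""
    intrinsic = [0.0] * len(trajectory)
    last_key = "start"
    for t, state in enumerate(trajectory):
        for name, predicate in KEY_STATES.items():
            if name not in reached_before and predicate(state):
                intrinsic[t] += bonus
                tree.add_transition(last_key, name)
                reached_before.add(name)
                last_key = name
    return intrinsic

# Usage: relabel one episode, then combine the bonus with the environment reward.
tree = KeyStateMemoryTree()
episode = [{"at_door": False}, {"at_door": True}, {"at_door": True, "has_key": True}]
print(hindsight_bonus(episode, set(), tree))  # [0.0, 1.0, 1.0]
print(dict(tree.children))                    # {'start': {'reached_door'}, 'reached_door': {'picked_key'}}
```

The sketch only illustrates the general idea of reward densification and transition tracking; the paper's actual subspace construction and LLM grounding procedure are not reproduced here.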
Related papers
- LLM-Empowered State Representation for Reinforcement Learning [64.3351150030341]
State representations in reinforcement learning often omit critical task-related details.
We propose LLM-Empowered State Representation (LESR), a novel approach that utilizes LLM to autonomously generate task-related state representation codes.
LESR exhibits high sample efficiency and outperforms state-of-the-art baselines by an average of 29% in accumulated reward in Mujoco tasks and 30% in success rates in Gym-Robotics tasks.
arXiv Detail & Related papers (2024-07-18T07:47:51Z)
- WESE: Weak Exploration to Strong Exploitation for LLM Agents [95.6720931773781]
This paper proposes a novel approach, Weak Exploration to Strong Exploitation (WESE) to enhance LLM agents in solving open-world interactive tasks.
WESE decouples the exploration and exploitation processes, employing a cost-effective weak agent to perform exploration and gather global knowledge.
A knowledge graph-based strategy is then introduced to store the acquired knowledge and extract task-relevant knowledge, improving the stronger agent's success rate and efficiency on the exploitation task.
arXiv Detail & Related papers (2024-04-11T03:31:54Z)
- Sample Efficient Myopic Exploration Through Multitask Reinforcement Learning with Diverse Tasks [53.44714413181162]
This paper shows that when an agent is trained on a sufficiently diverse set of tasks, a generic policy-sharing algorithm with myopic exploration design can be sample-efficient.
To the best of our knowledge, this is the first theoretical demonstration of the "exploration benefits" of multitask reinforcement learning (MTRL).
arXiv Detail & Related papers (2024-03-03T22:57:44Z)
- Imagine, Initialize, and Explore: An Effective Exploration Method in Multi-Agent Reinforcement Learning [27.81925751697255]
We propose a novel method for efficient multi-agent exploration in complex scenarios.
We formulate the imagination as a sequence modeling problem, where the states, observations, prompts, actions, and rewards are predicted autoregressively.
By initializing agents at the critical states, IIE significantly increases the likelihood of discovering potentially important underexplored regions.
arXiv Detail & Related papers (2024-02-28T01:45:01Z)
- FoX: Formation-aware exploration in multi-agent reinforcement learning [10.554220876480297]
We propose a formation-based equivalence relation on the exploration space and aim to reduce the search space by exploring only meaningful states in different formations.
Numerical results show that the proposed FoX framework significantly outperforms state-of-the-art MARL algorithms on Google Research Football (GRF) and sparse StarCraft II Multi-Agent Challenge (SMAC) tasks.
arXiv Detail & Related papers (2023-08-22T08:39:44Z)
- Semantically Aligned Task Decomposition in Multi-Agent Reinforcement Learning [56.26889258704261]
We propose a novel "disentangled" decision-making method, Semantically Aligned task decomposition in MARL (SAMA).
SAMA prompts pretrained language models with chain-of-thought to suggest potential goals, provide suitable goal decomposition and subgoal allocation, and perform self-reflection-based replanning.
SAMA demonstrates considerable advantages in sample efficiency compared to state-of-the-art ASG methods.
arXiv Detail & Related papers (2023-05-18T10:37:54Z)
- Exploration in Deep Reinforcement Learning: A Comprehensive Survey [24.252352133705735]
Deep Reinforcement Learning (DRL) and Deep Multi-agent Reinforcement Learning (MARL) have achieved significant success across a wide range of domains, such as game AI, autonomous vehicles, robotics and finance.
DRL and deep MARL agents are widely known to be sample-inefficient, and millions of interactions are usually needed even for relatively simple game settings.
This paper provides a comprehensive survey on existing exploration methods in DRL and deep MARL.
arXiv Detail & Related papers (2021-09-14T13:16:33Z)
- Decentralized Multi-Agent Reinforcement Learning for Task Offloading Under Uncertainty [24.083871784808473]
Multi-Agent Reinforcement Learning (MARL) is a challenging subarea of Reinforcement Learning.
Deep MARL algorithms have been applied to solve different task offloading problems.
We show that perturbations in the reward signal can induce a decrease in performance compared to learning with perfect rewards.
arXiv Detail & Related papers (2021-07-16T20:49:30Z)
- MetaCURE: Meta Reinforcement Learning with Empowerment-Driven Exploration [52.48362697163477]
We model an exploration policy learning problem for meta-RL, which is separated from exploitation policy learning.
We develop a new off-policy meta-RL framework, which efficiently learns separate context-aware exploration and exploitation policies.
Experimental evaluation shows that our meta-RL method significantly outperforms state-of-the-art baselines on sparse-reward tasks.
arXiv Detail & Related papers (2020-06-15T06:56:18Z)