A New Approach to Solving SMAC Task: Generating Decision Tree Code from Large Language Models
- URL: http://arxiv.org/abs/2410.16024v1
- Date: Mon, 21 Oct 2024 13:58:38 GMT
- Title: A New Approach to Solving SMAC Task: Generating Decision Tree Code from Large Language Models
- Authors: Yue Deng, Weiyu Ma, Yuxin Fan, Yin Zhang, Haifeng Zhang, Jian Zhao
- Abstract summary: StarCraft Multi-Agent Challenge (SMAC) is one of the most commonly used experimental environments in multi-agent reinforcement learning (MARL).
Traditional MARL algorithms often require interacting with the environment for up to 1 million steps to train a model.
In this paper, we propose a novel approach to solving SMAC tasks called LLM-SMAC.
- Score: 8.457552813123597
- Abstract: StarCraft Multi-Agent Challenge (SMAC) is one of the most commonly used experimental environments in multi-agent reinforcement learning (MARL), where the specific task is to control a set number of allied units to defeat enemy forces. Traditional MARL algorithms often require interacting with the environment for up to 1 million steps to train a model, and the resulting policies are typically non-interpretable and transfer poorly. In this paper, we propose a novel approach to solving SMAC tasks called LLM-SMAC. In our framework, agents leverage large language models (LLMs) to generate decision tree code from task descriptions. The model is further refined through self-reflection, using feedback from the rewards provided by the environment. We conduct experiments in SMAC and demonstrate that our method can produce high-quality, interpretable decision trees with minimal environmental exploration. Moreover, these models exhibit strong transferability, successfully applying to similar SMAC environments without modification. We believe this approach offers a new direction for solving decision-making tasks in the future.
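The abstract describes an iterative loop: the LLM is prompted with a task description, emits decision-tree code, the code is rolled out in SMAC, and the episode return is fed back for self-reflection. The sketch below shows that loop under stated assumptions: `query_llm` is a hypothetical stand-in for whatever model API the authors use, and the prompt text and `exec`-based loading are illustrative rather than the paper's implementation; only the environment calls follow the public SMAC Python API (`smac.env.StarCraft2Env`).

```python
from smac.env import StarCraft2Env


def query_llm(prompt: str) -> str:
    """Hypothetical LLM call; replace with a real model API."""
    raise NotImplementedError


def load_policy(code: str):
    """Compile LLM-emitted decision-tree code into a callable policy."""
    namespace = {}
    exec(code, namespace)               # generated code must define choose_actions
    return namespace["choose_actions"]  # (obs_list, avail_actions) -> actions


def llm_smac(task_description: str, iterations: int = 5, map_name: str = "3m"):
    env = StarCraft2Env(map_name=map_name)
    feedback = ""
    best_code, best_return = None, float("-inf")
    for _ in range(iterations):
        # 1. Ask the LLM for decision-tree code, given the task and past feedback.
        code = query_llm(
            f"Task: {task_description}\nFeedback: {feedback}\n"
            "Write a Python function choose_actions(obs_list, avail_actions)."
        )
        policy = load_policy(code)
        # 2. Roll out the generated policy for one episode.
        env.reset()
        episode_return, terminated = 0.0, False
        while not terminated:
            obs = env.get_obs()
            avail = [env.get_avail_agent_actions(i) for i in range(env.n_agents)]
            actions = policy(obs, avail)
            reward, terminated, _ = env.step(actions)
            episode_return += reward
        # 3. Turn the return into feedback for the next round of self-reflection.
        feedback = f"Your last tree earned a return of {episode_return:.1f}. Improve it."
        if episode_return > best_return:
            best_code, best_return = code, episode_return
    env.close()
    return best_code
```

Because the resulting policy is plain code rather than a trained network, it can be read and audited directly, and the same tree can be dropped into similar SMAC maps unchanged, which is the transferability claim above.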
Related papers
- MALMM: Multi-Agent Large Language Models for Zero-Shot Robotics Manipulation [52.739500459903724]
Large Language Models (LLMs) have demonstrated remarkable planning abilities across various domains, including robotics manipulation and navigation.
We propose a novel multi-agent LLM framework that distributes high-level planning and low-level control code generation across specialized LLM agents.
We evaluate our approach on nine RLBench tasks, including long-horizon tasks, and demonstrate its ability to solve robotics manipulation in a zero-shot setting.
arXiv Detail & Related papers (2024-11-26T17:53:44Z)
- EnvBridge: Bridging Diverse Environments with Cross-Environment Knowledge Transfer for Embodied AI [7.040779338576156]
Large Language Models (LLMs) can generate text planning or control code for robots.
These methods still face challenges in terms of flexibility and applicability across different environments.
We propose EnvBridge to enhance the adaptability and robustness of robotic manipulation agents.
arXiv Detail & Related papers (2024-10-22T11:52:22Z)
- Reference Trustable Decoding: A Training-Free Augmentation Paradigm for Large Language Models [79.41139393080736]
Large language models (LLMs) have rapidly advanced and demonstrated impressive capabilities.
In-Context Learning (ICL) and Parameter-Efficient Fine-Tuning (PEFT) are currently two mainstream methods for adapting LLMs to downstream tasks.
We propose Reference Trustable Decoding (RTD), a paradigm that allows models to quickly adapt to new tasks without fine-tuning.
arXiv Detail & Related papers (2024-09-30T10:48:20Z)
- Efficient Adaptation in Mixed-Motive Environments via Hierarchical Opponent Modeling and Planning [51.52387511006586]
We propose Hierarchical Opponent modeling and Planning (HOP), a novel multi-agent decision-making algorithm.
HOP is hierarchically composed of two modules: an opponent modeling module that infers others' goals and learns corresponding goal-conditioned policies, and a planning module that plans a best response against the inferred goals (a toy sketch of the goal-inference step appears below).
HOP exhibits superior few-shot adaptation capabilities when interacting with various unseen agents, and excels in self-play scenarios.
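A toy illustration of the opponent-modeling half: keep a Bayesian belief over a discrete set of candidate goals and update it from observed opponent actions. The goal-conditioned action likelihoods below are hand-made stand-ins for HOP's learned goal-conditioned policies, so this shows the general pattern rather than HOP's actual implementation.

```python
import numpy as np


def update_goal_belief(belief: np.ndarray, observed_action: int,
                       likelihood: np.ndarray) -> np.ndarray:
    """One Bayesian step: P(goal | action) ∝ P(action | goal) · P(goal).

    belief:     shape (n_goals,), prior over candidate goals
    likelihood: shape (n_goals, n_actions), stand-in for learned
                goal-conditioned policies
    """
    posterior = belief * likelihood[:, observed_action]
    return posterior / posterior.sum()


# Two candidate goals, three actions; goal 1 strongly prefers action 2.
likelihood = np.array([[0.6, 0.3, 0.1],
                       [0.1, 0.2, 0.7]])
belief = np.array([0.5, 0.5])
for a in [2, 2, 1]:                       # observed opponent actions
    belief = update_goal_belief(belief, a, likelihood)
print(belief)                             # belief concentrates on goal 1
```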
arXiv Detail & Related papers (2024-06-12T08:48:06Z)
- Efficient Multi-agent Reinforcement Learning by Planning [33.51282615335009]
Multi-agent reinforcement learning (MARL) algorithms have accomplished remarkable breakthroughs in solving large-scale decision-making tasks.
Most existing MARL algorithms are model-free, limiting sample efficiency and hindering their applicability in more challenging scenarios.
We propose the MAZero algorithm, which combines a centralized model with Monte Carlo Tree Search (MCTS) for policy search.
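A compact, generic UCT-style sketch of policy search with a learned model, in the spirit of the description above; `model.step(state, action)` returning `(next_state, reward)` and `actions(state)` listing the available joint actions are hypothetical stand-ins, not MAZero's interface.

```python
import math
import random


class Node:
    """Search-tree node with a running visit count and mean value."""
    def __init__(self, state):
        self.state = state
        self.children = {}  # action -> (child Node, model reward)
        self.visits = 0
        self.value = 0.0


def uct_search(root_state, model, actions, n_sims=200, c=1.4,
               max_depth=10, gamma=0.99):
    """Simulate with the learned model, then return the most-visited
    root action (assumes the root state has at least one action)."""
    root = Node(root_state)
    for _ in range(n_sims):
        node, path = root, []
        for _ in range(max_depth):
            acts = actions(node.state)
            if not acts:                  # terminal under the model
                break
            untried = [a for a in acts if a not in node.children]
            if untried:                   # expansion: try a new action
                a = random.choice(untried)
                next_state, r = model.step(node.state, a)
                node.children[a] = (Node(next_state), r)
                path.append((node, r))
                node = node.children[a][0]
                break
            # selection: UCB over existing children
            a = max(node.children, key=lambda a: (
                node.children[a][0].value
                + c * math.sqrt(math.log(node.visits + 1)
                                / (node.children[a][0].visits + 1))))
            child, r = node.children[a]
            path.append((node, r))
            node = child
        # backup: propagate the discounted return along the visited path
        node.visits += 1
        g = node.value
        for parent, r in reversed(path):
            g = r + gamma * g
            parent.visits += 1
            parent.value += (g - parent.value) / parent.visits
    return max(root.children, key=lambda a: root.children[a][0].visits)
```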
arXiv Detail & Related papers (2024-05-20T04:36:02Z)
- Breaking Down the Task: A Unit-Grained Hybrid Training Framework for Vision and Language Decision Making [19.87916700767421]
Vision language decision making (VLDM) is a challenging multimodal task.
From an environment perspective, we find that task episodes can be divided into fine-grained units.
We propose a novel hybrid-training framework that enables active exploration in the environment and reduces the exposure bias.
arXiv Detail & Related papers (2023-07-16T11:54:16Z)
- Efficient Distributed Framework for Collaborative Multi-Agent Reinforcement Learning [17.57163419315147]
Multi-agent reinforcement learning for incomplete information environments has attracted extensive attention from researchers.
There are still some problems in multi-agent reinforcement learning, such as unstable model iteration and low training efficiency.
In this paper, we design a distributed MARL framework based on the actor-work-learner architecture.
arXiv Detail & Related papers (2022-05-11T03:12:49Z)
- Multitask Adaptation by Retrospective Exploration with Learned World Models [77.34726150561087]
We propose a meta-learned addressing model called RAMa that provides training samples for the MBRL agent taken from task-agnostic storage.
The model is trained to maximize the expected agent's performance by selecting promising trajectories solving prior tasks from the storage.
arXiv Detail & Related papers (2021-10-25T20:02:57Z)
- Divergence-Regularized Multi-Agent Actor-Critic [17.995905582226467]
We propose a novel off-policy cooperative MARL framework, divergence-regularized multi-agent actor-critic (DMAC).
DMAC is a flexible framework and can be combined with many existing MARL algorithms.
We empirically show that DMAC substantially improves the performance of existing MARL algorithms.
arXiv Detail & Related papers (2021-10-01T10:27:42Z)
- Conditional Generative Modeling via Learning the Latent Space [54.620761775441046]
We propose a novel framework for conditional generation in multimodal spaces.
It uses latent variables to model generalizable learning patterns.
At inference, the latent variables are optimized to find optimal solutions corresponding to multiple output modes.
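The inference-time procedure lends itself to a short sketch: freeze the trained decoder and run gradient descent on several randomly initialized latent vectors, one per hoped-for output mode. `decoder` and `loss_fn` below are hypothetical placeholders, not the paper's components.

```python
import torch


def optimize_latents(decoder, condition, loss_fn,
                     n_modes=4, latent_dim=16, steps=100, lr=0.1):
    """Each row of z is an independent candidate, so distinct random
    starts can settle into distinct output modes for one condition."""
    z = torch.randn(n_modes, latent_dim, requires_grad=True)
    opt = torch.optim.Adam([z], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        # per-latent losses summed: gradients stay independent per row
        loss_fn(decoder(z), condition).sum().backward()
        opt.step()
    return decoder(z).detach()  # one candidate output per optimized latent
```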
arXiv Detail & Related papers (2020-10-07T03:11:34Z)
- FACMAC: Factored Multi-Agent Centralised Policy Gradients [103.30380537282517]
We propose FACtored Multi-Agent Centralised policy gradients (FACMAC).
It is a new method for cooperative multi-agent reinforcement learning in both discrete and continuous action spaces.
We evaluate FACMAC on variants of the multi-agent particle environments, a novel multi-agent MuJoCo benchmark, and a challenging set of StarCraft II micromanagement tasks.
arXiv Detail & Related papers (2020-03-14T21:29:09Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences.