Application of LLM Guided Reinforcement Learning in Formation Control with Collision Avoidance
- URL: http://arxiv.org/abs/2507.16382v1
- Date: Tue, 22 Jul 2025 09:26:00 GMT
- Title: Application of LLM Guided Reinforcement Learning in Formation Control with Collision Avoidance
- Authors: Chenhao Yao, Zike Yuan, Xiaoxu Liu, Chi Zhu,
- Abstract summary: Multi-Agent Systems (MAS) excel at accomplishing complex objectives through the collaborative efforts of individual agents.<n>In this paper, we introduce a novel framework that aims to overcome the challenge of designing an effective reward function.<n>By giving large language models (LLMs) on the prioritization of tasks, our framework generates reward functions that can be dynamically adjusted online.
- Score: 1.1718316049475228
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Multi-Agent Systems (MAS) excel at accomplishing complex objectives through the collaborative efforts of individual agents. Among the methodologies employed in MAS, Multi-Agent Reinforcement Learning (MARL) stands out as one of the most efficacious algorithms. However, when confronted with the complex objective of Formation Control with Collision Avoidance (FCCA): designing an effective reward function that facilitates swift convergence of the policy network to an optimal solution. In this paper, we introduce a novel framework that aims to overcome this challenge. By giving large language models (LLMs) on the prioritization of tasks and the observable information available to each agent, our framework generates reward functions that can be dynamically adjusted online based on evaluation outcomes by employing more advanced evaluation metrics rather than the rewards themselves. This mechanism enables the MAS to simultaneously achieve formation control and obstacle avoidance in dynamic environments with enhanced efficiency, requiring fewer iterations to reach superior performance levels. Our empirical studies, conducted in both simulation and real-world settings, validate the practicality and effectiveness of our proposed approach.
Related papers
- Curriculum Learning With Counterfactual Group Relative Policy Advantage For Multi-Agent Reinforcement Learning [15.539607264374242]
Multi-agent reinforcement learning (MARL) has achieved strong performance in cooperative adversarial tasks.<n>We propose a dynamic curriculum learning framework that employs an self-adaptive difficulty adjustment mechanism.<n>Our method improves both training stability and final performance, achieving competitive results against state-of-the-art methods.
arXiv Detail & Related papers (2025-06-09T08:38:18Z) - MLE-Dojo: Interactive Environments for Empowering LLM Agents in Machine Learning Engineering [57.156093929365255]
Gym-style framework for systematically reinforcement learning, evaluating, and improving autonomous large language model (LLM) agents.<n>MLE-Dojo covers diverse, open-ended MLE tasks carefully curated to reflect realistic engineering scenarios.<n>Its fully executable environment supports comprehensive agent training via both supervised fine-tuning and reinforcement learning.
arXiv Detail & Related papers (2025-05-12T17:35:43Z) - Attention-Augmented Inverse Reinforcement Learning with Graph Convolutions for Multi-Agent Task Allocation [0.29998889086656577]
Multi-agent task allocation (MATA) plays a vital role in cooperative multi-agent systems.<n>Inverse reinforcement learning (IRL)-based framework is proposed to enhance reward function learning and task execution efficiency.<n>Experiments validate the superiority of the proposed method over widely used multi-agent reinforcement learning (MARL) algorithms.
arXiv Detail & Related papers (2025-04-07T13:14:45Z) - Collab: Controlled Decoding using Mixture of Agents for LLM Alignment [90.6117569025754]
Reinforcement learning from human feedback has emerged as an effective technique to align Large Language models.<n>Controlled Decoding provides a mechanism for aligning a model at inference time without retraining.<n>We propose a mixture of agent-based decoding strategies leveraging the existing off-the-shelf aligned LLM policies.
arXiv Detail & Related papers (2025-03-27T17:34:25Z) - Leveraging Large Language Models for Effective and Explainable Multi-Agent Credit Assignment [4.406086834602686]
We show how to reformulate credit assignment to the two pattern recognition problems of sequence improvement and attribution.<n>Our approach utilizes a centralized reward-critic which numerically decomposes the environment reward based on the individual contribution of each agent.<n>Both our methods far outperform the state-of-the-art on a variety of benchmarks, including Level-Based Foraging, Robotic Warehouse, and our new Spaceworld benchmark which incorporates collision-related safety constraints.
arXiv Detail & Related papers (2025-02-24T05:56:47Z) - From Novice to Expert: LLM Agent Policy Optimization via Step-wise Reinforcement Learning [62.54484062185869]
We introduce StepAgent, which utilizes step-wise reward to optimize the agent's reinforcement learning process.<n>We propose implicit-reward and inverse reinforcement learning techniques to facilitate agent reflection and policy adjustment.
arXiv Detail & Related papers (2024-11-06T10:35:11Z) - Reward-free World Models for Online Imitation Learning [25.304836126280424]
We propose a novel approach to online imitation learning that leverages reward-free world models.<n>Our method learns environmental dynamics entirely in latent spaces without reconstruction, enabling efficient and accurate modeling.<n>We evaluate our method on a diverse set of benchmarks, including DMControl, MyoSuite, and ManiSkill2, demonstrating superior empirical performance compared to existing approaches.
arXiv Detail & Related papers (2024-10-17T23:13:32Z) - Large Language Model-based Human-Agent Collaboration for Complex Task
Solving [94.3914058341565]
We introduce the problem of Large Language Models (LLMs)-based human-agent collaboration for complex task-solving.
We propose a Reinforcement Learning-based Human-Agent Collaboration method, ReHAC.
This approach includes a policy model designed to determine the most opportune stages for human intervention within the task-solving process.
arXiv Detail & Related papers (2024-02-20T11:03:36Z) - Discovering Individual Rewards in Collective Behavior through Inverse
Multi-Agent Reinforcement Learning [3.4437947384641032]
We introduce an off-policy inverse multi-agent reinforcement learning algorithm (IMARL)
By leveraging demonstrations, our algorithm automatically uncovers the reward function and learns an effective policy for the agents.
The proposed IMARL algorithm is a significant step towards understanding collective dynamics from the perspective of its constituents.
arXiv Detail & Related papers (2023-05-17T20:07:30Z) - Efficient Model-Based Multi-Agent Mean-Field Reinforcement Learning [89.31889875864599]
We propose an efficient model-based reinforcement learning algorithm for learning in multi-agent systems.
Our main theoretical contributions are the first general regret bounds for model-based reinforcement learning for MFC.
We provide a practical parametrization of the core optimization problem.
arXiv Detail & Related papers (2021-07-08T18:01:02Z) - Softmax with Regularization: Better Value Estimation in Multi-Agent
Reinforcement Learning [72.28520951105207]
Overestimation in $Q$-learning is an important problem that has been extensively studied in single-agent reinforcement learning.
We propose a novel regularization-based update scheme that penalizes large joint action-values deviating from a baseline.
We show that our method provides a consistent performance improvement on a set of challenging StarCraft II micromanagement tasks.
arXiv Detail & Related papers (2021-03-22T14:18:39Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.