Learning Team-Based Navigation: A Review of Deep Reinforcement Learning
Techniques for Multi-Agent Pathfinding
- URL: http://arxiv.org/abs/2308.05893v2
- Date: Thu, 8 Feb 2024 18:31:14 GMT
- Title: Learning Team-Based Navigation: A Review of Deep Reinforcement Learning
Techniques for Multi-Agent Pathfinding
- Authors: Jaehoon Chung, Jamil Fayyad, Younes Al Younes, and Homayoun Najjaran
- Abstract summary: This review paper focuses on highlighting the integration of DRL-based approaches in MAPF.
We aim to bridge the current gap in evaluating MAPF solutions by addressing the lack of unified evaluation metrics.
Our paper discusses the potential of model-based DRL as a promising future direction and provides the foundational understanding required for it.
- Score: 2.7898966850590625
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Multi-agent pathfinding (MAPF) is a critical problem in many large-scale
robotic applications, often serving as the fundamental step in multi-agent systems.
However, the growing complexity of MAPF in crowded environments critically
diminishes the effectiveness of existing solutions. In contrast to other studies
that have either presented a general overview of recent advancements in MAPF or
reviewed Deep Reinforcement Learning (DRL) in multi-agent settings on its own,
this review paper focuses on the integration of DRL-based approaches into MAPF.
Moreover, we aim to bridge the current gap in evaluating MAPF solutions by
addressing the lack of unified evaluation metrics and providing a comprehensive
clarification of these metrics. Finally, we discuss the potential of model-based
DRL as a promising future direction and provide the foundational understanding
required to address current challenges in MAPF. Our objective is to help readers
gain insight into the current research direction, provide unified metrics for
comparing different MAPF algorithms, and expand their knowledge of model-based
DRL toward addressing the existing challenges in MAPF.
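To make the discussion of unified evaluation metrics concrete, the sketch below computes three measures commonly reported for MAPF solvers: success rate, makespan, and sum-of-costs (flowtime). This is a minimal illustration under assumed data structures (each solution is a list of per-agent paths, one grid cell per timestep), not the paper's own evaluation code.

```python
from typing import List, Optional, Tuple

Cell = Tuple[int, int]   # (row, col) on a grid map
Path = List[Cell]        # one agent's path, one cell per timestep


def makespan(paths: List[Path]) -> int:
    """Timestep at which the last agent reaches its goal."""
    return max(len(p) - 1 for p in paths)


def sum_of_costs(paths: List[Path]) -> int:
    """Total number of timesteps spent by all agents (a.k.a. flowtime)."""
    return sum(len(p) - 1 for p in paths)


def success_rate(results: List[Optional[List[Path]]]) -> float:
    """Fraction of benchmark instances solved; None marks a failure or timeout."""
    return sum(r is not None for r in results) / len(results)


# Toy example: two agents on a grid, plus one unsolved instance.
solved = [[(0, 0), (0, 1), (0, 2)],          # agent 1 reaches its goal at t=2
          [(3, 3), (2, 3), (1, 3), (0, 3)]]  # agent 2 reaches its goal at t=3
print(makespan(solved))              # 3
print(sum_of_costs(solved))          # 5
print(success_rate([solved, None]))  # 0.5
```

Benchmarks typically report these metrics per map and per agent count, together with solver runtime, which is why shared definitions matter when comparing search-based and learning-based MAPF methods.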
Related papers
- Enhancing Multi-Step Reasoning Abilities of Language Models through Direct Q-Function Optimization [50.485788083202124]
Reinforcement Learning (RL) plays a crucial role in aligning large language models with human preferences and improving their ability to perform complex tasks.
We introduce Direct Q-function Optimization (DQO), which formulates the response generation process as a Markov Decision Process (MDP) and utilizes the soft actor-critic (SAC) framework to optimize a Q-function directly parameterized by the language model.
Experimental results on two math problem-solving datasets, GSM8K and MATH, demonstrate that DQO outperforms previous methods, establishing it as a promising offline reinforcement learning approach for aligning language models.
arXiv Detail & Related papers (2024-10-11T23:29:20Z)
- A Survey on Multimodal Benchmarks: In the Era of Large AI Models [13.299775710527962]
Multimodal Large Language Models (MLLMs) have brought substantial advancements in artificial intelligence.
This survey systematically reviews 211 benchmarks that assess MLLMs across four core domains: understanding, reasoning, generation, and application.
arXiv Detail & Related papers (2024-09-21T15:22:26Z)
- From Linguistic Giants to Sensory Maestros: A Survey on Cross-Modal Reasoning with Large Language Models [56.9134620424985]
Cross-modal reasoning (CMR) is increasingly recognized as a crucial capability in the progression toward more sophisticated artificial intelligence systems.
The recent trend of deploying Large Language Models (LLMs) to tackle CMR tasks has marked a new mainstream of approaches for enhancing their effectiveness.
This survey offers a nuanced exposition of current methodologies applied in CMR using LLMs, classifying these into a detailed three-tiered taxonomy.
arXiv Detail & Related papers (2024-09-19T02:51:54Z)
- MAPF-GPT: Imitation Learning for Multi-Agent Pathfinding at Scale [46.35418789518417]
Multi-agent pathfinding is a challenging computational problem that typically requires finding collision-free paths for multiple agents in a shared environment.
We have created a foundation model for the MAPF problem called MAPF-GPT.
Using imitation learning, we have trained a policy on a set of sub-optimal expert trajectories; the resulting policy can generate actions under partial observability (a minimal behavior-cloning sketch in this spirit appears after this list).
We show that MAPF-GPT notably outperforms the current best-performing learnable-MAPF solvers on a diverse range of problem instances.
arXiv Detail & Related papers (2024-08-29T12:55:10Z)
- Large Multimodal Agents: A Survey [78.81459893884737]
Large language models (LLMs) have achieved superior performance in powering text-based AI agents.
There is an emerging research trend focused on extending these LLM-powered AI agents into the multimodal domain.
This review aims to provide valuable insights and guidelines for future research in this rapidly evolving field.
arXiv Detail & Related papers (2024-02-23T06:04:23Z)
- Let's reward step by step: Step-Level reward model as the Navigators for Reasoning [64.27898739929734]
A Process-Supervised Reward Model (PRM) furnishes LLMs with step-by-step feedback during the training phase.
We propose a greedy search algorithm that employs the step-level feedback from the PRM to optimize the reasoning pathways explored by LLMs.
To explore the versatility of our approach, we develop a novel method to automatically generate a step-level reward dataset for coding tasks and observe similar performance improvements on code generation tasks.
arXiv Detail & Related papers (2023-10-16T05:21:50Z)
- A Survey of Meta-Reinforcement Learning [69.76165430793571]
We cast the development of better RL algorithms as a machine learning problem itself in a process called meta-RL.
We discuss how, at a high level, meta-RL research can be clustered based on the presence of a task distribution and the learning budget available for each individual task.
We conclude by presenting the open problems on the path to making meta-RL part of the standard toolbox for a deep RL practitioner.
arXiv Detail & Related papers (2023-01-19T12:01:41Z)
- The Multi-Agent Pickup and Delivery Problem: MAPF, MARL and Its Warehouse Applications [2.969705152497174]
We study two state-of-the-art solutions to the multi-agent pickup and delivery problem based on different principles.
Specifically, a recent MAPF algorithm called conflict-based search (CBS) and a current MARL algorithm called shared experience actor-critic (SEAC) are studied.
arXiv Detail & Related papers (2022-03-14T13:23:35Z)
- Compilation-based Solvers for Multi-Agent Path Finding: a Survey, Discussion, and Future Opportunities [7.766921168069532]
We summarize the lessons learned from past developments and current trends in the topic and discuss its wider impact.
Two major approaches to optimal MAPF solving include (1) dedicated search-based methods, which solve MAPF directly, and (2) compilation-based methods, which reduce a MAPF instance to an instance of a different, well-established formalism.
arXiv Detail & Related papers (2021-04-23T20:13:12Z)
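As referenced in the MAPF-GPT entry above, the sketch below illustrates the general idea behind imitation learning for MAPF: a policy network is trained to reproduce expert actions from recorded (observation, action) pairs, where each observation is an agent's partial, egocentric view of the map. This is a toy sketch under assumed observation shapes and a small MLP; it is not the MAPF-GPT architecture or training code.

```python
import torch
import torch.nn as nn

# Assumed toy setup: each agent sees a flattened local field of view
# (e.g. an 11x11 egocentric crop with 3 feature channels) and picks one
# of five discrete moves (stay, up, down, left, right).
OBS_DIM = 11 * 11 * 3
NUM_ACTIONS = 5

policy = nn.Sequential(
    nn.Linear(OBS_DIM, 256),
    nn.ReLU(),
    nn.Linear(256, 256),
    nn.ReLU(),
    nn.Linear(256, NUM_ACTIONS),  # logits over the discrete actions
)

optimizer = torch.optim.Adam(policy.parameters(), lr=1e-4)
loss_fn = nn.CrossEntropyLoss()


def behavior_cloning_step(obs: torch.Tensor, expert_actions: torch.Tensor) -> float:
    """One gradient step of imitation learning on expert (observation, action) pairs."""
    logits = policy(obs)
    loss = loss_fn(logits, expert_actions)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()


# Random stand-in data in place of real expert trajectories from a MAPF solver.
obs = torch.randn(64, OBS_DIM)                  # 64 partial observations
actions = torch.randint(0, NUM_ACTIONS, (64,))  # the expert's actions for them
print(behavior_cloning_step(obs, actions))
```

At scale, the sub-optimal expert trajectories would come from a classical MAPF solver run on many instances, and the simple MLP here would be replaced by a larger sequence model conditioned on richer observations.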