Lightweight and Effective Preference Construction in PIBT for Large-Scale Multi-Agent Pathfinding
- URL: http://arxiv.org/abs/2505.12623v1
- Date: Mon, 19 May 2025 02:12:29 GMT
- Title: Lightweight and Effective Preference Construction in PIBT for Large-Scale Multi-Agent Pathfinding
- Authors: Keisuke Okumura, Hiroki Nagai
- Abstract summary: This paper studies two simple yet effective techniques for tiebreaking in PIBT. The first technique allows an agent to intelligently dodge another, taking into account whether each action will hinder the progress of the next timestep. The second technique is to learn, through multiple PIBT runs, how an action causes regret in others and to use this information to minimise regret collectively.
- Score: 10.174848090916669
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: PIBT is a computationally lightweight algorithm that can be applied to a variety of multi-agent pathfinding (MAPF) problems, generating the next collision-free locations of agents given their current ones. Because of its simplicity and scalability, it is becoming a popular underlying scheme for recent large-scale MAPF methods involving hundreds or thousands of agents. Vanilla PIBT makes agents behave greedily towards their assigned goals, but an agent typically has multiple equally good actions, since the graph shortest path is not always unique. Consequently, how ties between these actions are broken significantly affects the resulting solutions. This paper studies two simple yet effective techniques for tiebreaking in PIBT, without compromising its computational advantage. The first technique allows an agent to intelligently dodge another, taking into account whether each action will hinder the progress of the next timestep. The second technique is to learn, through multiple PIBT runs, how an action causes regret in others and to use this information to minimise regret collectively. Our empirical results demonstrate that these techniques can reduce the solution cost of one-shot MAPF and improve the throughput of lifelong MAPF. For instance, in densely populated one-shot cases, the combined use of these tiebreaks achieves improvements of around 10-20% in sum-of-costs, without significantly compromising the speed of a PIBT-based planner.
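The abstract names two tiebreaking rules but gives no pseudocode, so the following is a minimal, hypothetical Python sketch of how such a preference order could be constructed. The names `bfs_dist`, `preferences`, `update_regret`, and the `(from, to)` regret table are illustrative assumptions, not the paper's actual method or API. The sketch shows vanilla PIBT's greedy primary key (distance to goal) with a learned-regret secondary key; the dodge-aware tiebreak is described in the comments.

```python
# Minimal, hypothetical sketch (not the authors' code) of PIBT-style
# preference construction with regret-based tiebreaking. All names are
# illustrative assumptions.
import random
from collections import deque

MOVES = ((1, 0), (-1, 0), (0, 1), (0, -1))

def bfs_dist(grid, goal):
    """Graph distance from every free cell to `goal` on a 4-connected grid
    (0 = free, 1 = obstacle). Shortest paths are generally not unique,
    which is exactly why tiebreaking matters."""
    dist = {goal: 0}
    q = deque([goal])
    while q:
        x, y = q.popleft()
        for dx, dy in MOVES:
            n = (x + dx, y + dy)
            if (0 <= n[0] < len(grid) and 0 <= n[1] < len(grid[0])
                    and grid[n[0]][n[1]] == 0 and n not in dist):
                dist[n] = dist[(x, y)] + 1
                q.append(n)
    return dist

def preferences(pos, dist, regret, rng):
    """Order candidate next cells for one agent.

    Primary key: distance to goal (vanilla PIBT greed).
    Secondary key: accumulated regret, i.e. how much choosing this move
    from `pos` has hindered other agents in earlier PIBT runs.
    Remaining ties are broken randomly, as in vanilla PIBT.
    """
    x, y = pos
    cands = [pos] + [(x + dx, y + dy) for dx, dy in MOVES]
    cands = [c for c in cands if c in dist]  # drop obstacles / unreachable cells
    cands.sort(key=lambda c: (dist[c], regret.get((pos, c), 0.0), rng.random()))
    return cands

def update_regret(regret, moves_taken, delay_caused, lr=0.1):
    """After one PIBT run, blame each (from, to) move for the delay it
    caused to lower-priority agents; later runs then prefer the
    less-regretted option among otherwise tied actions."""
    for move, delay in zip(moves_taken, delay_caused):
        regret[move] = regret.get(move, 0.0) + lr * delay
```

A plausible usage pattern, again an assumption rather than the paper's procedure: run PIBT once with a seeded `random.Random`, measure how long each lower-priority agent was delayed by each higher-priority move, call `update_regret`, and replan. The first tiebreak (intelligent dodging) can be folded into the same sort key by penalising candidate cells whose occupant, once pushed away, could no longer make progress at the next timestep.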
Related papers
- WinkTPG: An Execution Framework for Multi-Agent Path Finding Using Temporal Reasoning [7.2069924323665235]
Planning collision-free paths for a large group of agents is a challenging problem with numerous real-world applications. We propose kinodynamic Temporal Plan Graph Planning (kTPG), a multi-agent speed optimization algorithm that efficiently refines a MAPF plan into a kinodynamically feasible plan. Building on kTPG, we propose Windowed kTPG (WinkTPG), a MAPF execution framework that incrementally refines MAPF plans using a window-based mechanism.
arXiv Detail & Related papers (2025-08-02T21:33:08Z)
- Runaway is Ashamed, But Helpful: On the Early-Exit Behavior of Large Language Model-based Agents in Embodied Environments [55.044159987218436]
Large language models (LLMs) have demonstrated strong planning and decision-making capabilities in complex embodied environments. We take a first step toward exploring the early-exit behavior of LLM-based agents.
arXiv Detail & Related papers (2025-05-23T08:23:36Z)
- Loosely Synchronized Rule-Based Planning for Multi-Agent Path Finding with Asynchronous Actions [5.5233853454863615]
Multi-Agent Path Finding (MAPF) seeks collision-free paths for multiple agents from their respective starting locations to their respective goal locations. Although many MAPF algorithms can handle up to thousands of agents, they usually rely on the assumption that each action of an agent takes one time unit. This paper develops new planners that lie on the other end of the spectrum, trading off solution quality for scalability.
arXiv Detail & Related papers (2024-12-16T11:36:24Z)
- Cooperative Reward Shaping for Multi-Agent Pathfinding [4.244426154524592]
The primary objective of Multi-Agent Pathfinding (MAPF) is to plan efficient and conflict-free paths for all agents. Traditional multi-agent path planning algorithms struggle to achieve efficient distributed path planning for multiple agents. This letter introduces a unique reward shaping technique based on Independent Q-Learning (IQL).
arXiv Detail & Related papers (2024-07-15T02:44:41Z)
- Refined Sample Complexity for Markov Games with Independent Linear Function Approximation [49.5660193419984]
Markov Games (MGs) are an important model for Multi-Agent Reinforcement Learning (MARL). This paper first refines the AVLPR framework by Wang et al. (2023), with the insight of designing a pessimistic estimate of the sub-optimality gap. We give the first algorithm that tackles the curse of multi-agents, attains the optimal $O(T^{-1/2})$ convergence rate, and avoids $\text{poly}(A_{\max})$ dependency simultaneously.
arXiv Detail & Related papers (2024-02-11T01:51:15Z)
- Scalable Mechanism Design for Multi-Agent Path Finding [87.40027406028425]
Multi-Agent Path Finding (MAPF) involves determining paths for multiple agents to travel simultaneously and collision-free through a shared area toward given goal locations.
Finding an optimal solution is often computationally infeasible, making the use of approximate, suboptimal algorithms essential.
We introduce the problem of scalable mechanism design for MAPF and propose three strategyproof mechanisms, two of which even use approximate MAPF algorithms.
arXiv Detail & Related papers (2024-01-30T14:26:04Z)
- Cooperative Thresholded Lasso for Sparse Linear Bandit [6.52540785559241]
We present a novel approach to address the multi-agent sparse contextual linear bandit problem.
It is the first algorithm to tackle row-wise distributed data in sparse linear bandits.
It is widely applicable to high-dimensional multi-agent problems where efficient feature extraction is critical for minimizing regret.
arXiv Detail & Related papers (2023-05-30T16:05:44Z)
- Robust Multi-Agent Pickup and Delivery with Delays [5.287544737925232]
Multi-Agent Pickup and Delivery (MAPD) is the problem of computing collision-free paths for a group of agents.
Current algorithms for MAPD do not consider many of the practical issues encountered in real applications.
We present two solution approaches that provide robustness guarantees by planning paths that limit the effects of imperfect execution.
arXiv Detail & Related papers (2023-03-30T14:42:41Z)
- Plan Better Amid Conservatism: Offline Multi-Agent Reinforcement Learning with Actor Rectification [74.10976684469435]
It is natural to ask whether offline reinforcement learning (RL) algorithms can be transferred to multi-agent settings directly.
We propose a simple yet effective method, Offline Multi-Agent RL with Actor Rectification (OMAR), to tackle this critical challenge.
OMAR significantly outperforms strong baselines with state-of-the-art performance in multi-agent continuous control benchmarks.
arXiv Detail & Related papers (2021-11-22T13:27:42Z)
- The Surprising Effectiveness of MAPPO in Cooperative, Multi-Agent Games [67.47961797770249]
Multi-Agent PPO (MAPPO) is a multi-agent PPO variant which adopts a centralized value function.
We show that MAPPO achieves performance comparable to the state-of-the-art in three popular multi-agent testbeds.
arXiv Detail & Related papers (2021-03-02T18:59:56Z)
- Scalable Multi-Agent Inverse Reinforcement Learning via Actor-Attention-Critic [54.2180984002807]
Multi-agent adversarial inverse reinforcement learning (MA-AIRL) is a recent approach that applies single-agent AIRL to multi-agent problems.
We propose a multi-agent inverse RL algorithm that is more sample-efficient and scalable than previous works.
arXiv Detail & Related papers (2020-02-24T20:30:45Z)