Anytime Multi-Agent Path Finding with an Adaptive Delay-Based Heuristic
- URL: http://arxiv.org/abs/2408.02960v1
- Date: Tue, 6 Aug 2024 05:15:35 GMT
- Title: Anytime Multi-Agent Path Finding with an Adaptive Delay-Based Heuristic
- Authors: Thomy Phan, Benran Zhang, Shao-Hung Chan, Sven Koenig
- Abstract summary: We propose Adaptive Delay-based Destroy-and-Repair Enhanced with Success-based Self-Learning (ADDRESS) as a single-destroy-heuristic variant of MAPF-LNS.
We demonstrate cost improvements of at least 50% in large-scale scenarios with up to a thousand agents compared with the original MAPF-LNS.
- Score: 16.4408116214332
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Anytime multi-agent path finding (MAPF) is a promising approach to scalable path optimization in multi-agent systems. MAPF-LNS, based on Large Neighborhood Search (LNS), is the current state-of-the-art approach where a fast initial solution is iteratively optimized by destroying and repairing selected paths of the solution. Current MAPF-LNS variants commonly use an adaptive selection mechanism to choose among multiple destroy heuristics. However, to determine promising destroy heuristics, MAPF-LNS requires a considerable amount of exploration time. As common destroy heuristics are non-adaptive, any performance bottleneck caused by these heuristics cannot be overcome via adaptive heuristic selection alone, thus limiting the overall effectiveness of MAPF-LNS in terms of solution cost. In this paper, we propose Adaptive Delay-based Destroy-and-Repair Enhanced with Success-based Self-Learning (ADDRESS) as a single-destroy-heuristic variant of MAPF-LNS. ADDRESS applies restricted Thompson Sampling to the top-K set of the most delayed agents to select a seed agent for adaptive LNS neighborhood generation. We evaluate ADDRESS in multiple maps from the MAPF benchmark set and demonstrate cost improvements by at least 50% in large-scale scenarios with up to a thousand agents, compared with the original MAPF-LNS and other state-of-the-art methods.
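The abstract outlines the core idea: LNS repeatedly destroys and repairs selected paths, and ADDRESS chooses the seed agent for neighborhood generation via restricted Thompson Sampling over the top-K most delayed agents, updated by whether a repair succeeded. Below is a minimal Python sketch of that idea under stated assumptions, not the paper's implementation: a Beta-Bernoulli bandit per agent, delay taken as path cost minus shortest-path cost, and hypothetical helpers (`build_neighborhood`, `destroy`, `repair`, `cost`, `compute_delays`) that are not defined in the paper.

```python
import random


class AddressSeedSelector:
    """Hedged sketch of ADDRESS-style seed selection: restricted Thompson
    Sampling over the top-K most delayed agents with success-based updates.
    Priors, K, and the reward signal are assumptions, not paper details."""

    def __init__(self, num_agents: int, k: int = 8):
        self.k = k
        # Beta-Bernoulli counts per agent: alpha tracks successful repairs,
        # beta tracks unsuccessful ones (uniform 1/1 priors are an assumption).
        self.alpha = [1.0] * num_agents
        self.beta = [1.0] * num_agents

    def select_seed(self, delays: list[float]) -> int:
        # Restrict the candidate arms to the K agents with the largest delay
        # (delay = current path cost minus shortest-path cost).
        top_k = sorted(range(len(delays)), key=lambda a: delays[a], reverse=True)[: self.k]
        # Thompson Sampling: draw one Beta sample per candidate, pick the maximum.
        samples = {a: random.betavariate(self.alpha[a], self.beta[a]) for a in top_k}
        return max(samples, key=samples.get)

    def update(self, agent: int, improved: bool) -> None:
        # Success-based self-learning: reward 1 if the destroy-and-repair
        # step lowered the solution cost, 0 otherwise.
        if improved:
            self.alpha[agent] += 1.0
        else:
            self.beta[agent] += 1.0


# How this might plug into an anytime LNS loop (all helpers hypothetical):
#
#   selector = AddressSeedSelector(num_agents=len(paths))
#   while time_remaining():
#       seed = selector.select_seed(compute_delays(paths))
#       neighborhood = build_neighborhood(seed)           # paths to destroy
#       candidate = repair(destroy(paths, neighborhood))  # e.g., prioritized planning
#       improved = cost(candidate) < cost(paths)
#       if improved:
#           paths = candidate
#       selector.update(seed, improved)
```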
Related papers
- Provably Mitigating Overoptimization in RLHF: Your SFT Loss is Implicitly an Adversarial Regularizer [52.09480867526656]
We identify the source of misalignment as a form of distributional shift and uncertainty in learning human preferences.
To mitigate overoptimization, we first propose a theoretical algorithm that chooses the best policy for an adversarially chosen reward model.
Using the equivalence between reward models and the corresponding optimal policy, the algorithm features a simple objective that combines a preference optimization loss and a supervised learning loss.
arXiv Detail & Related papers (2024-05-26T05:38:50Z) - Scalable Mechanism Design for Multi-Agent Path Finding [87.40027406028425]
Multi-Agent Path Finding (MAPF) involves determining paths for multiple agents to travel simultaneously and collision-free through a shared area toward given goal locations.
Finding an optimal solution is often computationally infeasible, making the use of approximate, suboptimal algorithms essential.
We introduce the problem of scalable mechanism design for MAPF and propose three strategyproof mechanisms, two of which even use approximate MAPF algorithms.
arXiv Detail & Related papers (2024-01-30T14:26:04Z) - Adaptive Anytime Multi-Agent Path Finding Using Bandit-Based Large Neighborhood Search [30.364955687049292]
State-of-the-art anytime MAPF is based on Large Neighborhood Search (LNS).
We propose Bandit-based Adaptive LArge Neighborhood search Combined with Exploration (BALANCE).
We empirically demonstrate cost improvements of at least 50% compared to state-of-the-art anytime MAPF in large-scale scenarios.
arXiv Detail & Related papers (2023-12-28T01:24:42Z) - Decentralized Monte Carlo Tree Search for Partially Observable Multi-agent Pathfinding [49.730902939565986]
The Multi-Agent Pathfinding problem involves finding a set of conflict-free paths for a group of agents confined to a graph.
In this study, we focus on the decentralized MAPF setting, where the agents may observe the other agents only locally.
We propose a decentralized multi-agent Monte Carlo Tree Search (MCTS) method for MAPF tasks.
arXiv Detail & Related papers (2023-12-26T06:57:22Z) - Traffic Flow Optimisation for Lifelong Multi-Agent Path Finding [29.76466191644455]
Multi-Agent Path Finding (MAPF) is a fundamental problem in robotics that asks us to compute collision-free paths for a team of agents.
We propose a new approach for MAPF where agents are guided to their destination by following congestion-avoiding paths.
We evaluate the idea in two large-scale settings: one-shot MAPF, where each agent has a single destination, and lifelong MAPF, where agents are continuously assigned new destinations.
arXiv Detail & Related papers (2023-08-22T07:17:39Z) - Engineering LaCAM$^\ast$: Towards Real-Time, Large-Scale, and Near-Optimal Multi-Agent Pathfinding [12.02023514105999]
This paper addresses the challenges of real-time, large-scale, and near-optimal multi-agent pathfinding (MAPF) through enhancements to the recently proposed LaCAM* algorithm.
LaCAM* is a scalable search-based algorithm that guarantees the eventual finding of optimal solutions for cumulative transition costs.
arXiv Detail & Related papers (2023-08-08T14:36:58Z) - Multi-Agent Reinforcement Learning via Adaptive Kalman Temporal Difference and Successor Representation [32.80370188601152]
The paper proposes the Multi-Agent Adaptive Kalman Temporal Difference (MAK-TD) framework and its Successor Representation-based variant, referred to as the MAK-SR.
The proposed MAK-TD/SR frameworks consider the continuous nature of the action-space that is associated with high dimensional multi-agent environments.
arXiv Detail & Related papers (2021-12-30T18:21:53Z) - Automatic Mapping of the Best-Suited DNN Pruning Schemes for Real-Time Mobile Acceleration [71.80326738527734]
We propose a general, fine-grained structured pruning scheme and corresponding compiler optimizations.
We show that our pruning scheme mapping methods, together with the general fine-grained structured pruning scheme, outperform the state-of-the-art DNN optimization framework.
arXiv Detail & Related papers (2021-11-22T23:53:14Z) - Plan Better Amid Conservatism: Offline Multi-Agent Reinforcement Learning with Actor Rectification [74.10976684469435]
Offline reinforcement learning (RL) algorithms can, in principle, be transferred to multi-agent settings directly.
We propose a simple yet effective method, Offline Multi-Agent RL with Actor Rectification (OMAR), to tackle this critical challenge.
OMAR significantly outperforms strong baselines with state-of-the-art performance in multi-agent continuous control benchmarks.
arXiv Detail & Related papers (2021-11-22T13:27:42Z) - RoMA: Robust Model Adaptation for Offline Model-based Optimization [115.02677045518692]
We consider the problem of searching an input maximizing a black-box objective function given a static dataset of input-output queries.
A popular approach to solving this problem is maintaining a proxy model that approximates the true objective function.
Here, the main challenge is how to avoid adversarially optimized inputs during the search.
arXiv Detail & Related papers (2021-10-27T05:37:12Z) - Loosely Synchronized Search for Multi-agent Path Finding with Asynchronous Actions [10.354181009277623]
Multi-agent path finding (MAPF) determines an ensemble of collision-free paths for multiple agents between their respective start and goal locations.
This article presents a natural generalization of MAPF with asynchronous actions where agents do not necessarily start and stop concurrently.
arXiv Detail & Related papers (2021-03-08T02:34:17Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.