Related papers: An Auction-based Coordination Strategy for Task-Constrained Multi-Agent Stochastic Planning with Submodular Rewards

An Auction-based Coordination Strategy for Task-Constrained Multi-Agent Stochastic Planning with Submodular Rewards

URL: http://arxiv.org/abs/2212.14624v2
Date: Wed, 2 Aug 2023 11:08:57 GMT
Title: An Auction-based Coordination Strategy for Task-Constrained Multi-Agent Stochastic Planning with Submodular Rewards
Authors: Ruifan Liu, Hyo-Sang Shin, Binbin Yan, and Antonios Tsourdos
Abstract summary: Existing task coordination algorithms either ignore the process or suffer from the computational intensity. We propose a decentralized auction-based coordination strategy using a newly formulated score function. For the implementation on large-scale applications, an approximate variant of the proposed method, namely Deep Auction, is also suggested.
Score: 7.419725234099728
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: In many domains such as transportation and logistics, search and rescue, or cooperative surveillance, tasks are pending to be allocated with the consideration of possible execution uncertainties. Existing task coordination algorithms either ignore the stochastic process or suffer from the computational intensity. Taking advantage of the weakly coupled feature of the problem and the opportunity for coordination in advance, we propose a decentralized auction-based coordination strategy using a newly formulated score function which is generated by forming the problem into task-constrained Markov decision processes (MDPs). The proposed method guarantees convergence and at least 50% optimality in the premise of a submodular reward function. Furthermore, for the implementation on large-scale applications, an approximate variant of the proposed method, namely Deep Auction, is also suggested with the use of neural networks, which is evasive of the troublesome for constructing MDPs. Inspired by the well-known actor-critic architecture, two Transformers are used to map observations to action probabilities and cumulative rewards respectively. Finally, we demonstrate the performance of the two proposed approaches in the context of drone deliveries, where the stochastic planning for the drone league is cast into a stochastic price-collecting Vehicle Routing Problem (VRP) with time windows. Simulation results are compared with state-of-the-art methods in terms of solution quality, planning efficiency and scalability.

Related papers

Hierarchical Task Offloading for UAV-Assisted Vehicular Edge Computing via Deep Reinforcement Learning [11.695622067301128]
We propose a dual-layer UAV-assisted edge computing architecture based on partial offloading.<n>The proposed architecture enables efficient integration and coordination of heterogeneous resources.<n>We show that the proposed approach outperforms several baselines in task completion rate, system efficiency, and convergence speed.
arXiv Detail & Related papers (2025-07-08T07:10:52Z)
On efficient computation in active inference [1.1470070927586016]
We present a novel planning algorithm for finite temporal horizons with drastically lower computational complexity. We also simplify the process of setting an appropriate target distribution for new and existing active inference planning schemes.
arXiv Detail & Related papers (2023-07-02T07:38:56Z)
Semantically Aligned Task Decomposition in Multi-Agent Reinforcement Learning [56.26889258704261]
We propose a novel "disentangled" decision-making method, Semantically Aligned task decomposition in MARL (SAMA) SAMA prompts pretrained language models with chain-of-thought that can suggest potential goals, provide suitable goal decomposition and subgoal allocation as well as self-reflection-based replanning. SAMA demonstrates considerable advantages in sample efficiency compared to state-of-the-art ASG methods.
arXiv Detail & Related papers (2023-05-18T10:37:54Z)
Structural Estimation of Markov Decision Processes in High-Dimensional State Space with Finite-Time Guarantees [39.287388288477096]
We consider the task of estimating a structural model of dynamic decisions by a human agent based upon the observable history of implemented actions and visited states. This problem has an inherent nested structure: in the inner problem, an optimal policy for a given reward function is identified while in the outer problem, a measure of fit is maximized. We propose a single-loop estimation algorithm with finite time guarantees that is equipped to deal with high-dimensional state spaces.
arXiv Detail & Related papers (2022-10-04T00:11:38Z)
Fidelity-Guarantee Entanglement Routing in Quantum Networks [64.49733801962198]
Entanglement routing establishes remote entanglement connection between two arbitrary nodes. We propose purification-enabled entanglement routing designs to provide fidelity guarantee for multiple Source-Destination (SD) pairs in quantum networks.
arXiv Detail & Related papers (2021-11-15T14:07:22Z)
Distributed Allocation and Scheduling of Tasks with Cross-Schedule Dependencies for Heterogeneous Multi-Robot Teams [2.294915015129229]
We present a distributed task allocation and scheduling algorithm for missions where the tasks of different robots are tightly coupled with temporal and precedence constraints. An application of the planning procedure to a practical use case of a greenhouse maintained by a multi-robot system is given.
arXiv Detail & Related papers (2021-09-07T13:44:28Z)
Asynchronous Distributed Reinforcement Learning for LQR Control via Zeroth-Order Block Coordinate Descent [7.6860514640178]
We propose a novel zeroth-order optimization algorithm for distributed reinforcement learning. It allows each agent to estimate its local gradient by cost evaluation independently, without use of any consensus protocol.
arXiv Detail & Related papers (2021-07-26T18:11:07Z)
Momentum Accelerates the Convergence of Stochastic AUPRC Maximization [80.8226518642952]
We study optimization of areas under precision-recall curves (AUPRC), which is widely used for imbalanced tasks. We develop novel momentum methods with a better iteration of $O (1/epsilon4)$ for finding an $epsilon$stationary solution. We also design a novel family of adaptive methods with the same complexity of $O (1/epsilon4)$, which enjoy faster convergence in practice.
arXiv Detail & Related papers (2021-07-02T16:21:52Z)
An Efficient Algorithm for Deep Stochastic Contextual Bandits [10.298368632706817]
In contextual bandit problems, an agent selects an action based on certain observed context to maximize the reward over iterations. Recently there have been a few studies using a deep neural network (DNN) to predict the expected reward for an action, and is trained by a gradient based method.
arXiv Detail & Related papers (2021-04-12T16:34:43Z)
A Hamiltonian Monte Carlo Method for Probabilistic Adversarial Attack and Learning [122.49765136434353]
We present an effective method, called Hamiltonian Monte Carlo with Accumulated Momentum (HMCAM), aiming to generate a sequence of adversarial examples. We also propose a new generative method called Contrastive Adversarial Training (CAT), which approaches equilibrium distribution of adversarial examples. Both quantitative and qualitative analysis on several natural image datasets and practical systems have confirmed the superiority of the proposed algorithm.
arXiv Detail & Related papers (2020-10-15T16:07:26Z)
Combining Deep Learning and Optimization for Security-Constrained Optimal Power Flow [94.24763814458686]
Security-constrained optimal power flow (SCOPF) is fundamental in power systems. Modeling of APR within the SCOPF problem results in complex large-scale mixed-integer programs. This paper proposes a novel approach that combines deep learning and robust optimization techniques.
arXiv Detail & Related papers (2020-07-14T12:38:21Z)
Decentralized MCTS via Learned Teammate Models [89.24858306636816]
We present a trainable online decentralized planning algorithm based on decentralized Monte Carlo Tree Search. We show that deep learning and convolutional neural networks can be employed to produce accurate policy approximators.
arXiv Detail & Related papers (2020-03-19T13:10:20Z)

This list is automatically generated from the titles and abstracts of the papers in this site.