Learning NP-Hard Multi-Agent Assignment Planning using GNN: Inference on
a Random Graph and Provable Auction-Fitted Q-learning
- URL: http://arxiv.org/abs/1905.12204v4
- Date: Mon, 14 Aug 2023 02:18:08 GMT
- Title: Learning NP-Hard Multi-Agent Assignment Planning using GNN: Inference on
a Random Graph and Provable Auction-Fitted Q-learning
- Authors: Hyunwook Kang, Taehwan Kwon, Jinkyoo Park, James R. Morrison
- Abstract summary: This paper explores the possibility of near-optimally solving multi-agent, multi-task NP-hard planning problems with time-dependent rewards using a learning-based algorithm.
- Score: 24.956507498394497
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This paper explores the possibility of near-optimally solving multi-agent,
multi-task NP-hard planning problems with time-dependent rewards using a
learning-based algorithm. In particular, we consider a class of robot/machine
scheduling problems called the multi-robot reward collection problem (MRRC).
Such MRRC problems well model ride-sharing, pickup-and-delivery, and a variety
of related problems. In representing the MRRC problem as a sequential
decision-making problem, we observe that each state can be represented as an
extension of probabilistic graphical models (PGMs), which we refer to as random
PGMs. We then develop a mean-field inference method for random PGMs. We then
propose (1) an order-transferable Q-function estimator and (2) an
order-transferability-enabled auction to select a joint assignment in
polynomial time. These result in a reinforcement learning framework with at
least $1-1/e$ optimality. Experimental results on solving MRRC problems
highlight the near-optimality and transferability of the proposed methods. We
also consider identical parallel machine scheduling problems (IPMS) and minimax
multiple traveling salesman problems (minimax-mTSP).
Related papers
- Scalable Mechanism Design for Multi-Agent Path Finding [87.40027406028425]
Multi-Agent Path Finding (MAPF) involves determining paths for multiple agents to travel simultaneously and collision-free through a shared area toward given goal locations.
Finding an optimal solution is often computationally infeasible, making the use of approximate, suboptimal algorithms essential.
We introduce the problem of scalable mechanism design for MAPF and propose three strategyproof mechanisms, two of which even use approximate MAPF algorithms.
arXiv Detail & Related papers (2024-01-30T14:26:04Z) - RIGA: A Regret-Based Interactive Genetic Algorithm [14.388696798649658]
We propose an interactive genetic algorithm for solving multi-objective optimization problems under preference imprecision.
Our algorithm, called RIGA, can be applied to any multi-objective optimization problem provided that the aggregation function is linear in its parameters.
For several performance indicators (computation times, gap to optimality and number of queries), RIGA obtains better results than state-of-the-art algorithms.
arXiv Detail & Related papers (2023-11-10T13:56:15Z) - Minimizing Entropy to Discover Good Solutions to Recurrent Mixed Integer
Programs [0.0]
Current solvers for mixed-integer programming (MIP) problems are designed to perform well on a wide range of problems.
Recent works have shown that machine learning (ML) can be integrated with an MIP solver to inject domain knowledge and efficiently close the optimality gap.
This paper proposes an online solver that uses the notion of entropy to efficiently build a model with minimal training data and tuning.
arXiv Detail & Related papers (2022-02-07T18:52:56Z) - An actor-critic algorithm with policy gradients to solve the job shop
scheduling problem using deep double recurrent agents [1.3812010983144802]
We propose a deep reinforcement learning methodology for the job shop scheduling problem (JSSP)
The aim is to build up a greedy-like able to learn on some distribution of JSSP instances, different in the number of jobs and machines.
As expected, the model can generalize, to some extent, to larger problems or instances originated by a different distribution from the one used in training.
arXiv Detail & Related papers (2021-10-18T07:55:39Z) - Markov Decision Process modeled with Bandits for Sequential Decision
Making in Linear-flow [73.1896399783641]
In membership/subscriber acquisition and retention, we sometimes need to recommend marketing content for multiple pages in sequence.
We propose to formulate the problem as an MDP with Bandits where Bandits are employed to model the transition probability matrix.
We observe the proposed MDP with Bandits algorithm outperforms Q-learning with $epsilon$-greedy and decreasing $epsilon$, independent Bandits, and interaction Bandits.
arXiv Detail & Related papers (2021-07-01T03:54:36Z) - Learning to Schedule Heuristics in Branch-and-Bound [25.79025327341732]
Real-world applications typically require finding good solutions early in the search to enable fast decision-making.
We propose the first data-driven framework for schedulings in an exact MIP solver.
Compared to the default settings of a state-of-the-art academic MIP solver, we are able to reduce the average primal integral by up to 49% on a class of challenging instances.
arXiv Detail & Related papers (2021-03-18T14:49:52Z) - Efficient semidefinite-programming-based inference for binary and
multi-class MRFs [83.09715052229782]
We propose an efficient method for computing the partition function or MAP estimate in a pairwise MRF.
We extend semidefinite relaxations from the typical binary MRF to the full multi-class setting, and develop a compact semidefinite relaxation that can again be solved efficiently using the solver.
arXiv Detail & Related papers (2020-12-04T15:36:29Z) - Explanation Generation for Multi-Modal Multi-Agent Path Finding with
Optimal Resource Utilization using Answer Set Programming [1.7132914341329848]
The real-world applications of mMAPF require flexibility and explainability.
This paper introduces a method for generating explanations for queries regarding the feasibility and optimality of solutions.
arXiv Detail & Related papers (2020-08-08T18:34:34Z) - Combining Deep Learning and Optimization for Security-Constrained
Optimal Power Flow [94.24763814458686]
Security-constrained optimal power flow (SCOPF) is fundamental in power systems.
Modeling of APR within the SCOPF problem results in complex large-scale mixed-integer programs.
This paper proposes a novel approach that combines deep learning and robust optimization techniques.
arXiv Detail & Related papers (2020-07-14T12:38:21Z) - An Online Method for A Class of Distributionally Robust Optimization
with Non-Convex Objectives [54.29001037565384]
We propose a practical online method for solving a class of online distributionally robust optimization (DRO) problems.
Our studies demonstrate important applications in machine learning for improving the robustness of networks.
arXiv Detail & Related papers (2020-06-17T20:19:25Z) - Dynamic Multi-Robot Task Allocation under Uncertainty and Temporal
Constraints [52.58352707495122]
We present a multi-robot allocation algorithm that decouples the key computational challenges of sequential decision-making under uncertainty and multi-agent coordination.
We validate our results over a wide range of simulations on two distinct domains: multi-arm conveyor belt pick-and-place and multi-drone delivery dispatch in a city.
arXiv Detail & Related papers (2020-05-27T01:10:41Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.