Bigraph Matching Weighted with Learnt Incentive Function for Multi-Robot
Task Allocation
- URL: http://arxiv.org/abs/2403.07131v1
- Date: Mon, 11 Mar 2024 19:55:08 GMT
- Title: Bigraph Matching Weighted with Learnt Incentive Function for Multi-Robot
Task Allocation
- Authors: Steve Paul, Nathan Maurer, Souma Chowdhury
- Abstract summary: This paper develops a Graph Reinforcement Learning framework to learn the heuristics or incentives for a bipartite graph matching approach to Multi-Robot Task Allocation.
The performance of this new bigraph matching approach augmented with a GRL-derived incentive is found to be on par with the original bigraph matching approach.
- Score: 5.248564173595024
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Most real-world Multi-Robot Task Allocation (MRTA) problems require fast and
efficient decision-making, which is often achieved using heuristics-aided
methods such as genetic algorithms, auction-based methods, and bipartite graph
matching methods. These methods often take a form that lends itself to better
explainability than an end-to-end (learnt) neural-network-based policy for
MRTA. However, deriving suitable heuristics can be tedious, risky, and in
some cases impractical if problems are too complex. This raises the question:
can these heuristics be learned? To this end, this paper particularly develops
a Graph Reinforcement Learning (GRL) framework to learn the heuristics or
incentives for a bipartite graph matching approach to MRTA. Specifically, a
Capsule Attention policy model is used to learn how to weight task/robot
pairings (edges) in the bipartite graph that connects the set of tasks to the
set of robots. The original capsule attention network architecture is
fundamentally modified by adding an encoding of the robots' state graph and two
Multihead Attention based decoders whose outputs are used to construct a
LogNormal distribution matrix from which positive bigraph weights can be drawn.
The performance of this new bigraph matching approach augmented with a
GRL-derived incentive is found to be on par with the original bigraph matching
approach that used expert-specified heuristics, with the former offering
notable robustness benefits. During training, the learned incentive policy is
found to initially approach the expert-specified incentive and then deviate
slightly from its trend.
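The mechanism the abstract describes can be illustrated with a minimal sketch: positive edge weights for the task/robot bigraph are drawn from a LogNormal distribution, and a maximum-weight bipartite matching is then computed over them. The per-edge mean and spread here are random stand-ins for the GRL policy's decoder outputs, not the paper's learned parameters, and `scipy.optimize.linear_sum_assignment` stands in for the paper's matching procedure.

```python
# Hedged sketch of bigraph matching with LogNormal-drawn incentive weights.
# mu/sigma below are illustrative placeholders for the policy's outputs.
import numpy as np
from scipy.optimize import linear_sum_assignment

rng = np.random.default_rng(0)
n_robots, n_tasks = 3, 5

# Stand-in for the learned policy: per-edge (mu, sigma) of a LogNormal.
mu = rng.normal(size=(n_robots, n_tasks))
sigma = np.full((n_robots, n_tasks), 0.1)

# Positive bigraph weights drawn from LogNormal(mu, sigma).
weights = rng.lognormal(mean=mu, sigma=sigma)

# Maximum-weight matching: linear_sum_assignment minimizes, so negate.
rows, cols = linear_sum_assignment(-weights)
assignment = dict(zip(rows.tolist(), cols.tolist()))
print(assignment)  # robot index -> assigned task index
```

Because the weights are strictly positive and the matching is recomputed each decision step, noise in the drawn weights perturbs but does not invalidate the assignment, which is one way to read the robustness benefit the abstract reports.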