Cooperative Control of Mobile Robots with Stackelberg Learning
- URL: http://arxiv.org/abs/2008.00679v1
- Date: Mon, 3 Aug 2020 07:21:51 GMT
- Title: Cooperative Control of Mobile Robots with Stackelberg Learning
- Authors: Joewie J. Koh, Guohui Ding, Christoffer Heckman, Lijun Chen,
Alessandro Roncone
- Abstract summary: Multi-robot cooperation requires agents to make decisions consistent with the shared goal.
We propose a method named SLiCC: Stackelberg Learning in Cooperative Control.
- Score: 63.99843063704676
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Multi-robot cooperation requires agents to make decisions that are consistent
with the shared goal without disregarding action-specific preferences that
might arise from asymmetry in capabilities and individual objectives. To
accomplish this goal, we propose a method named SLiCC: Stackelberg Learning in
Cooperative Control. SLiCC models the problem as a partially observable
stochastic game composed of Stackelberg bimatrix games, and uses deep
reinforcement learning to obtain the payoff matrices associated with these
games. Appropriate cooperative actions are then selected with the derived
Stackelberg equilibria. Using a bi-robot cooperative object transportation
problem, we validate the performance of SLiCC against centralized multi-agent
Q-learning and demonstrate that SLiCC achieves better combined utility.
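To make the equilibrium-selection step concrete, here is a minimal sketch of extracting a pure-strategy Stackelberg equilibrium from a bimatrix game, assuming the leader commits first and the follower best-responds. The payoff matrices are toy stand-ins for the ones SLiCC estimates with deep RL, and the tie-breaking is arbitrary rather than the paper's exact procedure.

```python
import numpy as np

def stackelberg_pure(A, B):
    """Pure-strategy Stackelberg equilibrium of a bimatrix game.

    A[i, j]: leader payoff, B[i, j]: follower payoff, for leader
    action i and follower action j. The leader commits first; the
    follower best-responds to the observed commitment.
    """
    # Follower's best response to each possible leader action.
    best_response = np.argmax(B, axis=1)
    # Leader anticipates the response and maximizes its own payoff.
    leader_payoffs = A[np.arange(A.shape[0]), best_response]
    i = int(np.argmax(leader_payoffs))
    j = int(best_response[i])
    return i, j, A[i, j] + B[i, j]  # joint action and combined utility

# Toy 2x2 payoffs standing in for matrices estimated by deep RL.
A = np.array([[3.0, 1.0], [4.0, 0.0]])
B = np.array([[2.0, 2.5], [1.0, 3.0]])
print(stackelberg_pure(A, B))
```

In the cooperative setting, the combined utility of the selected joint action is the quantity the abstract compares against centralized multi-agent Q-learning.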
Related papers
- Learning Flexible Heterogeneous Coordination with Capability-Aware Shared Hypernetworks [2.681242476043447]
We present Capability-Aware Shared Hypernetworks (CASH), a novel architecture for heterogeneous multi-agent coordination.
CASH generates sufficient diversity while maintaining sample efficiency via soft parameter-sharing hypernetworks.
We present experiments across two heterogeneous coordination tasks and three standard learning paradigms.
arXiv Detail & Related papers (2025-01-10T15:39:39Z)
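As a rough illustration of the hypernetwork idea behind CASH, the sketch below maps an agent's capability vector to agent-specific policy weights through one shared network. The dimensions, capability encodings, and single-layer architecture are invented for illustration, not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
OBS, ACT, CAP, H = 8, 4, 3, 16  # observation, action, capability, hidden dims

# Shared hypernetwork parameters (one set for ALL agents: soft sharing).
W_hyper = rng.normal(0, 0.1, size=(CAP, H))
W_out = rng.normal(0, 0.1, size=(H, OBS * ACT))

def policy_logits(obs, capability):
    """Generate per-agent policy weights from its capability vector,
    then apply them to the observation. Agents with different
    capabilities get different policies from one shared network."""
    h = np.tanh(capability @ W_hyper)
    W_policy = (h @ W_out).reshape(OBS, ACT)  # agent-specific weights
    return obs @ W_policy

obs = rng.normal(size=OBS)
fast_agent = np.array([1.0, 0.2, 0.0])  # hypothetical capability encoding
slow_agent = np.array([0.1, 0.9, 0.5])
print(np.argmax(policy_logits(obs, fast_agent)),
      np.argmax(policy_logits(obs, slow_agent)))
```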
- MALT: Improving Reasoning with Multi-Agent LLM Training [64.13803241218886]
We present a first step toward "Multi-agent LLM training" (MALT) on reasoning problems.
Our approach employs a sequential multi-agent setup with heterogeneous LLMs assigned specialized roles.
We evaluate our approach across MATH, GSM8k, and CQA, where MALT on Llama 3.1 8B models achieves relative improvements of 14.14%, 7.12%, and 9.40%, respectively.
arXiv Detail & Related papers (2024-12-02T19:30:36Z)
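The sequential role assignment in MALT might look roughly like the pipeline sketch below, where placeholder lambdas stand in for role-specialized LLMs. The role names (generator, verifier, refiner) are assumptions for illustration.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class RoleAgent:
    name: str
    act: Callable[[str], str]  # stands in for a role-specialized LLM

def solve(question: str, agents: list[RoleAgent]) -> str:
    """Pass the problem through role-specialized agents in sequence,
    each conditioning on the accumulated transcript."""
    transcript = question
    for agent in agents:
        transcript += f"\n[{agent.name}] {agent.act(transcript)}"
    return transcript

pipeline = [
    RoleAgent("generator", lambda t: "draft: 2 + 2 = 4"),
    RoleAgent("verifier",  lambda t: "check: arithmetic holds"),
    RoleAgent("refiner",   lambda t: "final answer: 4"),
]
print(solve("What is 2 + 2?", pipeline))
```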
- Decentralized and Lifelong-Adaptive Multi-Agent Collaborative Learning [57.652899266553035]
Decentralized and lifelong-adaptive multi-agent collaborative learning aims to enhance collaboration among multiple agents without a central server.
We propose DeLAMA, a decentralized multi-agent lifelong collaborative learning algorithm with dynamic collaboration graphs.
arXiv Detail & Related papers (2024-03-11T09:21:11Z)
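A minimal sketch of what a dynamic collaboration graph could mean in practice: each agent re-weights its neighbors and aggregates their parameters with no central server. The similarity-based edge weights below are an illustrative stand-in; DeLAMA's actual graph-learning rule is not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(1)
N, D = 4, 5                      # number of agents, parameter dimension
theta = rng.normal(size=(N, D))  # each agent's local model parameters

def collaboration_step(theta, tau=1.0):
    """One decentralized update: each agent re-weights its neighbors by
    parameter similarity (a stand-in for a learned collaboration graph)
    and aggregates their models."""
    dists = np.linalg.norm(theta[:, None, :] - theta[None, :, :], axis=-1)
    W = np.exp(-dists / tau)           # dynamic, similarity-based edges
    W /= W.sum(axis=1, keepdims=True)  # row-normalize
    return W @ theta                   # graph-weighted aggregation

theta = collaboration_step(theta)
print(theta.round(2))
```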
- Learning in Cooperative Multiagent Systems Using Cognitive and Machine Models [1.0742675209112622]
Multi-Agent Systems (MAS) are critical for many applications requiring collaboration and coordination with humans.
One major challenge is the simultaneous learning and interaction of independent agents in dynamic environments.
We propose three variants of Multi-Agent Instance-Based Learning (MAIBL) models.
We demonstrate that the MAIBL models exhibit faster learning and achieve better coordination in a dynamic CMOTP task with various settings of rewards compared to current MADRL models.
arXiv Detail & Related papers (2023-08-18T00:39:06Z)
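For readers unfamiliar with Instance-Based Learning, the sketch below shows the core blending mechanism that cognitive IBL models build on: past outcomes are weighted by recency-driven memory activation. The parameters and the optimistic default are illustrative; the MAIBL variants add machinery beyond this.

```python
import math
from collections import defaultdict

class IBLAgent:
    """Minimal Instance-Based Learning sketch: actions are valued by
    blending past outcomes, weighted by memory activation (recency)."""

    def __init__(self, decay=0.5, tau=0.25):
        self.decay, self.tau = decay, tau
        self.memory = defaultdict(list)  # action -> [(time, outcome)]
        self.t = 0

    def blended_value(self, action):
        instances = self.memory[action]
        if not instances:
            return 5.0  # optimistic default to encourage exploration
        acts = [math.log((self.t - t_i + 1) ** -self.decay)
                for t_i, _ in instances]
        weights = [math.exp(a / self.tau) for a in acts]
        z = sum(weights)
        return sum(w * o for w, (_, o) in zip(weights, instances)) / z

    def observe(self, action, outcome):
        self.t += 1
        self.memory[action].append((self.t, outcome))

agent = IBLAgent()
agent.observe("left", 1.0); agent.observe("left", 0.0); agent.observe("right", 0.7)
print(agent.blended_value("left"), agent.blended_value("right"))
```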
- Tackling Cooperative Incompatibility for Zero-Shot Human-AI Coordination [36.33334853998621]
We introduce the Cooperative Open-ended LEarning (COLE) framework to solve cooperative incompatibility in learning.
COLE formulates open-ended objectives in two-player cooperative games from a graph-theoretic perspective, evaluating and pinpointing the cooperative capacity of each strategy.
Theoretical and empirical analysis shows that COLE effectively overcomes cooperative incompatibility.
arXiv Detail & Related papers (2023-06-05T16:51:38Z)
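One way to picture a graph-theoretic evaluation of cooperative capacity: cross-play returns between strategies define a weighted graph, and an eigenvector-style iteration scores each node. This is a hedged sketch of the general idea, not COLE's actual formulation; the cross-play matrix is hypothetical.

```python
import numpy as np

# Hypothetical cross-play matrix: M[i, j] = mean return when strategy i
# is paired with strategy j in the cooperative game.
M = np.array([[8.0, 2.0, 3.0],
              [2.0, 9.0, 1.0],
              [6.0, 5.0, 7.0]])

def cooperative_capacity(M, iters=50):
    """Score strategies on a weighted graph built from cross-play
    returns. The power iteration rewards strategies that cooperate
    well with *other* good cooperators, not just with themselves."""
    G = M.copy()
    np.fill_diagonal(G, 0.0)  # self-play skill is not cooperation
    score = np.ones(len(G)) / len(G)
    for _ in range(iters):
        score = G @ score
        score /= score.sum()
    return score

print(cooperative_capacity(M).round(3))
```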
- Inducing Stackelberg Equilibrium through Spatio-Temporal Sequential Decision-Making in Multi-Agent Reinforcement Learning [17.101534531286298]
We construct a Nash-level policy model based on a conditional hypernetwork shared by all agents.
This approach allows for asymmetric training with symmetric execution, with each agent responding optimally conditioned on the decisions made by superior agents.
Experiments demonstrate that our method effectively converges to Stackelberg equilibrium (SE) policies in repeated matrix game scenarios.
arXiv Detail & Related papers (2023-04-20T14:47:54Z)
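A minimal sketch of a follower policy generated by a hypernetwork conditioned on the leader's action, the kind of leader-follower sequencing described above; the linear architecture and dimensions are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(2)
OBS, ACT, H = 6, 3, 8

W_leader = rng.normal(0, 0.1, size=(OBS, ACT))
# Conditional hypernetwork: maps the leader's action (one-hot) to the
# follower's policy weights, so the follower responds to the leader's
# decision by construction.
W_hyper = rng.normal(0, 0.1, size=(ACT, OBS * ACT))

def stackelberg_step(obs):
    """Sequential (leader -> follower) decision at one timestep."""
    a_leader = int(np.argmax(obs @ W_leader))
    onehot = np.eye(ACT)[a_leader]
    W_follower = (onehot @ W_hyper).reshape(OBS, ACT)  # generated weights
    a_follower = int(np.argmax(obs @ W_follower))
    return a_leader, a_follower

print(stackelberg_step(rng.normal(size=OBS)))
```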
- Leveraging Sequentiality in Reinforcement Learning from a Single Demonstration [68.94506047556412]
We propose to leverage a sequential bias to learn control policies for complex robotic tasks using a single demonstration.
We show that DCIL-II can solve challenging simulated tasks, such as humanoid locomotion and stand-up, with unprecedented sample efficiency.
arXiv Detail & Related papers (2022-11-09T10:28:40Z)
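The sequential bias can be pictured as follows: a single demonstration is split into an ordered chain of intermediate goals, and a goal-conditioned controller pursues them one after another, each success seeding the next pursuit. The toy trajectory and proportional controller below are illustrative stand-ins for DCIL-II's learned components.

```python
import numpy as np

def goals_from_demo(demo, n_goals=4):
    """Split one demonstration into an ordered sequence of intermediate
    goals (evenly spaced states along the trajectory)."""
    idx = np.linspace(0, len(demo) - 1, n_goals + 1)[1:].astype(int)
    return demo[idx]

def chase(state, goal, steps=20, lr=0.3):
    """Stand-in for a learned goal-conditioned policy: move toward goal."""
    for _ in range(steps):
        state = state + lr * (goal - state)
    return state

demo = np.cumsum(np.ones((50, 2)) * 0.2, axis=0)  # a toy 2-D trajectory
state = np.zeros(2)
for goal in goals_from_demo(demo):
    state = chase(state, goal)  # reaching goal k seeds the pursuit of k+1
print(state.round(2), demo[-1].round(2))
```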
- It Takes Four to Tango: Multiagent Selfplay for Automatic Curriculum Generation [107.10235120286352]
Training general-purpose reinforcement learning agents efficiently requires automatic generation of a goal curriculum.
We propose Curriculum Self Play (CuSP), an automated goal generation framework.
We demonstrate that our method succeeds at generating an effective curriculum of goals for a range of control tasks.
arXiv Detail & Related papers (2022-02-22T01:23:23Z)
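Automated goal curricula often hinge on proposing goals of intermediate difficulty. The sketch below shows one such heuristic, a goal setter that targets roughly 50% solver success; it is an assumption-laden illustration, not CuSP's actual four-agent selfplay scheme.

```python
import random

random.seed(0)

def solver_success(goal_difficulty, skill):
    """Toy solver: succeeds more often on goals below its skill level."""
    return random.random() < max(0.0, min(1.0, 1.2 * skill - goal_difficulty))

def propose_goal(skill, candidates=20):
    """Goal-setter heuristic: among sampled goals, pick the one whose
    estimated success rate is closest to 50% -- neither trivial nor
    impossible, which is where learning progress tends to concentrate."""
    goals = [random.random() for _ in range(candidates)]
    def estimate(g):
        return sum(solver_success(g, skill) for _ in range(30)) / 30
    return min(goals, key=lambda g: abs(estimate(g) - 0.5))

skill = 0.3
for step in range(5):
    g = propose_goal(skill)
    skill += 0.05  # solver improves by practicing on frontier goals
    print(f"step {step}: proposed goal difficulty {g:.2f}")
```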
- Any-Play: An Intrinsic Augmentation for Zero-Shot Coordination [0.4153433779716327]
We formalize an alternative criterion for evaluating cooperative AI, referred to as inter-algorithm cross-play.
We show that existing state-of-the-art cooperative AI algorithms, such as Other-Play and Off-Belief Learning, under-perform in this paradigm.
We propose the Any-Play learning augmentation for generalizing self-play-based algorithms to the inter-algorithm cross-play setting.
arXiv Detail & Related papers (2022-01-28T21:43:58Z)
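An intrinsic diversity augmentation in the spirit of Any-Play can be sketched with a DIAYN-style bonus: behavior is rewarded for making a latent mode identifiable, pushing self-play toward a spread of conventions. The count-based discriminator below is a toy stand-in, not the paper's implementation.

```python
import numpy as np

rng = np.random.default_rng(3)
N_MODES, N_ACT = 4, 6

# Discriminator counts (mode, action) co-occurrences; its confidence in
# recovering the latent mode from behavior is the intrinsic reward.
counts = np.ones((N_MODES, N_ACT))

def intrinsic_reward(mode, action):
    """DIAYN-style bonus: log q(mode | action) - log p(mode). Positive
    when the behavior makes the latent mode easy to identify."""
    q = counts[:, action] / counts[:, action].sum()
    return np.log(q[mode]) - np.log(1.0 / N_MODES)

for _ in range(1000):
    mode = rng.integers(N_MODES)
    action = (mode + rng.integers(2)) % N_ACT  # mode-dependent behavior
    counts[mode, action] += 1

print(round(intrinsic_reward(0, 0), 2), round(intrinsic_reward(0, 3), 2))
```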
- UneVEn: Universal Value Exploration for Multi-Agent Reinforcement Learning [53.73686229912562]
We propose a novel MARL approach called Universal Value Exploration (UneVEn).
UneVEn learns a set of related tasks simultaneously with a linear decomposition of universal successor features.
Empirical results on a set of exploration games, challenging cooperative predator-prey tasks requiring significant coordination among agents, and StarCraft II micromanagement benchmarks show that UneVEn can solve tasks where other state-of-the-art MARL methods fail.
arXiv Detail & Related papers (2020-10-06T19:08:47Z)
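The linear decomposition used by universal successor features is easy to sketch: Q(s, a; w) = psi(s, a) . w, so one learned psi evaluates any task whose reward decomposes as r = phi(s, a) . w. The random features and weight vectors below are placeholders for learned quantities.

```python
import numpy as np

rng = np.random.default_rng(4)
N_STATES, N_ACT, D = 5, 3, 4

# psi[s, a] approximates expected discounted sums of D reward features;
# in UneVEn these would be learned, here they are random placeholders.
psi = rng.normal(size=(N_STATES, N_ACT, D))

def q_values(state, w):
    """Linear decomposition Q(s, a; w) = psi(s, a) . w: one set of
    successor features evaluates any task whose reward is r = phi . w."""
    return psi[state] @ w

w_main = np.array([1.0, 0.0, 0.5, 0.0])  # target task weights
w_related = rng.normal(size=D)           # a related exploration task

state = 2
print("greedy action, target task:", int(np.argmax(q_values(state, w_main))))
print("greedy action, related task:", int(np.argmax(q_values(state, w_related))))
```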
This list was automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.