Efficient Planning in Combinatorial Action Spaces with Applications to
Cooperative Multi-Agent Reinforcement Learning
- URL: http://arxiv.org/abs/2302.04376v1
- Date: Wed, 8 Feb 2023 23:42:49 GMT
- Title: Efficient Planning in Combinatorial Action Spaces with Applications to
Cooperative Multi-Agent Reinforcement Learning
- Authors: Volodymyr Tkachuk, Seyed Alireza Bakhtiari, Johannes Kirschner, Matej
Jusup, Ilija Bogunovic, Csaba Szepesvári
- Abstract summary: In cooperative multi-agent reinforcement learning, a potentially large number of agents jointly optimize a global reward function, which leads to a combinatorial blow-up in the action space by the number of agents.
As a minimal requirement, we assume access to an argmax oracle that allows us to efficiently compute the greedy policy for any Q-function in the model class.
We propose efficient algorithms for this setting that achieve polynomial compute and query complexity in all relevant problem parameters.
- Score: 16.844525262228103
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: A practical challenge in reinforcement learning is posed by
combinatorial action spaces that make planning computationally demanding. For example, in
cooperative multi-agent reinforcement learning, a potentially large number of
agents jointly optimize a global reward function, which leads to a
combinatorial blow-up in the action space by the number of agents. As a minimal
requirement, we assume access to an argmax oracle that allows us to efficiently
compute the greedy policy for any Q-function in the model class. Building on
recent work in planning with local access to a simulator and linear function
approximation, we propose efficient algorithms for this setting that lead to
polynomial compute and query complexity in all relevant problem parameters. For
the special case where the feature decomposition is additive, we further
improve the bounds and extend the results to the kernelized setting with an
efficient algorithm.
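To illustrate the role of the additive feature decomposition mentioned above, here is a minimal sketch (not the authors' algorithm; all function and variable names are hypothetical) assuming a linear Q-function whose features decompose additively across agents, Q(s, a) = sum_i phi_i(s, a_i)^T theta. Under that assumption, the joint argmax over the exponentially large joint action space factorizes into one small argmax per agent:

```python
import numpy as np

def greedy_joint_action(theta, per_agent_features):
    """Hypothetical argmax oracle under an additive feature decomposition.

    per_agent_features[i] is an (n_actions_i, d) matrix whose rows are the
    features phi_i(s, a_i) of agent i's actions at the current state s.
    Because Q(s, a) = sum_i phi_i(s, a_i)^T theta, maximizing Q over the
    joint action reduces to maximizing each agent's term independently:
    cost is linear, not exponential, in the number of agents.
    """
    return [int(np.argmax(phi @ theta)) for phi in per_agent_features]

# Toy example: 3 agents, 4 actions each, feature dimension 5.
rng = np.random.default_rng(0)
theta = rng.normal(size=5)
features = [rng.normal(size=(4, 5)) for _ in range(3)]
joint_action = greedy_joint_action(theta, features)
```

Without the additive structure, the same greedy step would require searching over all 4^3 joint actions; the factorized argmax returns the identical maximizer at a fraction of the cost.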
Related papers
- PARCO: Learning Parallel Autoregressive Policies for Efficient Multi-Agent Combinatorial Optimization [17.392822956504848]
This paper introduces PARCO, a novel approach that learns fast surrogate solvers for multi-agent problems with reinforcement learning.
We propose a model with a Multiple Pointer Mechanism to efficiently decode multiple decisions simultaneously by different agents, enhanced by a Priority-based Conflict Handling scheme.
arXiv Detail & Related papers (2024-09-05T17:49:18Z) - Efficient Model-Free Exploration in Low-Rank MDPs [76.87340323826945]
Low-Rank Markov Decision Processes offer a simple, yet expressive framework for RL with function approximation.
Existing algorithms are either (1) computationally intractable, or (2) reliant upon restrictive statistical assumptions.
We propose the first provably sample-efficient algorithm for exploration in Low-Rank MDPs.
arXiv Detail & Related papers (2023-07-08T15:41:48Z) - Provably Efficient Representation Learning with Tractable Planning in
Low-Rank POMDP [81.00800920928621]
We study representation learning in partially observable Markov Decision Processes (POMDPs)
We first present an algorithm for decodable POMDPs that combines maximum likelihood estimation (MLE) and optimism in the face of uncertainty (OFU)
We then show how to adapt this algorithm to also work in the broader class of $\gamma$-observable POMDPs.
arXiv Detail & Related papers (2023-06-21T16:04:03Z) - Stabilizing Q-learning with Linear Architectures for Provably Efficient
Learning [53.17258888552998]
This work proposes an exploration variant of the basic $Q$-learning protocol with linear function approximation.
We show that the performance of the algorithm degrades very gracefully under a novel and more permissive notion of approximation error.
arXiv Detail & Related papers (2022-06-01T23:26:51Z) - Efficient Model-Based Multi-Agent Mean-Field Reinforcement Learning [89.31889875864599]
We propose an efficient model-based reinforcement learning algorithm for learning in multi-agent systems.
Our main theoretical contributions are the first general regret bounds for model-based reinforcement learning for MFC.
We provide a practical parametrization of the core optimization problem.
arXiv Detail & Related papers (2021-07-08T18:01:02Z) - Reinforcement Learning with Combinatorial Actions: An Application to
Vehicle Routing [9.995347522610674]
We develop a framework for value-function-based deep reinforcement learning with a combinatorial action space.
We present an application of this framework to the capacitated vehicle routing problem (CVRP)
On each instance, we model an action as the construction of a single route, and consider a deterministic policy which is improved through a simple policy iteration algorithm.
arXiv Detail & Related papers (2020-10-22T19:32:21Z) - Jump Operator Planning: Goal-Conditioned Policy Ensembles and Zero-Shot
Transfer [71.44215606325005]
We propose a novel framework called Jump-Operator Dynamic Programming for quickly computing solutions within a super-exponential space of sequential sub-goal tasks.
This approach involves controlling an ensemble of reusable goal-conditioned policies functioning as temporally extended actions.
We then identify classes of objective functions on this subspace whose solutions are invariant to the grounding, resulting in optimal zero-shot transfer.
arXiv Detail & Related papers (2020-07-06T05:13:20Z) - Dynamic Multi-Robot Task Allocation under Uncertainty and Temporal
Constraints [52.58352707495122]
We present a multi-robot allocation algorithm that decouples the key computational challenges of sequential decision-making under uncertainty and multi-agent coordination.
We validate our results over a wide range of simulations on two distinct domains: multi-arm conveyor belt pick-and-place and multi-drone delivery dispatch in a city.
arXiv Detail & Related papers (2020-05-27T01:10:41Z) - A Novel Multi-Agent System for Complex Scheduling Problems [2.294014185517203]
This paper presents the conception and implementation of a multi-agent system that is applicable in various problem domains.
We simulate an NP-hard scheduling problem to demonstrate the validity of our approach.
This paper highlights the advantages of the agent-based approach, like the reduction in layout complexity, improved control of complicated systems, and extendability.
arXiv Detail & Related papers (2020-04-20T14:04:58Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences.