Near-Optimal Multi-Agent Learning for Safe Coverage Control
- URL: http://arxiv.org/abs/2210.06380v1
- Date: Wed, 12 Oct 2022 16:33:34 GMT
- Title: Near-Optimal Multi-Agent Learning for Safe Coverage Control
- Authors: Manish Prajapat, Matteo Turchetta, Melanie N. Zeilinger, Andreas
Krause
- Abstract summary: In multi-agent coverage control problems, agents navigate their environment to reach locations that maximize the coverage of some density.
In this paper, we aim to efficiently learn the density to approximately solve the coverage problem while preserving the agents' safety.
We give first of its kind results: near optimal coverage in finite time while provably guaranteeing safety.
- Score: 76.99020416197631
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In multi-agent coverage control problems, agents navigate their environment
to reach locations that maximize the coverage of some density. In practice, the
density is rarely known $\textit{a priori}$, further complicating the original
NP-hard problem. Moreover, in many applications, agents cannot visit arbitrary
locations due to $\textit{a priori}$ unknown safety constraints. In this paper,
we aim to efficiently learn the density to approximately solve the coverage
problem while preserving the agents' safety. We first propose a conditionally
linear submodular coverage function that facilitates theoretical analysis.
Utilizing this structure, we develop MacOpt, a novel algorithm that efficiently
trades off the exploration-exploitation dilemma due to partial observability,
and show that it achieves sublinear regret. Next, we extend results on
single-agent safe exploration to our multi-agent setting and propose SafeMac
for safe coverage and exploration. We analyze SafeMac and give first of its
kind results: near optimal coverage in finite time while provably guaranteeing
safety. We extensively evaluate our algorithms on synthetic and real problems,
including a bio-diversity monitoring task under safety constraints, where
SafeMac outperforms competing methods.
Related papers
- Safe Exploration in Reinforcement Learning: A Generalized Formulation
and Algorithms [8.789204441461678]
We present a solution of the safe exploration (GSE) problem in the form of a meta-algorithm for safe exploration, MASE.
Our proposed algorithm achieves better performance than state-of-the-art algorithms on grid-world and Safety Gym benchmarks.
arXiv Detail & Related papers (2023-10-05T00:47:09Z) - Provably Learning Nash Policies in Constrained Markov Potential Games [90.87573337770293]
Multi-agent reinforcement learning (MARL) addresses sequential decision-making problems with multiple agents.
Constrained Markov Games (CMGs) are a natural formalism for safe MARL problems, though generally intractable.
arXiv Detail & Related papers (2023-06-13T13:08:31Z) - Approximate Shielding of Atari Agents for Safe Exploration [83.55437924143615]
We propose a principled algorithm for safe exploration based on the concept of shielding.
We present preliminary results that show our approximate shielding algorithm effectively reduces the rate of safety violations.
arXiv Detail & Related papers (2023-04-21T16:19:54Z) - Log Barriers for Safe Black-box Optimization with Application to Safe
Reinforcement Learning [72.97229770329214]
We introduce a general approach for seeking high dimensional non-linear optimization problems in which maintaining safety during learning is crucial.
Our approach called LBSGD is based on applying a logarithmic barrier approximation with a carefully chosen step size.
We demonstrate the effectiveness of our approach on minimizing violation in policy tasks in safe reinforcement learning.
arXiv Detail & Related papers (2022-07-21T11:14:47Z) - Safe Policy Optimization with Local Generalized Linear Function
Approximations [17.84511819022308]
Existing safe exploration methods guaranteed safety under the assumption of regularity.
We propose a novel algorithm, SPO-LF, that optimize an agent's policy while learning the relation between a locally available feature obtained by sensors and environmental reward/safety.
We experimentally show that our algorithm is 1) more efficient in terms of sample complexity and computational cost and 2) more applicable to large-scale problems than previous safe RL methods with theoretical guarantees.
arXiv Detail & Related papers (2021-11-09T00:47:50Z) - Learning to Act Safely with Limited Exposure and Almost Sure Certainty [1.0323063834827415]
This paper aims to put forward the concept that learning to take safe actions in unknown environments, even with probability one guarantees, can be achieved without the need for exploratory trials.
We first focus on the canonical multi-armed bandit problem and seek to study the intrinsic trade-offs of learning safety in the presence of uncertainty.
arXiv Detail & Related papers (2021-05-18T18:05:12Z) - Learning to be safe, in finite time [4.189643331553922]
This paper aims to put forward the concept that learning to take safe actions in unknown environments, even with probability one guarantees, can be achieved without the need for an unbounded number of exploratory trials.
We focus on the canonical multi-armed bandit problem and seek to study the exploration-preservation trade-off intrinsic within safe learning.
arXiv Detail & Related papers (2020-10-01T14:03:34Z) - Provably Safe PAC-MDP Exploration Using Analogies [87.41775218021044]
Key challenge in applying reinforcement learning to safety-critical domains is understanding how to balance exploration and safety.
We propose Analogous Safe-state Exploration (ASE), an algorithm for provably safe exploration in MDPs with unknown, dynamics.
Our method exploits analogies between state-action pairs to safely learn a near-optimal policy in a PAC-MDP sense.
arXiv Detail & Related papers (2020-07-07T15:50:50Z) - Provably Efficient Safe Exploration via Primal-Dual Policy Optimization [105.7510838453122]
We study the Safe Reinforcement Learning (SRL) problem using the Constrained Markov Decision Process (CMDP) formulation.
We present an provably efficient online policy optimization algorithm for CMDP with safe exploration in the function approximation setting.
arXiv Detail & Related papers (2020-03-01T17:47:03Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.