Constrained episodic reinforcement learning in concave-convex and
knapsack settings
- URL: http://arxiv.org/abs/2006.05051v2
- Date: Sun, 6 Jun 2021 03:30:29 GMT
- Title: Constrained episodic reinforcement learning in concave-convex and
knapsack settings
- Authors: Kianté Brantley, Miroslav Dudik, Thodoris Lykouris, Sobhan
Miryoosefi, Max Simchowitz, Aleksandrs Slivkins, Wen Sun
- Abstract summary: We provide a modular analysis with strong theoretical guarantees for settings with concave rewards and convex constraints.
Our experiments demonstrate that the proposed algorithm significantly outperforms prior approaches in existing constrained episodic environments.
- Score: 81.08055425644037
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We propose an algorithm for tabular episodic reinforcement learning with
constraints. We provide a modular analysis with strong theoretical guarantees
for settings with concave rewards and convex constraints, and for settings with
hard constraints (knapsacks). Most of the previous work in constrained
reinforcement learning is limited to linear constraints, and the remaining work
focuses on either the feasibility question or settings with a single episode.
Our experiments demonstrate that the proposed algorithm significantly
outperforms these approaches in existing constrained episodic environments.
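
As an illustration of the hard-constraint (knapsack) setting described above, the following sketch runs plain tabular Q-learning while tracking a cumulative resource budget and stopping once it is exhausted. It is not the paper's algorithm; the environment interface (a `step` that also returns a per-step resource cost), the budget handling, and all parameter names are assumptions made for this example.

```python
# Minimal sketch of knapsack-constrained tabular episodic RL (illustrative only,
# not the algorithm from this paper). Each transition consumes some resource;
# learning must stop once the total budget is spent.
import numpy as np

def knapsack_constrained_q_learning(env, n_states, n_actions, budget,
                                    horizon, episodes=500, alpha=0.1,
                                    gamma=1.0, epsilon=0.1, seed=0):
    # `env` is a hypothetical tabular environment: env.reset() -> state,
    # env.step(a) -> (next_state, reward, resource_cost, done).
    rng = np.random.default_rng(seed)
    Q = np.zeros((n_states, n_actions))
    consumed = 0.0                          # total resource used so far
    for _ in range(episodes):
        if consumed >= budget:              # hard (knapsack) constraint
            break
        s = env.reset()
        for _ in range(horizon):
            if rng.random() < epsilon:      # epsilon-greedy exploration
                a = int(rng.integers(n_actions))
            else:
                a = int(np.argmax(Q[s]))
            s_next, r, cost, done = env.step(a)
            consumed += cost
            target = r + gamma * np.max(Q[s_next]) * (not done)
            Q[s, a] += alpha * (target - Q[s, a])
            s = s_next
            if done or consumed >= budget:
                break
    return Q, consumed
```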
Related papers
- Learning to Explore with Lagrangians for Bandits under Unknown Linear Constraints [8.784438985280094]
We study pure exploration problems in multi-armed bandits with unknown linear constraints.
First, we propose a Lagrangian relaxation of the sample complexity lower bound for pure exploration under constraints.
Second, we leverage the Lagrangian lower bound and properties of convexity to propose two computationally efficient extensions of Track-and-Stop and Gamified Explorer, namely LATS and LAGEX.
arXiv Detail & Related papers (2024-10-24T15:26:14Z) - Resilient Constrained Reinforcement Learning [87.4374430686956]
We study a class of constrained reinforcement learning (RL) problems in which the constraint specifications are not identified before training.
It is challenging to identify appropriate constraint specifications because the trade-off between the reward objective and constraint satisfaction is not known in advance.
We propose a new constrained RL approach that searches for policy and constraint specifications together.
arXiv Detail & Related papers (2023-12-28T18:28:23Z) - The Impact of the Geometric Properties of the Constraint Set in Safe
Optimization with Bandit Feedback [5.758073912084366]
We consider a safe optimization problem with bandit feedback in which an agent sequentially chooses actions and observes responses from the environment.
We propose an algorithm for this problem, and study how the geometric properties of the constraint set impact the regret of the algorithm.
arXiv Detail & Related papers (2023-05-01T15:48:34Z) - Reinforcement Learning with Stepwise Fairness Constraints [50.538878453547966]
We introduce the study of reinforcement learning with stepwise fairness constraints.
We provide learning algorithms with strong theoretical guarantees with respect to policy optimality and fairness violation.
arXiv Detail & Related papers (2022-11-08T04:06:23Z) - Adversarial Robustness with Semi-Infinite Constrained Learning [177.42714838799924]
The sensitivity of deep learning to input perturbations has raised serious questions about its use in safety-critical domains.
We propose a hybrid Langevin Monte Carlo training approach to mitigate this issue.
We show that our approach can mitigate the trade-off between state-of-the-art performance and robustness.
arXiv Detail & Related papers (2021-10-29T13:30:42Z) - Concave Utility Reinforcement Learning with Zero-Constraint Violations [43.29210413964558]
We consider the problem of concave utility reinforcement learning (CURL) with convex constraints.
We propose a model-based learning algorithm that also achieves zero constraint violations.
arXiv Detail & Related papers (2021-09-12T06:13:33Z) - Conditional gradient methods for stochastically constrained convex
minimization [54.53786593679331]
We propose two novel conditional gradient-based methods for solving structured convex optimization problems.
The most important feature of our framework is that only a subset of the constraints is processed at each iteration.
Our algorithms rely on variance reduction and smoothing used in conjunction with conditional gradient steps, and are accompanied by rigorous convergence guarantees (a rough sketch of the constraint-subsampling idea appears after this list).
arXiv Detail & Related papers (2020-07-07T21:26:35Z) - Structure Adaptive Algorithms for Stochastic Bandits [22.871155520200773]
We study reward maximisation in a class of structured multi-armed bandit problems.
The mean rewards of arms satisfy some given structural constraints.
We develop algorithms from instance-dependent lower-bounds using iterative saddle-point solvers.
arXiv Detail & Related papers (2020-07-02T08:59:54Z) - An Integer Linear Programming Framework for Mining Constraints from Data [81.60135973848125]
We present a general framework for mining constraints from data.
In particular, we consider the inference in structured output prediction as an integer linear programming (ILP) problem.
We show that our approach can learn to solve 9x9 Sudoku puzzles and minimal spanning tree problems from examples without providing the underlying rules (see the ILP inference sketch after this list).
arXiv Detail & Related papers (2020-06-18T20:09:53Z)
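
For the related paper "Conditional gradient methods for stochastically constrained convex minimization" above, here is a rough sketch of the general idea of processing only a random subset of constraints per iteration. It is not that paper's method (which additionally relies on smoothing and variance reduction); the quadratic-penalty formulation, the simplex domain, and all parameter names are assumptions made for this example.

```python
# Frank-Wolfe over the probability simplex for min 0.5*||x - y||^2, with the
# inequality constraints A x <= b enforced through a quadratic penalty whose
# gradient is estimated from a random subset of constraints each iteration.
# Illustrative sketch only.
import numpy as np

def subsampled_frank_wolfe(y, A, b, iters=2000, batch=8, rho=50.0, seed=0):
    rng = np.random.default_rng(seed)
    m, d = A.shape
    x = np.full(d, 1.0 / d)                        # start at the simplex center
    for t in range(iters):
        S = rng.choice(m, size=min(batch, m), replace=False)
        viol = np.maximum(A[S] @ x - b[S], 0.0)    # violations on sampled constraints
        # unbiased estimate of the gradient of f(x) + (rho/2) * sum_i max(0, a_i.x - b_i)^2
        grad = (x - y) + rho * (m / len(S)) * (A[S].T @ viol)
        v = np.zeros(d)                            # linear minimization oracle on the simplex
        v[int(np.argmin(grad))] = 1.0
        step = 2.0 / (t + 2)                       # standard Frank-Wolfe step size
        x = (1 - step) * x + step * v
    return x
```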
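
For "An Integer Linear Programming Framework for Mining Constraints from Data" above, the sketch below only illustrates the inference-as-ILP half of the idea (not the constraint-mining procedure): structured prediction is cast as a small assignment ILP and solved with SciPy's HiGHS backend. The toy scores and constraints are assumptions made for this example, and SciPy >= 1.9 is assumed for the `integrality` option.

```python
# Toy structured-output inference written as an integer linear program:
# each of n slots gets exactly one of n labels, each label is used once,
# and the total score is maximized. Illustrative sketch only.
import numpy as np
from scipy.optimize import linprog

def ilp_assignment(scores):
    """Pick a one-to-one assignment maximizing sum of scores[i, j]."""
    n = scores.shape[0]
    c = -scores.ravel()                      # linprog minimizes, so negate scores
    A_eq, b_eq = [], []
    for i in range(n):                       # each slot i gets exactly one label
        row = np.zeros(n * n); row[i * n:(i + 1) * n] = 1.0
        A_eq.append(row); b_eq.append(1.0)
    for j in range(n):                       # each label j is used exactly once
        row = np.zeros(n * n); row[j::n] = 1.0
        A_eq.append(row); b_eq.append(1.0)
    res = linprog(c, A_eq=np.array(A_eq), b_eq=np.array(b_eq),
                  bounds=(0, 1), integrality=np.ones(n * n), method="highs")
    return res.x.reshape(n, n).round().astype(int)

# Toy usage: recover the best label for each of three slots.
scores = np.array([[3.0, 1.0, 0.0],
                   [0.0, 2.0, 1.0],
                   [1.0, 0.0, 4.0]])
print(ilp_assignment(scores))                # expected: identity assignment
```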
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.