Density Constrained Reinforcement Learning
- URL: http://arxiv.org/abs/2106.12764v1
- Date: Thu, 24 Jun 2021 04:22:03 GMT
- Title: Density Constrained Reinforcement Learning
- Authors: Zengyi Qin, Yuxiao Chen, Chuchu Fan
- Abstract summary: We study constrained reinforcement learning from a novel perspective by setting constraints directly on state density functions.
We leverage the duality between density functions and Q functions to develop an effective algorithm to solve the density constrained RL problem optimally.
We prove that the proposed algorithm converges to a near-optimal solution with a bounded error even when the policy update is imperfect.
- Score: 9.23225507471139
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We study constrained reinforcement learning (CRL) from a novel perspective by
setting constraints directly on state density functions, rather than the value
functions considered by previous works. State density has a clear physical and
mathematical interpretation, and is able to express a wide variety of
constraints such as resource limits and safety requirements. Density
constraints can also avoid the time-consuming process of designing and tuning
cost functions required by value function-based constraints to encode system
specifications. We leverage the duality between density functions and Q
functions to develop an effective algorithm to solve the density constrained RL
problem optimally, with the constraints guaranteed to be satisfied. We prove
that the proposed algorithm converges to a near-optimal solution with a bounded
error even when the policy update is imperfect. We use a set of comprehensive
experiments to demonstrate the advantages of our approach over state-of-the-art
CRL methods, with a wide range of density constrained tasks as well as standard
CRL benchmarks such as Safety-Gym.
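The duality-based algorithm itself is given in the paper; as a rough illustration of the primal-dual structure such a method can take, here is a minimal sketch in which Lagrange multipliers on per-state density caps penalize the reward. All names (rho_max, solve_rl, estimate_density) and the placeholder routines are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

n_states = 50
rho_max = np.full(n_states, 0.05)   # assumed per-state density caps
lam = np.zeros(n_states)            # one Lagrange multiplier per state
eta = 0.1                           # dual step size
reward = np.random.rand(n_states)   # toy state-based reward

def solve_rl(penalized_reward):
    """Placeholder for any RL solver maximizing the penalized reward;
    here it just returns the reward table so the sketch runs."""
    return penalized_reward

def estimate_density(policy):
    """Placeholder: roll out the policy and estimate its stationary
    state density; here a fake uniform estimate."""
    return np.full(n_states, 1.0 / n_states)

for _ in range(100):
    # Primal step: solve unconstrained RL with the reward penalized by
    # the current multipliers (this is where the density/Q duality acts).
    policy = solve_rl(reward - lam)
    # Dual step: raise lam wherever the density cap is violated.
    rho = estimate_density(policy)
    lam = np.maximum(0.0, lam + eta * (rho - rho_max))
```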
Related papers
- OTClean: Data Cleaning for Conditional Independence Violations using
Optimal Transport [51.6416022358349]
OTClean is a framework that harnesses optimal transport theory for data repair under Conditional Independence (CI) constraints.
We develop an iterative algorithm inspired by Sinkhorn's matrix scaling algorithm, which efficiently addresses high-dimensional and large-scale data (a toy Sinkhorn iteration is sketched after this list).
arXiv Detail & Related papers (2024-03-04T18:23:55Z) - Resilient Constrained Reinforcement Learning [87.4374430686956]
We study a class of constrained reinforcement learning (RL) problems in which multiple constraint specifications are not identified before training.
It is challenging to identify appropriate constraint specifications due to the undefined trade-off between the reward maximization objective and constraint satisfaction.
We propose a new constrained RL approach that searches for policy and constraint specifications together.
arXiv Detail & Related papers (2023-12-28T18:28:23Z) - Offline Minimax Soft-Q-learning Under Realizability and Partial Coverage [100.8180383245813]
We propose value-based algorithms for offline reinforcement learning (RL).
We show an analogous result for vanilla Q-functions under a soft margin condition.
Our algorithms' loss functions arise from casting the estimation problems as nonlinear convex optimization problems and Lagrangifying.
arXiv Detail & Related papers (2023-02-05T14:22:41Z) - Optimal Conservative Offline RL with General Function Approximation via
Augmented Lagrangian [18.2080757218886]
Offline reinforcement learning (RL) refers to decision-making from a previously collected dataset of interactions.
We present the first set of offline RL algorithms that are statistically optimal and practical under general function approximation and single-policy concentrability.
arXiv Detail & Related papers (2022-11-01T19:28:48Z) - Log Barriers for Safe Black-box Optimization with Application to Safe
Reinforcement Learning [72.97229770329214]
We introduce a general approach for solving high-dimensional non-linear optimization problems in which maintaining safety during learning is crucial.
Our approach, called LBSGD, is based on applying a logarithmic barrier approximation with a carefully chosen step size (see the log-barrier sketch after this list).
We demonstrate the effectiveness of our approach on minimizing constraint violations in safe reinforcement learning policy tasks.
arXiv Detail & Related papers (2022-07-21T11:14:47Z) - Reachability Constrained Reinforcement Learning [6.5158195776494]
This paper proposes a reachability CRL (RCRL) method by using reachability analysis to characterize the largest feasible sets.
We also use multi-time scale approximation theory to prove that the proposed algorithm converges to a local optimum.
Empirical results on different benchmarks such as safe-control-gym and Safety-Gym validate the learned feasible set, the performance in optimal criteria, and constraint satisfaction of RCRL.
arXiv Detail & Related papers (2022-05-16T09:32:45Z) - Constrained Model-Free Reinforcement Learning for Process Optimization [0.0]
Reinforcement learning (RL) is a control approach that can handle nonlinear optimal control problems.
Despite the promise exhibited, RL has yet to see marked translation to industrial practice.
We propose an 'oracle'-assisted constrained Q-learning algorithm that guarantees the satisfaction of joint chance constraints with high probability (an illustrative oracle-masked Q-learning sketch follows this list).
arXiv Detail & Related papers (2020-11-16T13:16:22Z) - Robust Reinforcement Learning with Wasserstein Constraint [49.86490922809473]
We show the existence of optimal robust policies, provide a sensitivity analysis for the perturbations, and then design a novel robust learning algorithm.
The effectiveness of the proposed algorithm is verified in the Cart-Pole environment.
arXiv Detail & Related papers (2020-06-01T13:48:59Z) - Guided Constrained Policy Optimization for Dynamic Quadrupedal Robot
Locomotion [78.46388769788405]
We introduce guided constrained policy optimization (GCPO), an RL framework based upon our implementation of constrained proximal policy optimization (CPPO).
We show that guided constrained RL offers faster convergence close to the desired optimum, resulting in optimal yet physically feasible robotic control behavior without the need for precise reward function tuning.
arXiv Detail & Related papers (2020-02-22T10:15:53Z)
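For the OTClean entry above, the summary points to Sinkhorn's matrix scaling algorithm. Below is a minimal textbook Sinkhorn iteration for entropic optimal transport, for reference; this toy version is not OTClean's code, and the marginals r, c and regularization reg are made-up inputs.

```python
import numpy as np

def sinkhorn(cost, r, c, reg=0.1, n_iters=200):
    """Scale K = exp(-cost/reg) so the plan's rows sum to r, columns to c."""
    K = np.exp(-cost / reg)
    u = np.ones_like(r)
    for _ in range(n_iters):
        v = c / (K.T @ u)    # fit column marginals
        u = r / (K @ v)      # fit row marginals
    return u[:, None] * K * v[None, :]   # the transport plan

cost = np.random.rand(4, 4)
r = np.full(4, 0.25)         # uniform source marginal
c = np.full(4, 0.25)         # uniform target marginal
plan = sinkhorn(cost, r, c)
print(plan.sum(axis=1), plan.sum(axis=0))  # both approximately 0.25 each
```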
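For the Log Barriers (LBSGD) entry, a hedged sketch of the log-barrier idea: the surrogate f(x) - mu*log(-g(x)) diverges at the constraint boundary g(x) = 0, so descending it keeps iterates feasible. The backtracking step-size cap below is a crude stand-in for the paper's carefully chosen step size, and f, g, mu are toy choices.

```python
import numpy as np

mu = 0.1  # barrier weight

def f(x):   # toy objective: pull x toward (2, 2)
    return np.sum((x - 2.0) ** 2)

def g(x):   # toy constraint g(x) <= 0: stay inside the unit ball
    return np.sum(x ** 2) - 1.0

def barrier_grad(x):
    # Gradient of the barrier surrogate f(x) - mu * log(-g(x)).
    return 2.0 * (x - 2.0) - mu * 2.0 * x / g(x)

x = np.zeros(2)              # start strictly feasible: g(0) = -1 < 0
for _ in range(200):
    d = -barrier_grad(x)
    step = 0.1
    while g(x + step * d) >= 0.0:   # shrink until strictly feasible
        step *= 0.5
    x = x + step * d
print(x, g(x))               # x settles just inside the boundary
```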
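For the constrained model-free RL entry, one illustrative reading of 'oracle'-assisted constrained Q-learning: a safety oracle masks actions it deems inadmissible, and the epsilon-greedy update runs over the remaining actions. The oracle, environment, and masking rule here are hypothetical placeholders; the paper's joint-chance-constraint guarantee comes from its own construction, not this toy.

```python
import numpy as np

n_states, n_actions = 10, 4
Q = np.zeros((n_states, n_actions))
alpha, gamma, eps = 0.1, 0.95, 0.1
rng = np.random.default_rng(0)

def oracle_safe(s):
    """Hypothetical oracle: boolean mask of actions judged safe in s."""
    mask = np.ones(n_actions, dtype=bool)
    mask[s % n_actions] = False      # toy rule standing in for the oracle
    return mask

def step(s, a):
    """Toy deterministic environment standing in for the real process."""
    return (s + a) % n_states, float(-abs(s - 5))

s = 0
for _ in range(1000):
    safe = np.flatnonzero(oracle_safe(s))
    # Epsilon-greedy restricted to oracle-approved actions.
    a = rng.choice(safe) if rng.random() < eps else safe[np.argmax(Q[s, safe])]
    s2, r = step(s, a)
    safe2 = np.flatnonzero(oracle_safe(s2))
    # Bellman backup also maximizes only over actions safe in s2.
    Q[s, a] += alpha * (r + gamma * Q[s2, safe2].max() - Q[s, a])
    s = s2
```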