Controlgym: Large-Scale Control Environments for Benchmarking Reinforcement Learning Algorithms
- URL: http://arxiv.org/abs/2311.18736v2
- Date: Tue, 23 Apr 2024 18:15:52 GMT
- Title: Controlgym: Large-Scale Control Environments for Benchmarking Reinforcement Learning Algorithms
- Authors: Xiangyuan Zhang, Weichao Mao, Saviz Mowlavi, Mouhacine Benosman, Tamer Başar
- Abstract summary: We introduce controlgym, a library of 36 industrial control settings and 10 infinite-dimensional partial differential equation (PDE)-based control problems.
controlgym is integrated within the OpenAI Gym/Gymnasium framework.
- Score: 5.7648266677851865
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We introduce controlgym, a library of thirty-six industrial control settings, and ten infinite-dimensional partial differential equation (PDE)-based control problems. Integrated within the OpenAI Gym/Gymnasium (Gym) framework, controlgym allows direct applications of standard reinforcement learning (RL) algorithms like stable-baselines3. Our control environments complement those in Gym with continuous, unbounded action and observation spaces, motivated by real-world control applications. Moreover, the PDE control environments uniquely allow the users to extend the state dimensionality of the system to infinity while preserving the intrinsic dynamics. This feature is crucial for evaluating the scalability of RL algorithms for control. This project serves the learning for dynamics & control (L4DC) community, aiming to explore key questions: the convergence of RL algorithms in learning control policies; the stability and robustness issues of learning-based controllers; and the scalability of RL algorithms to high- and potentially infinite-dimensional systems. We open-source the controlgym project at https://github.com/xiangyuan-zhang/controlgym.
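The drop-in compatibility the abstract describes amounts to the standard Gymnasium training loop. Below is a minimal sketch of that workflow; the factory call `controlgym.make` and the environment id "convection_diffusion_reaction" are assumptions based on the paper's description, so consult the repository for the exact identifiers.

```python
# Sketch only: `controlgym.make` and the env id are assumed, not confirmed API.
import controlgym
from stable_baselines3 import PPO

env = controlgym.make("convection_diffusion_reaction")

# Because controlgym follows the Gym/Gymnasium API, a standard
# stable-baselines3 algorithm can be applied directly.
model = PPO("MlpPolicy", env, verbose=1)
model.learn(total_timesteps=50_000)

# Evaluate the learned policy with the usual Gymnasium rollout loop.
obs, info = env.reset(seed=0)
for _ in range(200):
    action, _ = model.predict(obs, deterministic=True)
    obs, reward, terminated, truncated, info = env.step(action)
    if terminated or truncated:
        break
env.close()
```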
Related papers
- ODRL: A Benchmark for Off-Dynamics Reinforcement Learning [59.72217833812439]
We introduce ODRL, the first benchmark tailored for evaluating off-dynamics RL methods.
ODRL contains four experimental settings where the source and target domains can be either online or offline.
We conduct extensive benchmarking experiments, which show that no method has universal advantages across varied dynamics shifts.
arXiv Detail & Related papers (2024-10-28T05:29:38Z)
- Growing Q-Networks: Solving Continuous Control Tasks with Adaptive Control Resolution [51.83951489847344]
In robotics applications, smooth control signals are commonly preferred to reduce system wear and improve energy efficiency.
In this work, we aim to bridge this performance gap by growing discrete action spaces from coarse to fine control resolution.
Our work indicates that an adaptive control resolution, combined with value decomposition, produces simple critic-only algorithms with surprisingly strong performance on continuous control tasks.
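A toy sketch of the coarse-to-fine idea follows (illustrative only; the paper additionally relies on value decomposition). Refining each per-dimension action grid so that every resolution contains the previous one lets values learned at coarse levels carry over.

```python
import numpy as np

def discretize(low, high, n_bins):
    """Uniform per-dimension action grid with n_bins levels (decoupled
    across dimensions, as in critic-only discrete-action methods)."""
    return [np.linspace(l, h, n_bins) for l, h in zip(low, high)]

# Illustrative schedule: start with bang-bang control (2 levels) and
# refine so each grid contains the previous one (2, 3, 5, 9, ...).
low, high = [-1.0, -1.0], [1.0, 1.0]
for n_bins in (2, 3, 5, 9):
    grid = discretize(low, high, n_bins)
    print(f"{n_bins} levels per dimension:", [g.tolist() for g in grid])
```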
arXiv Detail & Related papers (2024-04-05T17:58:37Z)
- A Safe Reinforcement Learning Algorithm for Supervisory Control of Power Plants [7.1771300511732585]
Model-free reinforcement learning (RL) has emerged as a promising solution for control tasks.
We propose a chance-constrained RL algorithm based on Proximal Policy Optimization for supervisory control.
Our approach achieves the smallest distance of violation and violation rate in a load-follow maneuver for an advanced Nuclear Power Plant design.
arXiv Detail & Related papers (2024-01-23T17:52:49Z)
- CT-DQN: Control-Tutored Deep Reinforcement Learning [4.395396671038298]
Control-Tutored Deep Q-Networks (CT-DQN) is a deep reinforcement learning algorithm that leverages a control tutor to reduce learning time.
We validate our approach on three scenarios from OpenAI Gym: the inverted pendulum, lunar lander, and car racing.
arXiv Detail & Related papers (2022-12-02T17:59:43Z)
- Deep Reinforcement Learning with Shallow Controllers: An Experimental Application to PID Tuning [3.9146761527401424]
We demonstrate the challenges in implementing a state-of-the-art deep RL algorithm on a real physical system.
At the core of our approach is the use of a PID controller as the trainable RL policy.
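A rough sketch of the shallow-controller idea (the class and parameter names below are hypothetical, not the paper's code): the policy is a PID law, so the RL algorithm's trainable parameters are just the three gains.

```python
class PIDPolicy:
    """PID control law whose gains (kp, ki, kd) serve as the trainable
    RL policy parameters instead of deep network weights."""

    def __init__(self, kp: float, ki: float, kd: float, dt: float):
        self.kp, self.ki, self.kd, self.dt = kp, ki, kd, dt
        self.integral = 0.0
        self.prev_error = 0.0

    def act(self, setpoint: float, measurement: float) -> float:
        error = setpoint - measurement
        self.integral += error * self.dt
        derivative = (error - self.prev_error) / self.dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative

# Example: one control step toward a setpoint of 1.0.
policy = PIDPolicy(kp=2.0, ki=0.5, kd=0.1, dt=0.05)
u = policy.act(setpoint=1.0, measurement=0.2)
```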
arXiv Detail & Related papers (2021-11-13T18:48:28Z)
- Sparsity in Partially Controllable Linear Systems [56.142264865866636]
We study partially controllable linear dynamical systems specified by an underlying sparsity pattern.
Our results characterize those state variables which are irrelevant for optimal control.
arXiv Detail & Related papers (2021-10-12T16:41:47Z)
- safe-control-gym: a Unified Benchmark Suite for Safe Learning-based Control and Reinforcement Learning [3.9258421820410225]
We propose a new open-source benchmark suite, called safe-control-gym.
Our starting point is OpenAI's Gym API, the de facto standard in reinforcement learning research.
We show how to use safe-control-gym to quantitatively compare the control performance, data efficiency, and safety of multiple approaches.
arXiv Detail & Related papers (2021-09-13T21:09:28Z)
- Enforcing robust control guarantees within neural network policies [76.00287474159973]
We propose a generic nonlinear control policy class, parameterized by neural networks, that enforces the same provable robustness criteria as robust control.
We demonstrate the power of this approach on several domains, improving in average-case performance over existing robust control methods and in worst-case stability over (non-robust) deep RL methods.
arXiv Detail & Related papers (2020-11-16T17:14:59Z)
- Reinforcement Learning of Structured Control for Linear Systems with Unknown State Matrix [0.0]
We combine ideas from reinforcement learning (RL) with sufficient stability and performance guarantees.
A special control structure enabled by this RL framework is distributed learning control which is necessary for many large-scale cyber-physical systems.
arXiv Detail & Related papers (2020-11-02T17:04:34Z)
- Learning a Contact-Adaptive Controller for Robust, Efficient Legged Locomotion [95.1825179206694]
We present a framework that synthesizes robust controllers for a quadruped robot.
A high-level controller learns to choose from a set of primitives in response to changes in the environment.
A low-level controller uses an established control method to robustly execute the primitives.
arXiv Detail & Related papers (2020-09-21T16:49:26Z)
- Adaptive Control and Regret Minimization in Linear Quadratic Gaussian (LQG) Setting [91.43582419264763]
We propose LqgOpt, a novel reinforcement learning algorithm based on the principle of optimism in the face of uncertainty.
LqgOpt efficiently explores the system dynamics, estimates the model parameters up to their confidence interval, and deploys the controller of the most optimistic model.
arXiv Detail & Related papers (2020-03-12T19:56:38Z)