ReACT: Reinforcement Learning for Controller Parametrization using
B-Spline Geometries
- URL: http://arxiv.org/abs/2401.05251v1
- Date: Wed, 10 Jan 2024 16:27:30 GMT
- Title: ReACT: Reinforcement Learning for Controller Parametrization using
B-Spline Geometries
- Authors: Thomas Rudolf, Daniel Flögel, Tobias Schürmann, Simon Süß,
Stefan Schwab, Sören Hohmann
- Abstract summary: This work presents a novel approach using deep reinforcement learning (DRL) with N-dimensional B-spline geometries (BSGs).
We focus on the control of parameter-variant systems, a class of systems with complex behavior which depends on the operating conditions.
We make the adaptation process more efficient by introducing BSGs to map the controller parameters which may depend on numerous operating conditions.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Robust and performant controllers are essential for industrial applications.
However, deriving controller parameters for complex and nonlinear systems is
challenging and time-consuming. To facilitate automatic controller
parametrization, this work presents a novel approach using deep reinforcement
learning (DRL) with N-dimensional B-spline geometries (BSGs). We focus on the
control of parameter-variant systems, a class of systems with complex behavior
which depends on the operating conditions. For this system class,
gain-scheduling control structures are widely used in applications across
industries due to well-known design principles. To facilitate the expensive
controller parametrization task for these control structures, we deploy a
DRL agent. Based on control system observations, the agent autonomously
decides how to adapt the controller parameters. We make the adaptation process
more efficient by introducing BSGs to map the controller parameters which may
depend on numerous operating conditions. To preprocess time-series data and
extract a fixed-length feature vector, we use a long short-term memory (LSTM)
neural network. Furthermore, this work contributes actor regularizations that
are relevant to real-world environments which differ from training.
Accordingly, we apply dropout layer normalization to the actor and critic
networks of the truncated quantile critics (TQC) algorithm. To show our
approach's working principle and effectiveness, we train and evaluate the DRL
agent on the parametrization task of an industrial control structure with
parameter lookup tables.
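
As a rough illustration of the B-spline mapping described above, the following Python sketch evaluates a one-dimensional B-spline whose control points stand in for the controller parameters a DRL agent would adapt. The knot layout, spline degree, and gain values are illustrative assumptions, not taken from the paper; the paper's BSGs generalize this idea to N operating-condition dimensions.

import numpy as np
from scipy.interpolate import BSpline

# Illustrative setup: a cubic B-spline over one normalized operating
# condition in [0, 1]. Degree, knots, and values are assumptions.
degree = 3
n_ctrl = 6
knots = np.concatenate(([0.0] * degree,
                        np.linspace(0.0, 1.0, n_ctrl - degree + 1),
                        [1.0] * degree))

# Control points play the role of the adaptable controller parameters;
# a DRL agent's action would adjust these values (hypothetical numbers).
ctrl_points = np.array([1.0, 1.2, 0.8, 0.9, 1.5, 1.1])

gain_spline = BSpline(knots, ctrl_points, degree)

# The scheduled gain at any operating condition is one spline evaluation.
operating_condition = 0.37
kp = float(gain_spline(operating_condition))
print(f"scheduled gain at {operating_condition}: {kp:.3f}")

Adapting a handful of control points instead of every cell of a dense lookup table is what would make the adaptation process more efficient, matching the motivation given in the abstract.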
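
The LSTM preprocessing and the dropout/layer-normalization regularization mentioned in the abstract can be sketched in the same spirit. All layer sizes, the dropout rate, and the network layout below are assumptions for illustration, not the authors' exact architecture.

import torch
import torch.nn as nn

class LSTMFeatureExtractor(nn.Module):
    def __init__(self, obs_dim: int, feature_dim: int = 64):
        super().__init__()
        self.lstm = nn.LSTM(obs_dim, feature_dim, batch_first=True)

    def forward(self, obs_seq: torch.Tensor) -> torch.Tensor:
        # obs_seq: (batch, time, obs_dim); the final hidden state serves
        # as a fixed-length feature vector for any sequence length.
        _, (h_n, _) = self.lstm(obs_seq)
        return h_n[-1]

class RegularizedActor(nn.Module):
    def __init__(self, feature_dim: int, action_dim: int, dropout: float = 0.1):
        super().__init__()
        # Dropout followed by layer normalization on the hidden layer,
        # mirroring the kind of regularization the abstract describes.
        self.net = nn.Sequential(
            nn.Linear(feature_dim, 256),
            nn.Dropout(dropout),
            nn.LayerNorm(256),
            nn.ReLU(),
            nn.Linear(256, action_dim),
            nn.Tanh(),  # bounded adjustments of controller parameters
        )

    def forward(self, features: torch.Tensor) -> torch.Tensor:
        return self.net(features)

extractor = LSTMFeatureExtractor(obs_dim=8)
actor = RegularizedActor(feature_dim=64, action_dim=6)
obs_seq = torch.randn(4, 50, 8)        # 4 episodes of 50 time steps each
action = actor(extractor(obs_seq))     # (4, 6) parameter adjustments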
Related papers
- Communication-Control Codesign for Large-Scale Wireless Networked Control Systems [80.30532872347668]
Wireless Networked Control Systems (WNCSs) are essential to Industry 4.0, enabling flexible control in applications such as drone swarms and autonomous robots.
We propose a practical WNCS model that captures correlated dynamics among multiple control loops with spatially distributed sensors and actuators sharing limited wireless resources over multi-state Markov block-fading channels.
We develop a Deep Reinforcement Learning (DRL) algorithm that efficiently handles the hybrid action space, captures communication-control correlations, and ensures robust training despite sparse cross-domain variables and floating control inputs.
arXiv Detail & Related papers (2024-10-15T06:28:21Z)
- Parameter-Adaptive Approximate MPC: Tuning Neural-Network Controllers without Retraining [50.00291020618743]
This work introduces a novel, parameter-adaptive AMPC architecture capable of online tuning without recomputing large datasets and retraining.
We showcase the effectiveness of parameter-adaptive AMPC by controlling the swing-ups of two different real cartpole systems with a severely resource-constrained microcontroller (MCU).
Taken together, these contributions represent a marked step toward the practical application of AMPC in real-world systems.
arXiv Detail & Related papers (2024-04-08T20:02:19Z)
- Decision Transformer as a Foundation Model for Partially Observable Continuous Control [5.453548045211778]
The Decision Transformer (DT) architecture is used to predict optimal actions based on past observations, actions, and rewards.
DT exhibits remarkable zero-shot generalization abilities for completely new tasks.
These findings highlight the potential of DT as a foundational controller for general control applications.
arXiv Detail & Related papers (2024-04-03T02:17:34Z)
- Designing a Robust Low-Level Agnostic Controller for a Quadrotor with Actor-Critic Reinforcement Learning [0.38073142980732994]
We introduce domain randomization during the training phase of a low-level waypoint guidance controller based on Soft Actor-Critic.
We show that, by introducing a certain degree of uncertainty in quadrotor dynamics during training, we can obtain a controller that is capable of performing the proposed task over a larger variation of quadrotor parameters.
arXiv Detail & Related papers (2022-10-06T14:58:19Z)
- Performance-Driven Controller Tuning via Derivative-Free Reinforcement Learning [6.5158195776494]
We tackle the controller tuning problem using a novel derivative-free reinforcement learning framework.
We conduct numerical experiments on two concrete examples from autonomous driving, namely, adaptive cruise control with PID controller and trajectory tracking with MPC controller.
Experimental results show that the proposed method outperforms popular baselines and highlight its strong potential for controller tuning.
arXiv Detail & Related papers (2022-09-11T13:01:14Z)
- On Controller Tuning with Time-Varying Bayesian Optimization [74.57758188038375]
We use time-varying Bayesian optimization (TVBO) to tune controllers online in changing environments, using appropriate prior knowledge on the control objective and its changes.
We propose a novel TVBO strategy using Uncertainty-Injection (UI), which incorporates the assumption of incremental and lasting changes.
Our model outperforms the state-of-the-art method in TVBO, exhibiting reduced regret and fewer unstable parameter configurations.
arXiv Detail & Related papers (2022-07-22T14:54:13Z)
- Steady-State Error Compensation in Reference Tracking and Disturbance Rejection Problems for Reinforcement Learning-Based Control [0.9023847175654602]
Reinforcement learning (RL) is a promising, emerging topic in automatic control applications.
Initiative action state augmentation (IASA) for actor-critic-based RL controllers is introduced.
This augmentation does not require any expert knowledge, leaving the approach model-free.
arXiv Detail & Related papers (2022-01-31T16:29:19Z)
- Policy Search for Model Predictive Control with Application to Agile Drone Flight [56.24908013905407]
We propose a policy-search framework for model predictive control (MPC).
Specifically, we formulate the MPC as a parameterized controller, where the hard-to-optimize decision variables are represented as high-level policies.
Experiments show that our controller achieves robust and real-time control performance in both simulation and the real world.
arXiv Detail & Related papers (2021-12-07T17:39:24Z)
- Learning a Contact-Adaptive Controller for Robust, Efficient Legged Locomotion [95.1825179206694]
We present a framework that synthesizes robust controllers for a quadruped robot.
A high-level controller learns to choose from a set of primitives in response to changes in the environment.
A low-level controller utilizes an established control method to robustly execute the primitives.
arXiv Detail & Related papers (2020-09-21T16:49:26Z)
- Optimal PID and Antiwindup Control Design as a Reinforcement Learning Problem [3.131740922192114]
We focus on the interpretability of DRL control methods.
In particular, we view linear fixed-structure controllers as shallow neural networks embedded in the actor-critic framework (a minimal sketch of this viewpoint follows after this list).
arXiv Detail & Related papers (2020-05-10T01:05:26Z)
- Certified Reinforcement Learning with Logic Guidance [78.2286146954051]
We propose a model-free RL algorithm that enables the use of Linear Temporal Logic (LTL) to formulate a goal for unknown continuous-state/action Markov Decision Processes (MDPs).
The algorithm is guaranteed to synthesise a control policy whose traces satisfy the specification with maximal probability.
arXiv Detail & Related papers (2019-02-02T20:09:32Z)
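
As referenced in the Optimal PID entry above, a linear fixed-structure controller can be viewed as a shallow neural network. A minimal sketch of that viewpoint: a discrete-time PID law written as a single bias-free linear layer over the error, its running integral, and its finite difference, so the gains become trainable policy weights. The gains and sample time below are illustrative assumptions.

import torch
import torch.nn as nn

class PIDPolicy(nn.Module):
    def __init__(self, dt: float = 0.01):
        super().__init__()
        self.dt = dt
        # One bias-free linear layer whose weights are exactly [Kp, Ki, Kd].
        self.gains = nn.Linear(3, 1, bias=False)
        with torch.no_grad():
            self.gains.weight.copy_(torch.tensor([[1.0, 0.1, 0.05]]))
        self.integral = 0.0
        self.prev_error = 0.0

    def forward(self, error: float) -> torch.Tensor:
        # Build the PID feature vector [e, integral of e, de/dt].
        self.integral += error * self.dt
        derivative = (error - self.prev_error) / self.dt
        self.prev_error = error
        features = torch.tensor([error, self.integral, derivative])
        return self.gains(features)

pid = PIDPolicy()
u = pid(0.5)  # control action for a tracking error of 0.5

Because the controller is an ordinary module, an actor-critic method can update Kp, Ki, and Kd like any other policy parameters, which is the interpretability argument that entry makes.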