Safe Reinforcement Learning for Real-World Engine Control
- URL: http://arxiv.org/abs/2501.16613v1
- Date: Tue, 28 Jan 2025 01:19:05 GMT
- Title: Safe Reinforcement Learning for Real-World Engine Control
- Authors: Julian Bedei, Lucas Koch, Kevin Badalian, Alexander Winkler, Patrick Schaber, Jakob Andert
- Abstract summary: This work introduces a toolchain for applying Reinforcement Learning (RL) to safety-critical real-world environments.
RL provides a viable solution; however, safety concerns such as excessive pressure rise rates must be addressed.
Real-time safety monitoring based on the k-nearest neighbor algorithm is implemented, enabling safe interaction with the testbench.
- Score: 39.9074966439168
- Abstract: This work introduces a toolchain for applying Reinforcement Learning (RL), specifically the Deep Deterministic Policy Gradient (DDPG) algorithm, in safety-critical real-world environments. As an exemplary application, transient load control is demonstrated on a single-cylinder internal combustion engine testbench in Homogeneous Charge Compression Ignition (HCCI) mode, which offers high thermal efficiency and low emissions. However, HCCI poses challenges for traditional control methods due to its nonlinear, autoregressive, and stochastic nature. RL provides a viable solution; however, safety concerns such as excessive pressure rise rates must be addressed when applying it to HCCI. A single unsuitable control input can severely damage the engine or cause misfiring and shutdown. Additionally, operating limits are not known a priori and must be determined experimentally. To mitigate these risks, real-time safety monitoring based on the k-nearest neighbor algorithm is implemented, enabling safe interaction with the testbench. The feasibility of this approach is demonstrated as the RL agent learns a control policy through interaction with the testbench. A root mean square error of 0.1374 bar is achieved for the indicated mean effective pressure, comparable to neural-network-based controllers from the literature. The toolchain's flexibility is further demonstrated by adapting the agent's policy to increase ethanol energy shares, promoting renewable fuel use while maintaining safety. This approach addresses the longstanding challenge of applying RL to safety-critical real-world environments, and the developed toolchain, with its adaptability and safety mechanisms, paves the way for future use of RL on engine testbenches and in other safety-critical settings.
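The paper's implementation is not reproduced in this listing. As a rough illustration of the kind of real-time safety monitoring the abstract describes, the sketch below gates a proposed action by its k-nearest-neighbor distance to operating points already verified as safe; the class name, the concatenated state-action feature layout, and the distance threshold are illustrative assumptions, not the authors' code.

```python
# Minimal sketch of a k-NN safety monitor for an RL control loop
# (illustrative assumptions throughout, not the paper's implementation).
import numpy as np
from sklearn.neighbors import NearestNeighbors

class KNNSafetyMonitor:
    def __init__(self, k=5, distance_threshold=0.1):
        self.k = k
        self.distance_threshold = distance_threshold
        self.safe_points = []  # (state, action) pairs that stayed within limits
        self.nn = None

    def add_safe_point(self, state, action):
        """Record an operating point that respected the limits
        (e.g., an acceptable pressure rise rate)."""
        self.safe_points.append(np.concatenate([state, action]))
        self.nn = NearestNeighbors(
            n_neighbors=min(self.k, len(self.safe_points)))
        self.nn.fit(np.asarray(self.safe_points))

    def is_safe(self, state, action):
        """Accept the proposed action only if its mean distance to the
        k nearest known-safe points stays below the threshold."""
        if self.nn is None:
            return False  # no reference data yet: fail safe and reject
        query = np.concatenate([state, action]).reshape(1, -1)
        distances, _ = self.nn.kneighbors(query)
        return float(distances.mean()) < self.distance_threshold
```

In a training loop, a rejected action would be replaced by a known-safe fallback before being sent to the hardware; refitting the index on every insertion is kept here only for brevity.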
Related papers
- Safety through Permissibility: Shield Construction for Fast and Safe Reinforcement Learning [57.84059344739159]
"Shielding" is a popular technique to enforce safety inReinforcement Learning (RL)
We propose a new permissibility-based framework to deal with safety and shield construction.
arXiv Detail & Related papers (2024-05-29T18:00:21Z) - Reinforcement Learning with Adaptive Regularization for Safe Control of Critical Systems [2.126171264016785]
We propose RL with Adaptive Regularization (RL-AR), an algorithm that enables safe RL exploration.
RL-AR performs policy combination via a "focus module," which determines the appropriate combination depending on the state (a sketch of such a combination follows this entry).
In a series of critical control applications, we demonstrate that RL-AR not only ensures safety during training but also achieves a return competitive with standard model-free RL.
arXiv Detail & Related papers (2024-04-23T16:35:14Z) - Sampling-based Safe Reinforcement Learning for Nonlinear Dynamical
Systems [15.863561935347692]
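The RL-AR abstract does not spell out the focus module's internals; the following is a minimal sketch of one plausible reading, a state-dependent convex combination of a conservative controller and the learned policy. The function names and the confidence heuristic are assumptions, not the RL-AR reference implementation.

```python
# Sketch of state-dependent policy combination, as a "focus module" might
# perform it (assumed structure, not the RL-AR implementation).
import numpy as np

def combined_action(state, safe_policy, rl_policy, confidence_fn):
    """Blend a conservative controller with a learned RL policy.

    confidence_fn maps a state to [0, 1]; values near 1 mean the RL
    policy is trusted in this region of the state space.
    """
    beta = float(np.clip(confidence_fn(state), 0.0, 1.0))
    # Convex combination: falls back to the safe controller wherever
    # confidence in the learned policy is low.
    return (1.0 - beta) * safe_policy(state) + beta * rl_policy(state)
```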
- Sampling-based Safe Reinforcement Learning for Nonlinear Dynamical Systems [15.863561935347692]
We develop provably safe and convergent reinforcement learning algorithms for control of nonlinear dynamical systems.
Recent advances at the intersection of control and RL follow a two-stage safety-filter approach to enforcing hard safety constraints.
We develop a single-stage, sampling-based approach to hard constraint satisfaction that learns RL controllers enjoying classical convergence guarantees.
arXiv Detail & Related papers (2024-03-06T19:39:20Z) - A Multiplicative Value Function for Safe and Efficient Reinforcement
Learning [131.96501469927733]
We propose a safe model-free RL algorithm with a novel multiplicative value function consisting of a safety critic and a reward critic.
The safety critic predicts the probability of constraint violation and discounts the reward critic, which only estimates constraint-free returns (see the sketch after this entry).
We evaluate our method in four safety-focused environments, including classical RL benchmarks augmented with safety constraints and robot navigation tasks with images and raw Lidar scans as observations.
arXiv Detail & Related papers (2023-03-07T18:29:15Z) - Safety Correction from Baseline: Towards the Risk-aware Policy in
Robotics via Dual-agent Reinforcement Learning [64.11013095004786]
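As an illustration of the multiplicative structure described in the entry above, the sketch below pairs a sigmoid-headed safety critic (probability of remaining constraint-free) with a reward critic that estimates constraint-free returns; the architecture and layer sizes are assumptions, not the authors' code.

```python
# Sketch of a multiplicative value function: the safety critic's output
# discounts the reward critic's constraint-free return estimate
# (assumed architecture, not the paper's implementation).
import torch
import torch.nn as nn

class MultiplicativeCritic(nn.Module):
    def __init__(self, obs_dim, hidden=64):
        super().__init__()
        self.reward_critic = nn.Sequential(
            nn.Linear(obs_dim, hidden), nn.ReLU(), nn.Linear(hidden, 1))
        self.safety_critic = nn.Sequential(
            nn.Linear(obs_dim, hidden), nn.ReLU(), nn.Linear(hidden, 1),
            nn.Sigmoid())  # estimated probability of staying constraint-free

    def forward(self, obs):
        # A high expected return is worth little where violation is likely.
        return self.safety_critic(obs) * self.reward_critic(obs)
```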
- Safety Correction from Baseline: Towards the Risk-aware Policy in Robotics via Dual-agent Reinforcement Learning [64.11013095004786]
We propose a dual-agent safe reinforcement learning strategy consisting of a baseline agent and a safe agent.
Such a decoupled framework enables high flexibility, data efficiency and risk-awareness for RL-based control.
The proposed method outperforms the state-of-the-art safe RL algorithms on difficult robot locomotion and manipulation tasks.
arXiv Detail & Related papers (2022-12-14T03:11:25Z) - Evaluating Model-free Reinforcement Learning toward Safety-critical
Tasks [70.76757529955577]
This paper revisits prior work in this scope from the perspective of state-wise safe RL.
We propose Unrolling Safety Layer (USL), a joint method that combines safety optimization and safety projection.
To facilitate further research in this area, we reproduce related algorithms in a unified pipeline and incorporate them into SafeRL-Kit.
arXiv Detail & Related papers (2022-12-12T06:30:17Z) - Safe and Efficient Reinforcement Learning Using
Disturbance-Observer-Based Control Barrier Functions [5.571154223075409]
This paper presents a method for safe and efficient reinforcement learning (RL) using disturbance observers (DOBs) and control barrier functions (CBFs).
Our method does not involve model learning; instead, it leverages DOBs to accurately estimate the pointwise value of the uncertainty, which is then incorporated into a robust CBF condition to generate safe actions (a simplified filter sketch follows this entry).
Simulation results on a unicycle and a 2D quadrotor demonstrate that the proposed method outperforms a state-of-the-art safe RL algorithm using CBFs and Gaussian processes-based model learning.
arXiv Detail & Related papers (2022-11-30T18:49:53Z) - Model-Based Safe Reinforcement Learning with Time-Varying State and
Control Constraints: An Application to Intelligent Vehicles [13.40143623056186]
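The entry above combines a disturbance observer's estimate with a robust CBF condition. The sketch below shows only the CBF filtering step for a single control-affine constraint, with a placeholder `d_hat` standing in for the observer output; the closed-form projection and all names are assumptions rather than the paper's method.

```python
# Simplified CBF safety filter for one control-affine constraint
# (illustrative sketch; the paper's robust condition is more involved).
import numpy as np

def cbf_filter(u_nom, h, Lf_h, Lg_h, d_hat=0.0, alpha=1.0):
    """Minimally modify u_nom so that
    Lf_h + Lg_h @ u + d_hat + alpha * h >= 0 holds."""
    u_nom = np.atleast_1d(u_nom).astype(float)
    Lg_h = np.atleast_1d(Lg_h).astype(float)
    residual = Lf_h + Lg_h @ u_nom + d_hat + alpha * h
    if residual >= 0.0 or not np.any(Lg_h):
        # Nominal action is already safe, or the input cannot influence
        # the constraint (higher relative degree is not handled here).
        return u_nom
    # Closed-form solution of min ||u - u_nom||^2 subject to the affine
    # constraint: project the nominal action onto the constraint boundary.
    return u_nom - residual * Lg_h / (Lg_h @ Lg_h)
```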
- Model-Based Safe Reinforcement Learning with Time-Varying State and Control Constraints: An Application to Intelligent Vehicles [13.40143623056186]
This paper proposes a safe RL algorithm for optimal control of nonlinear systems with time-varying state and control constraints.
A multi-step policy evaluation mechanism is proposed to predict the policy's safety risk under time-varying safety constraints and guide the policy to update safely.
The proposed algorithm outperforms several state-of-the-art RL algorithms in the simulated Safety Gym environment.
arXiv Detail & Related papers (2021-12-18T10:45:31Z) - Cautious Adaptation For Reinforcement Learning in Safety-Critical
Settings [129.80279257258098]
Reinforcement learning (RL) in real-world safety-critical target settings like urban driving is hazardous.
We propose a "safety-critical adaptation" task setting: an agent first trains in non-safety-critical "source" environments.
We propose a solution approach, CARL, that builds on the intuition that prior experience in diverse environments equips an agent to estimate risk.
arXiv Detail & Related papers (2020-08-15T01:40:59Z)