Fault-Tolerant Control of Degrading Systems with On-Policy Reinforcement
Learning
- URL: http://arxiv.org/abs/2008.04407v1
- Date: Mon, 10 Aug 2020 20:42:59 GMT
- Title: Fault-Tolerant Control of Degrading Systems with On-Policy Reinforcement
Learning
- Authors: Ibrahim Ahmed, Marcos Quiñones-Grueiro, Gautam Biswas
- Abstract summary: We propose a novel adaptive reinforcement learning control approach for fault tolerant systems.
Online and offline learning are combined to improve exploration and sample efficiency.
We conduct experiments on an aircraft fuel transfer system to demonstrate the effectiveness of our approach.
- Score: 1.8799681615947088
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We propose a novel adaptive reinforcement learning control approach for fault
tolerant control of degrading systems that is not preceded by a fault detection
and diagnosis step. Therefore, a priori knowledge of faults that may
occur in the system is not required. The adaptive scheme combines online and
offline learning of the on-policy control method to improve exploration and
sample efficiency, while guaranteeing stable learning. The offline learning
phase is performed using a data-driven model of the system, which is frequently
updated to track the system's operating conditions. We conduct experiments on
an aircraft fuel transfer system to demonstrate the effectiveness of our
approach.
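To make the adaptive scheme concrete, the following toy sketch alternates online on-policy updates on a simulated "real" plant with offline updates against a least-squares surrogate model refit from logged transitions. It illustrates the general idea only, not the authors' algorithm: the 1-D plant, the linear-Gaussian policy, and the REINFORCE update are all illustrative assumptions.

```python
# Minimal sketch (not the paper's algorithm): alternate online on-policy updates
# on the "real" plant with offline on-policy updates on a data-driven surrogate
# model refit from logged transitions. Toy 1-D plant and REINFORCE update are
# illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)
A_TRUE, B_TRUE, SIGMA = 0.9, 0.5, 0.1    # unknown "real" plant and policy noise

def real_step(x, u):
    return A_TRUE * x + B_TRUE * u + 0.01 * rng.standard_normal()

def rollout(step_fn, theta, horizon=20):
    """One on-policy episode; reward penalizes distance of the state from the origin."""
    x, traj = 1.0, []
    for _ in range(horizon):
        u = theta * x + SIGMA * rng.standard_normal()    # linear-Gaussian policy
        x_next = step_fn(x, u)
        traj.append((x, u, -(x_next ** 2), x_next))
        x = x_next
    return traj

def reinforce_update(theta, traj, lr=1e-2):
    """Vanilla REINFORCE step for the policy u ~ N(theta * x, SIGMA^2)."""
    ret = sum(r for _, _, r, _ in traj)                  # total episode return
    score = sum((u - theta * x) * x / SIGMA ** 2 for x, u, _, _ in traj)
    return theta + lr * np.clip(score * ret / len(traj), -5.0, 5.0)

theta, replay = 0.0, []
for episode in range(50):
    # Online phase: interact with the real (possibly degrading) plant.
    traj = rollout(real_step, theta)
    replay.extend((x, u, xn) for x, u, _, xn in traj)    # log transitions
    theta = reinforce_update(theta, traj)

    # Refit the data-driven surrogate model on recent logged transitions.
    X = np.array([[x, u] for x, u, _ in replay[-200:]])
    y = np.array([xn for _, _, xn in replay[-200:]])
    a_hat, b_hat = np.linalg.lstsq(X, y, rcond=None)[0]

    # Offline phase: extra, cheap on-policy updates against the surrogate.
    for _ in range(5):
        theta = reinforce_update(theta, rollout(lambda x, u: a_hat * x + b_hat * u, theta))

print("learned feedback gain:", round(theta, 3))
```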
Related papers
- Online Control-Informed Learning [4.907545537403502]
This paper proposes an Online Control-Informed Learning framework to solve a broad class of learning and control tasks in real time.
By considering any robot as a tunable optimal control system, we propose an online parameter estimator based on the extended Kalman filter (EKF).
The proposed method also improves robustness in learning by effectively managing noise in the data.
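A minimal sketch of EKF-based online parameter estimation (not the paper's formulation): the unknown parameter is treated as a slowly varying state with random-walk dynamics, and the measurement model is linearized around the current estimate. The scalar model y_k = theta * u_k^2 + noise is an illustrative assumption.

```python
# Minimal sketch (an assumption, not the paper's formulation): estimate an
# unknown model parameter online with an extended Kalman filter by treating the
# parameter as a random-walk state.
import numpy as np

rng = np.random.default_rng(1)
theta_true = 0.7                  # unknown gain in y_k = theta * u_k^2 + noise
theta_hat, P = 0.0, 1.0           # parameter estimate and its variance
Q, R = 1e-6, 0.05 ** 2            # random-walk and measurement noise variances

for k in range(200):
    u = np.sin(0.1 * k)                         # known excitation input
    y = theta_true * u ** 2 + 0.05 * rng.standard_normal()

    # Predict: theta_k = theta_{k-1} + process noise (random walk).
    P = P + Q

    # Update: linearize h(theta) = theta * u^2 around the current estimate
    # (here h happens to be linear in theta, so the Jacobian is exact).
    H = u ** 2                                  # Jacobian dh/dtheta
    S = H * P * H + R                           # innovation variance
    K = P * H / S                               # Kalman gain
    theta_hat = theta_hat + K * (y - theta_hat * u ** 2)
    P = (1.0 - K * H) * P

print("estimated parameter:", round(theta_hat, 3), "true:", theta_true)
```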
arXiv Detail & Related papers (2024-10-04T21:03:16Z)
- Data-Driven Adversarial Online Control for Unknown Linear Systems [17.595231077524467]
We present a novel data-driven online adaptive control algorithm to address this online control problem.
Our algorithm guarantees an $\tilde{\mathcal{O}}(T^{2/3})$ regret bound with high probability, which matches the best-known regret bound for this problem.
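For reference, a standard definition of regret in online control; the paper's exact comparator class $\Pi$ and cost assumptions may differ, so this is only the generic form. $\tilde{\mathcal{O}}$ hides polylogarithmic factors in $T$.

```latex
\[
  \mathrm{Regret}(T)
    = \sum_{t=1}^{T} c_t(x_t, u_t)
      - \min_{\pi \in \Pi} \sum_{t=1}^{T} c_t\!\left(x_t^{\pi}, u_t^{\pi}\right),
  \qquad
  \mathrm{Regret}(T) = \tilde{\mathcal{O}}\!\left(T^{2/3}\right)
  \text{ with high probability.}
\]
```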
arXiv Detail & Related papers (2023-08-16T04:05:22Z)
- In-Distribution Barrier Functions: Self-Supervised Policy Filters that Avoid Out-of-Distribution States [84.24300005271185]
We propose a control filter that wraps any reference policy and effectively encourages the system to stay in-distribution with respect to offline-collected safe demonstrations.
Our method is effective for two different visuomotor control tasks in simulation environments, including both top-down and egocentric view settings.
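A minimal sketch of the wrapping idea: the paper learns an in-distribution barrier function, whereas this toy uses a nearest-neighbor distance to the offline safe demonstrations as a crude stand-in for the in-distribution test. The demonstrations, reference policy, and threshold are illustrative assumptions.

```python
# Minimal sketch (a stand-in, not the paper's learned barrier): wrap a reference
# policy and veto actions whose (state, action) pair lies far from the
# offline-collected safe demonstrations, measured by nearest-neighbor distance.
import numpy as np

rng = np.random.default_rng(2)
# Hypothetical offline demonstrations of safe (state, action) pairs.
demos = np.column_stack([rng.uniform(-1, 1, 500), rng.uniform(-0.5, 0.5, 500)])

def reference_policy(x):
    return -0.8 * x                      # any reference controller

def in_distribution(x, u, threshold=0.15):
    d = np.min(np.linalg.norm(demos - np.array([x, u]), axis=1))
    return d <= threshold

def filtered_policy(x):
    u = reference_policy(x)
    if in_distribution(x, u):
        return u
    # Fall back to the closest demonstrated action for this state region.
    nearest = demos[np.argmin(np.abs(demos[:, 0] - x))]
    return nearest[1]

for x in (0.2, 0.9, 3.0):                # 3.0 is far outside the demo support
    print(x, "->", round(filtered_policy(x), 3))
```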
arXiv Detail & Related papers (2023-01-27T22:28:19Z)
- A stabilizing reinforcement learning approach for sampled systems with partially unknown models [0.0]
We suggest a method to guarantee practical stability of the system-controller closed loop in a purely online learning setting.
To achieve the claimed results, we employ techniques of classical adaptive control.
The method is tested in adaptive traction control and cruise control where it proved to significantly reduce the cost.
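A minimal sketch of one way such a guarantee can be enforced online (not the paper's construction): a supervisor accepts the exploring RL action only when a Lyapunov decrease predicted by a nominal model is certified, and otherwise applies a known stabilizing controller. The scalar plant, quadratic Lyapunov candidate, and random "RL" policy are illustrative assumptions.

```python
# Minimal sketch: certify a predicted Lyapunov decrease before accepting the RL
# action; fall back to a known stabilizing controller otherwise.
import numpy as np

rng = np.random.default_rng(3)
A_NOM, B_NOM = 0.95, 0.4                 # nominal (partially known) model

def V(x):
    return x ** 2                        # Lyapunov candidate

def rl_action(x):
    return 0.5 * rng.standard_normal()   # placeholder for an exploring RL policy

def fallback(x):
    return -0.9 * (A_NOM / B_NOM) * x    # known stabilizing controller

x = 2.0
for t in range(30):
    u = rl_action(x)
    if V(A_NOM * x + B_NOM * u) > 0.95 * V(x):   # decrease not certified
        u = fallback(x)                          # -> override with safe action
    x = A_NOM * x + B_NOM * u + 0.01 * rng.standard_normal()

print("final state magnitude:", round(abs(x), 4))
```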
arXiv Detail & Related papers (2022-08-31T09:20:14Z)
- Joint Differentiable Optimization and Verification for Certified Reinforcement Learning [91.93635157885055]
In model-based reinforcement learning for safety-critical control systems, it is important to formally certify system properties.
We propose a framework that jointly conducts reinforcement learning and formal verification.
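A heavily simplified sketch of coupling learning with certification (not the paper's method): candidate policies are only deployed if they pass a verification step. Here a crude feedback-gain search stands in for reinforcement learning, and a dense grid check of a Lyapunov decrease condition stands in for formal verification.

```python
# Heavily simplified sketch: only keep policies that pass a certification step.
import numpy as np

A, B = 0.9, 0.5                                   # model used for certification

def closed_loop(x, k):
    return (A + B * k) * x

def verify(k, grid=np.linspace(-5.0, 5.0, 2001)):
    """Check V(x') < V(x) with V(x) = x^2 on all sampled nonzero states."""
    x = grid[grid != 0.0]
    return bool(np.all(closed_loop(x, k) ** 2 < x ** 2))

def rollout_cost(k, x0=1.0, horizon=20):
    x, cost = x0, 0.0
    for _ in range(horizon):
        x = closed_loop(x, k)
        cost += x ** 2
    return cost

best_k, best_cost = None, float("inf")
for k in np.linspace(-3.0, 1.0, 81):              # candidate policies
    if verify(k) and rollout_cost(k) < best_cost: # certify, then compare cost
        best_k, best_cost = k, rollout_cost(k)
print("certified gain:", round(best_k, 3), "cost:", round(best_cost, 4))
```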
arXiv Detail & Related papers (2022-01-28T16:53:56Z)
- Imitation Learning of Stabilizing Policies for Nonlinear Systems [1.52292571922932]
It is shown that the methods developed for linear systems and controllers can be readily extended to controllers using sum of squares.
A projected gradient descent algorithm and an alternating direction method of multipliers (ADMM) algorithm are proposed for the stabilizing imitation learning problem.
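A minimal scalar sketch of projected gradient descent for stabilizing imitation learning (illustrative assumptions throughout): each gradient step on the imitation loss is followed by a projection onto the set of feedback gains that keep the known plant contractive.

```python
# Minimal sketch (scalar case, illustrative only): fit a feedback gain to expert
# data by gradient descent, projecting after each step onto the gains that keep
# |A + B*k| <= rho < 1 for a known scalar plant.
import numpy as np

rng = np.random.default_rng(4)
A, B, RHO = 0.9, 0.5, 0.95
K_MIN, K_MAX = (-RHO - A) / B, (RHO - A) / B      # interval of admissible gains

# Hypothetical expert demonstrations; the expert itself is nearly unstable.
xs = rng.uniform(-2, 2, 200)
us = 0.15 * xs + 0.02 * rng.standard_normal(200)  # expert gain 0.15 -> |A+Bk| = 0.975

k, lr = 0.0, 0.05
for _ in range(200):
    grad = np.mean(2 * (k * xs - us) * xs)        # d/dk of mean squared imitation loss
    k = np.clip(k - lr * grad, K_MIN, K_MAX)      # projection onto stabilizing gains
print("imitated gain:", round(float(k), 3),
      "constraint interval:", (round(K_MIN, 3), round(K_MAX, 3)))
```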
arXiv Detail & Related papers (2021-09-22T17:27:19Z)
- The Impact of Data on the Stability of Learning-Based Control - Extended Version [63.97366815968177]
We propose a Lyapunov-based measure for quantifying the impact of data on the certifiable control performance.
By modeling unknown system dynamics through Gaussian processes, we can determine the interrelation between model uncertainty and satisfaction of stability conditions.
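A minimal sketch of the interrelation between model uncertainty and a stability condition (not the paper's certificate): a Gaussian process models the unknown closed-loop map, and a decrease condition must hold for the GP mean inflated by a confidence-scaled standard deviation, so more informative data makes certification easier. The dynamics, kernel choice, and confidence scaling are assumptions; scikit-learn's GaussianProcessRegressor is used for the GP.

```python
# Minimal sketch: Gaussian-process model of unknown closed-loop dynamics plus a
# robust Lyapunov decrease test that accounts for the model uncertainty.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

rng = np.random.default_rng(5)
f = lambda x: 0.8 * np.sin(x)                     # unknown closed-loop map x -> x'

# Training data: more data shrinks the GP uncertainty and eases certification.
X = rng.uniform(-2, 2, 30).reshape(-1, 1)
y = f(X).ravel() + 0.02 * rng.standard_normal(30)

gp = GaussianProcessRegressor(RBF(length_scale=0.5) + WhiteKernel(1e-4),
                              normalize_y=True).fit(X, y)

def decrease_certified(x, beta=2.0):
    """Check |mu(x)| + beta * sigma(x) < |x|, i.e. V(x') < V(x) for V(x) = x^2."""
    mu, std = gp.predict(np.array([[x]]), return_std=True)
    return abs(mu[0]) + beta * std[0] < abs(x)

for x in (0.5, 1.5, 3.0):                         # 3.0 lies outside the training data
    print(f"x = {x}: decrease certified -> {decrease_certified(x)}")
```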
arXiv Detail & Related papers (2020-11-20T19:10:01Z)
- Learning Hybrid Control Barrier Functions from Data [66.37785052099423]
Motivated by the lack of systematic tools to obtain safe control laws for hybrid systems, we propose an optimization-based framework for learning certifiably safe control laws from data.
In particular, we assume a setting in which the system dynamics are known and in which data exhibiting safe system behavior is available.
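A minimal sketch of how a control barrier function is used once obtained: the paper's contribution is learning certifiably safe barrier functions from data for hybrid systems, whereas here the barrier h and the scalar dynamics are hand-specified assumptions, and the filter simply picks the admissible action closest to the nominal one.

```python
# Minimal sketch: filter a nominal controller with the discrete-time CBF
# condition h(x_next) >= (1 - gamma) * h(x), for a hand-specified h.
import numpy as np

A, B, GAMMA = 1.0, 0.5, 0.2
h = lambda x: 1.0 - x ** 2               # safe set: |x| <= 1

def nominal(x, x_ref=1.4):
    return 2.0 * (x_ref - x)             # tracks a reference outside the safe set

def safe_filter(x, u_nom):
    """Pick the u closest to u_nom that satisfies the CBF condition."""
    candidates = u_nom + np.linspace(-3, 3, 601)          # minimal-deviation search
    ok = h(A * x + B * candidates) >= (1 - GAMMA) * h(x)
    if not ok.any():
        return 0.0                                        # no certified action found
    return candidates[ok][np.argmin(np.abs(candidates[ok] - u_nom))]

x = 0.0
for _ in range(15):
    x = A * x + B * safe_filter(x, nominal(x))
    print(round(x, 3), end=" ")
print("\n(state is held inside the safe set |x| <= 1)")
```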
arXiv Detail & Related papers (2020-11-08T23:55:02Z)
- Complementary Meta-Reinforcement Learning for Fault-Adaptive Control [1.8799681615947088]
Adaptive fault-tolerant control maintains degraded performance when faults occur, rather than allowing unsafe conditions or catastrophic events.
We present a meta-reinforcement learning approach that quickly adapts its control policy to changing conditions.
We evaluate our approach on an aircraft fuel transfer system under abrupt faults.
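A minimal sketch of the fast-adaptation idea (a generic Reptile-style meta-update on a toy linear plant, not the paper's complementary meta-RL method): a feedback-gain initialization is meta-trained over a distribution of fault modes so that a few gradient steps adapt it to an unseen fault.

```python
# Minimal sketch (generic Reptile-style meta-learning, not the paper's method):
# meta-train an initial feedback gain over sampled fault modes so that a few
# gradient steps adapt it to a new fault. Plant: x_{t+1} = (a + b*k) * x_t, x_0 = 1.
import numpy as np

rng = np.random.default_rng(6)
H = 10                                            # rollout horizon

def cost_and_grad(k, a, b):
    """Accumulated state cost J(k) = sum_t ((a + b*k)^t)^2 and its derivative dJ/dk."""
    m = a + b * k
    J = sum(m ** (2 * t) for t in range(1, H + 1))
    dJ = sum(2 * t * m ** (2 * t - 1) * b for t in range(1, H + 1))
    return J, dJ

def adapt(k, a, b, steps=3, lr=0.1):
    """A few clipped gradient steps: the fault-specific fine-tuning phase."""
    for _ in range(steps):
        _, g = cost_and_grad(k, a, b)
        k -= lr * float(np.clip(g, -10.0, 10.0))
    return k

k_meta = 0.0
for _ in range(300):                              # meta-training over fault modes
    a, b = rng.uniform(0.8, 1.1), rng.uniform(0.3, 0.7)
    k_meta += 0.1 * (adapt(k_meta, a, b) - k_meta)   # Reptile-style meta-update

a_new, b_new = 1.05, 0.15                         # unseen fault: actuator gain drops
print("meta-learned initial gain:", round(k_meta, 3))
print("cost before adaptation:", round(cost_and_grad(k_meta, a_new, b_new)[0], 3))
k_fast = adapt(k_meta, a_new, b_new, steps=5)
print("cost after 5 adaptation steps:", round(cost_and_grad(k_fast, a_new, b_new)[0], 3))
```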
arXiv Detail & Related papers (2020-09-26T16:30:53Z)
- Anticipating the Long-Term Effect of Online Learning in Control [75.6527644813815]
AntLer is a design algorithm for learning-based control laws that anticipates learning.
We show that AntLer approximates an optimal solution arbitrarily accurately with probability one.
arXiv Detail & Related papers (2020-07-24T07:00:14Z)
- Logarithmic Regret Bound in Partially Observable Linear Dynamical Systems [91.43582419264763]
We study the problem of system identification and adaptive control in partially observable linear dynamical systems.
We present the first model estimation method with finite-time guarantees in both open and closed-loop system identification.
We show that AdaptOn is the first algorithm that achieves $\text{polylog}\left(T\right)$ regret in adaptive control of unknown partially observable linear dynamical systems.
arXiv Detail & Related papers (2020-03-25T06:00:33Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences arising from its use.