Complementary Meta-Reinforcement Learning for Fault-Adaptive Control
- URL: http://arxiv.org/abs/2009.12634v1
- Date: Sat, 26 Sep 2020 16:30:53 GMT
- Title: Complementary Meta-Reinforcement Learning for Fault-Adaptive Control
- Authors: Ibrahim Ahmed, Marcos Quinones-Grueiro, Gautam Biswas
- Abstract summary: Adaptive fault-tolerant control maintains degraded performance when faults occur, as opposed to allowing unsafe conditions or catastrophic events.
We present a meta-reinforcement learning approach that quickly adapts its control policy to changing conditions.
We evaluate our approach on an aircraft fuel transfer system under abrupt faults.
- Score: 1.8799681615947088
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Faults are endemic to all systems. Adaptive fault-tolerant control maintains
degraded performance when faults occur as opposed to unsafe conditions or
catastrophic events. In systems with abrupt faults and strict time constraints,
it is imperative for control to adapt quickly to system changes to maintain
system operations. We present a meta-reinforcement learning approach that
quickly adapts its control policy to changing conditions. The approach builds
upon model-agnostic meta learning (MAML). The controller maintains a complement
of prior policies learned under system faults. This "library" is evaluated on a
system after a new fault to initialize the new policy. This contrasts with
MAML, where the controller derives intermediate policies anew, sampled from a
distribution of similar systems, to initialize a new policy. Our approach
improves sample efficiency of the reinforcement learning process. We evaluate
our approach on an aircraft fuel transfer system under abrupt faults.
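The library-evaluation step described in the abstract can be sketched concretely: after a fault, each stored policy is rolled out briefly on the changed system, and the best performer seeds further learning (where MAML would instead derive intermediate policies anew). This is a minimal sketch, assuming a scalar plant whose gain changes at fault time and a library of linear feedback gains learned under earlier faults; the environment, policy class, and reward below are illustrative stand-ins, not the paper's implementation.

```python
def rollout_return(policy_gain, fault_gain, steps=50):
    """Average reward of a linear policy u = -policy_gain * x on a toy
    scalar plant x' = fault_gain * x + 0.1 * u, rewarding small |x|."""
    x, total = 1.0, 0.0
    for _ in range(steps):
        u = -policy_gain * x
        x = fault_gain * x + 0.1 * u
        total += -abs(x)
    return total / steps

def select_initial_policy(library, fault_gain):
    """The 'library' step: evaluate every stored policy on the faulted
    system and return the one with the highest short-rollout return,
    to be used as the initialization for continued RL training."""
    return max(library, key=lambda k: rollout_return(k, fault_gain))

# Gains (hypothetical) learned under prior faults; an abrupt fault
# changes the plant gain to 1.3, and the library picks the initializer.
library = [0.5, 2.0, 5.0, 8.0]
best = select_initial_policy(library, fault_gain=1.3)
```

Replacing MAML's freshly derived intermediate policies with this short evaluation over already-learned policies is what yields the sample-efficiency improvement claimed in the abstract.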
Related papers
- Improving the Performance of Robust Control through Event-Triggered Learning [74.57758188038375]
We propose an event-triggered learning algorithm that decides when to learn in the face of uncertainty in the LQR problem.
We demonstrate improved performance over a robust controller baseline in a numerical example.
arXiv Detail & Related papers (2022-07-28T17:36:37Z)
- Actor-Critic based Improper Reinforcement Learning [61.430513757337486]
We consider an improper reinforcement learning setting where a learner is given $M$ base controllers for an unknown Markov decision process.
We propose two algorithms: (1) a Policy Gradient-based approach; and (2) an algorithm that can switch between a simple Actor-Critic scheme and a Natural Actor-Critic scheme.
arXiv Detail & Related papers (2022-07-19T05:55:02Z)
- Finite-time System Identification and Adaptive Control in Autoregressive Exogenous Systems [79.67879934935661]
We study the problem of system identification and adaptive control of unknown ARX systems.
We provide finite-time learning guarantees for the ARX systems under both open-loop and closed-loop data collection.
arXiv Detail & Related papers (2021-08-26T18:00:00Z)
- Improper Learning with Gradient-based Policy Optimization [62.50997487685586]
We consider an improper reinforcement learning setting where the learner is given M base controllers for an unknown Markov Decision Process.
We propose a gradient-based approach that operates over a class of improper mixtures of the controllers.
arXiv Detail & Related papers (2021-02-16T14:53:55Z)
- Learning-based vs Model-free Adaptive Control of a MAV under Wind Gust [0.2770822269241973]
Navigation under unknown, varying conditions is among the most important and well-studied problems in the control field.
Recent model-free adaptive control methods aim to remove the dependency on an accurate plant model by learning the physical characteristics of the plant directly from sensor feedback.
We propose a conceptually simple learning-based approach composed of a full-state feedback controller, tuned robustly by a deep reinforcement learning framework.
arXiv Detail & Related papers (2021-01-29T10:13:56Z)
- Performance-Weighed Policy Sampling for Meta-Reinforcement Learning [1.77898701462905]
Enhanced Model-Agnostic Meta-Learning (E-MAML) generates fast convergence of the policy function from a small number of training examples.
E-MAML maintains a set of policy parameters learned in the environment for previous tasks.
We apply E-MAML to developing reinforcement learning (RL)-based online fault tolerant control schemes.
arXiv Detail & Related papers (2020-12-10T23:08:38Z)
- Runtime-Safety-Guided Policy Repair [13.038017178545728]
We study the problem of policy repair for learning-based control policies in safety-critical settings.
We propose to reduce or even eliminate control switching by "repairing" the trained policy based on runtime data produced by the safety controller.
arXiv Detail & Related papers (2020-08-17T23:31:48Z)
- Fault-Tolerant Control of Degrading Systems with On-Policy Reinforcement Learning [1.8799681615947088]
We propose a novel adaptive reinforcement learning control approach for fault tolerant systems.
Online and offline learning are combined to improve exploration and sample efficiency.
We conduct experiments on an aircraft fuel transfer system to demonstrate the effectiveness of our approach.
arXiv Detail & Related papers (2020-08-10T20:42:59Z)
- Logarithmic Regret Bound in Partially Observable Linear Dynamical Systems [91.43582419264763]
We study the problem of system identification and adaptive control in partially observable linear dynamical systems.
We present the first model estimation method with finite-time guarantees in both open and closed-loop system identification.
We show that AdaptOn is the first algorithm that achieves $\text{polylog}\left(T\right)$ regret in adaptive control of unknown partially observable linear dynamical systems.
arXiv Detail & Related papers (2020-03-25T06:00:33Z)
- Adaptive Control and Regret Minimization in Linear Quadratic Gaussian (LQG) Setting [91.43582419264763]
We propose LqgOpt, a novel reinforcement learning algorithm based on the principle of optimism in the face of uncertainty.
LqgOpt efficiently explores the system dynamics, estimates the model parameters up to their confidence interval, and deploys the controller of the most optimistic model.
arXiv Detail & Related papers (2020-03-12T19:56:38Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this list (including all information) and is not responsible for any consequences.