Worst-Case Control and Learning Using Partial Observations Over an
Infinite Time-Horizon
- URL: http://arxiv.org/abs/2303.16321v2
- Date: Fri, 31 Mar 2023 21:51:33 GMT
- Title: Worst-Case Control and Learning Using Partial Observations Over an
Infinite Time-Horizon
- Authors: Aditya Dave, Ioannis Faros, Nishanth Venkatesh, and Andreas A.
Malikopoulos
- Abstract summary: Safety-critical cyber-physical systems require robust control strategies against adversarial disturbances and modeling uncertainties.
We present a framework for approximate control and learning in partially observed systems to minimize the worst-case discounted cost over an infinite time horizon.
- Score: 2.456909016197174
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Safety-critical cyber-physical systems require control strategies whose
worst-case performance is robust against adversarial disturbances and modeling
uncertainties. In this paper, we present a framework for approximate control
and learning in partially observed systems to minimize the worst-case
discounted cost over an infinite time horizon. We model disturbances to the
system as finite-valued uncertain variables with unknown probability
distributions. For problems with known system dynamics, we construct a dynamic
programming (DP) decomposition to compute the optimal control strategy. Our
first contribution is to define information states that improve the
computational tractability of this DP without loss of optimality. Then, we
describe a simplification for a class of problems where the incurred cost is
observable at each time instant. Our second contribution is to define an
approximate information state that can be constructed or learned directly from
observed data for problems with observable costs. We derive bounds on the
performance loss of the resulting approximate control strategy and illustrate
the effectiveness of our approach in partially observed decision-making
problems with a numerical example.
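To make the framework above concrete, here is a minimal sketch (not the paper's construction) of a worst-case discounted DP on a toy finite system: the information state is the set of hidden states consistent with past observations, the adversary picks the worst consistent state and disturbance, and the controller minimizes over controls. The dynamics f, observation map h, cost c, and all constants below are illustrative assumptions.

```python
# Illustrative sketch only; the toy system and all constants are assumptions.
from itertools import product

X = [0, 1, 2]             # hidden states
U = [0, 1]                # controls
W = [0, 1]                # finite-valued disturbances, unknown distribution
GAMMA = 0.9               # discount factor

f = lambda x, u, w: (x + u + w) % 3       # dynamics x' = f(x, u, w)
h = lambda x: x % 2                       # observation y = h(x)
c = lambda x, u: float(x == 2) + 0.1 * u  # per-step cost

def successors(S, u):
    """Group successor states by the observation they would generate."""
    by_obs = {}
    for x, w in product(S, W):
        by_obs.setdefault(h(f(x, u, w)), set()).add(f(x, u, w))
    return {y: frozenset(Sn) for y, Sn in by_obs.items()}

# Enumerate information states reachable from full initial uncertainty.
S0 = frozenset(X)
reachable, frontier = {S0}, [S0]
while frontier:
    S = frontier.pop()
    for u in U:
        for Sn in successors(S, u).values():
            if Sn not in reachable:
                reachable.add(Sn)
                frontier.append(Sn)

# Worst-case value iteration: the adversary picks the consistent state and
# disturbance maximizing cost-to-go; the controller minimizes over controls.
V = {S: 0.0 for S in reachable}
for _ in range(200):                      # gamma-contraction converges fast
    V = {S: min(max(c(x, u) + GAMMA * V[successors(S, u)[h(f(x, u, w))]]
                    for x, w in product(S, W))
                for u in U)
         for S in reachable}

print("worst-case discounted cost from full uncertainty:", round(V[S0], 4))
```

Because the disturbances are finite-valued with unknown distributions, only the set of consistent states matters here, which is why the DP runs over subsets rather than beliefs.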
Related papers
- OCMDP: Observation-Constrained Markov Decision Process [9.13947446878397]
We tackle the challenge of simultaneously learning observation and control strategies in cost-sensitive environments.
We develop an iterative, model-free deep reinforcement learning algorithm that separates the sensing and control components of the policy.
We validate our approach on a simulated diagnostic task and a realistic healthcare environment using HeartPole.
arXiv Detail & Related papers (2024-11-11T16:04:49Z)
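A minimal structural sketch of the separation described in the OCMDP entry above: one head decides whether to pay for a fresh observation, another acts on the most recent one. The tabular values, sensing cost, and toy loop are illustrative assumptions, and the paper's deep RL learning updates are omitted.

```python
# Illustrative sketch only; environment, costs, and tables are assumptions.
import numpy as np

rng = np.random.default_rng(0)
SENSE_COST = 0.05                             # explicit price of observing

class SeparatedPolicy:
    """Two heads: decide *whether to sense*, then *how to act*."""
    def __init__(self, n_obs, n_act):
        self.q_sense = np.zeros((n_obs, 2))   # value of {skip, sense}
        self.q_act = np.zeros((n_obs, n_act)) # value of each control
        self.last_obs = 0                     # most recent observation

    def step(self, env_obs, eps=0.1):
        # Sensing head: refresh the observation only if it seems worth it.
        sense = rng.random() < eps or bool(np.argmax(self.q_sense[self.last_obs]))
        if sense:
            self.last_obs = env_obs           # fresh data, charged below
        # Control head: act on the (possibly stale) observation.
        if rng.random() < eps:
            act = int(rng.integers(self.q_act.shape[1]))
        else:
            act = int(np.argmax(self.q_act[self.last_obs]))
        return sense, act

policy = SeparatedPolicy(n_obs=4, n_act=2)    # learning updates omitted
for t in range(5):
    obs = int(rng.integers(4))                # stand-in for an environment
    sense, act = policy.step(obs)
    cost = SENSE_COST * sense + 0.1 * act     # sensing charged explicitly
    print(t, sense, act, round(cost, 3))
```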
- Non-Gaussian Uncertainty Minimization Based Control of Stochastic Nonlinear Robotic Systems [9.088960941718]
We design a state feedback controller that minimizes deviations of the states of the system from the nominal state trajectories due to uncertainties and disturbances.
We use moments and characteristic functions to propagate uncertainties throughout the nonlinear motion model of robotic systems.
arXiv Detail & Related papers (2023-03-02T23:31:32Z)
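A small sketch of the characteristic-function idea in the entry above: for a unicycle-style update x' = x + v cos(theta), the first moment of cos(theta) follows in closed form from E[e^{i*theta}]. The Gaussian heading is an illustrative assumption used so the Monte Carlo check has a known answer; the same identity is what makes propagation tractable whenever the characteristic function is available, Gaussian or not.

```python
# Illustrative sketch only; the Gaussian heading is an assumption.
import numpy as np

mu, sigma = 0.3, 0.2              # heading mean and standard deviation
v, x0 = 1.0, 0.0                  # speed and initial x-position

# Characteristic function of N(mu, sigma^2) evaluated at 1: E[e^{i*theta}].
phi = np.exp(1j * mu - 0.5 * sigma**2)
x_mean_exact = x0 + v * phi.real  # E[x'] = x0 + v * E[cos(theta)]

# Monte Carlo check of the closed-form first moment.
rng = np.random.default_rng(0)
theta = rng.normal(mu, sigma, 1_000_000)
x_mean_mc = x0 + v * np.cos(theta).mean()

print(f"exact E[x']: {x_mean_exact:.5f}   Monte Carlo: {x_mean_mc:.5f}")
```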
- Approximate Information States for Worst-Case Control and Learning in Uncertain Systems [2.7282382992043885]
We consider a non-stochastic model, where disturbances acting on the system take values in bounded sets with unknown distributions.
We present a general framework for decision-making in such problems by using the notion of the information state and approximate information state.
We illustrate the application of our results in control and reinforcement learning using numerical examples.
arXiv Detail & Related papers (2023-01-12T15:36:36Z)
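A minimal sketch of constructing an approximate information state directly from data, as the entry above describes: here the AIS is just a fixed window of past observation-action pairs, which evolves recursively and is regressed against the observed per-step cost. The toy system, window length, and linear predictor are illustrative assumptions, far simpler than the paper's construction.

```python
# Illustrative sketch only; system, window length, and predictor are assumptions.
import numpy as np

rng = np.random.default_rng(1)
K = 3                                      # memory length of the AIS window

def rollout(T=2000):
    """Toy partially observed system with an observable per-step cost."""
    x, data = 0.0, []
    for _ in range(T):
        u = float(rng.choice([-1.0, 1.0]))
        x = 0.8 * x + u + rng.uniform(-0.1, 0.1)  # bounded disturbance
        y = float(np.sign(x))                     # coarse observation
        data.append((y, u, x**2))                 # cost x^2 is observed
    return data

data = rollout()
# AIS vector z_t = last K observation-action pairs; regress observed cost.
Z = np.array([[v for y, u, _ in data[t - K:t] for v in (y, u)]
              for t in range(K, len(data))])
cost = np.array([data[t][2] for t in range(K, len(data))])
A = np.c_[Z, np.ones(len(Z))]
theta, *_ = np.linalg.lstsq(A, cost, rcond=None)
rmse = float(np.sqrt(((A @ theta - cost) ** 2).mean()))
print(f"cost-prediction RMSE with a {K}-step AIS window: {rmse:.4f}")
```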
- Improving the Performance of Robust Control through Event-Triggered Learning [74.57758188038375]
We propose an event-triggered learning algorithm that decides when to learn in the face of uncertainty in the LQR problem.
We demonstrate improved performance over a robust controller baseline in a numerical example.
arXiv Detail & Related papers (2022-07-28T17:36:37Z)
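A minimal sketch of the event-triggered idea above on a scalar system: a certainty-equivalence controller runs until one-step prediction residuals exceed a bound that noise alone cannot explain, and only then is the model re-identified. The threshold, window, and dither are illustrative assumptions; the paper derives its trigger formally for the LQR problem.

```python
# Illustrative sketch only; threshold, window, and dither are assumptions.
import numpy as np

rng = np.random.default_rng(2)
a_true, b_true, noise = 1.2, 1.0, 0.05
a_hat, b_hat = 1.2, 1.0                   # model starts out correct
x, r = 0.0, 1.0                           # state and setpoint
X, U, Xn = [], [], []

for t in range(400):
    if t == 200:
        a_true = 1.5                      # plant drifts; model goes stale
    # Certainty-equivalence control toward r, plus small exploratory dither.
    u = (r - a_hat * x) / b_hat + 0.1 * rng.standard_normal()
    xn = a_true * x + b_true * u + noise * rng.standard_normal()
    X.append(x); U.append(u); Xn.append(xn)
    residual = abs(xn - (a_hat * x + b_hat * u))
    if residual > 5 * noise and len(X) >= 10:  # event: data falsifies model
        a_hat, b_hat = np.linalg.lstsq(
            np.c_[X[-10:], U[-10:]], Xn[-10:], rcond=None)[0]
        print(f"t={t}: trigger fired, re-identified a={a_hat:.2f}, b={b_hat:.2f}")
    x = xn
```

Before the drift the 5-sigma bound essentially never fires, so no learning effort is spent; after the drift the trigger fires until the sliding window contains only post-drift data.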
- Learning Robust Output Control Barrier Functions from Safe Expert Demonstrations [50.37808220291108]
This paper addresses learning safe output feedback control laws from partial observations of expert demonstrations.
We first propose robust output control barrier functions (ROCBFs) as a means to guarantee safety.
We then formulate an optimization problem to learn ROCBFs from expert demonstrations that exhibit safe system behavior.
arXiv Detail & Related papers (2021-11-18T23:21:00Z)
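A much-simplified sketch of the learning step above: fit a barrier-like function h(x) = theta . phi(x) that is positive on states visited in safe demonstrations and negative on unsafe samples, via hinge losses. The features, margins, and sampled data are illustrative assumptions; the paper's ROCBFs additionally handle output feedback, robustness margins, and decrease conditions.

```python
# Illustrative sketch only; features, margins, and data are assumptions.
import numpy as np

rng = np.random.default_rng(3)
phi = lambda x: np.array([1.0, x[0], x[1], x[0]**2, x[1]**2])  # features

safe = rng.uniform(-0.5, 0.5, size=(200, 2))         # expert stays in a box
unsafe = rng.uniform(-1.5, 1.5, size=(300, 2))
unsafe = unsafe[np.abs(unsafe).max(axis=1) > 0.8]    # outside the safe set

theta, gamma, lr = np.zeros(5), 0.1, 0.05
for _ in range(500):                                 # subgradient descent
    g = np.zeros(5)
    for x in safe:                                   # want h(x) >= gamma
        if theta @ phi(x) < gamma:
            g -= phi(x)
    for x in unsafe:                                 # want h(x) <= -gamma
        if theta @ phi(x) > -gamma:
            g += phi(x)
    theta -= lr * g / (len(safe) + len(unsafe))

inside = np.mean([theta @ phi(x) > 0 for x in safe])
outside = np.mean([theta @ phi(x) < 0 for x in unsafe])
print(f"h > 0 on safe states: {inside:.0%}, h < 0 on unsafe samples: {outside:.0%}")
```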
- Reinforcement Learning Policies in Continuous-Time Linear Systems [0.0]
We present online policies that learn optimal actions fast by carefully randomizing the parameter estimates.
We prove sharp stability results for inexact system dynamics and tightly specify the infinitesimal regret caused by sub-optimal actions.
Our analysis sheds light on fundamental challenges in continuous-time reinforcement learning and suggests a useful cornerstone for similar problems.
arXiv Detail & Related papers (2021-09-16T00:08:50Z)
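A minimal discrete-time, scalar sketch of the randomization idea above: perturb the least-squares parameter estimates by a shrinking amount before designing the gain, so the closed loop keeps exploring while the cost of sub-optimal actions decays. The perturbation scale and the deadbeat-style gain are illustrative assumptions.

```python
# Illustrative sketch only; perturbation scale and gain are assumptions.
import numpy as np

rng = np.random.default_rng(4)
a, b, noise = 0.9, 0.5, 0.1               # unknown true parameters
x, X, U, Xn = 1.0, [], [], []
a_hat, b_hat = 0.0, 1.0                   # crude initial estimate

for t in range(1, 300):
    # Randomize the estimate; the perturbation shrinks as data accumulates.
    scale = 1.0 / np.sqrt(t)
    a_t = a_hat + scale * rng.standard_normal()
    b_t = b_hat + scale * rng.standard_normal()
    u = -(a_t / b_t) * x if abs(b_t) > 0.1 else 0.0  # gain for sampled model
    xn = a * x + b * u + noise * rng.standard_normal()
    X.append(x); U.append(u); Xn.append(xn)
    a_hat, b_hat = np.linalg.lstsq(np.c_[X, U], Xn, rcond=None)[0]  # refit
    x = xn

print(f"estimates: a={a_hat:.3f} (true {a}), b={b_hat:.3f} (true {b})")
```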
- Probabilistic robust linear quadratic regulators with Gaussian processes [73.0364959221845]
Probabilistic models such as Gaussian processes (GPs) are powerful tools to learn unknown dynamical systems from data for subsequent use in control design.
We present a novel controller synthesis for linearized GP dynamics that yields robust controllers with respect to a probabilistic stability margin.
arXiv Detail & Related papers (2021-05-17T08:36:18Z)
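A minimal sketch of the pipeline above: fit a GP to one-step transitions, linearize its posterior mean by finite differences, and solve a scalar discrete-time Riccati equation for the resulting model. The kernel, data, and cost weights are illustrative assumptions, and the paper's probabilistic stability margin is omitted, so this shows only the nominal, non-robust part of the idea.

```python
# Illustrative sketch only; kernel, data, and weights are assumptions.
import numpy as np

rng = np.random.default_rng(5)
f = lambda x, u: 0.9 * np.sin(x) + 0.5 * u        # unknown true dynamics

Xtr = rng.uniform(-1, 1, size=(60, 2))            # training inputs (x, u)
ytr = f(Xtr[:, 0], Xtr[:, 1]) + 0.01 * rng.standard_normal(60)

def kern(A, B, ell=0.7):
    """Squared-exponential kernel between row sets A and B."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / ell**2)

Kinv = np.linalg.inv(kern(Xtr, Xtr) + 1e-4 * np.eye(60))
gp_mean = lambda z: kern(z[None, :], Xtr) @ Kinv @ ytr  # posterior mean

# Linearize the GP mean at the origin by central differences.
eps = 1e-4
A = ((gp_mean(np.array([eps, 0.0])) - gp_mean(np.array([-eps, 0.0]))) / (2 * eps)).item()
B = ((gp_mean(np.array([0.0, eps])) - gp_mean(np.array([0.0, -eps]))) / (2 * eps)).item()

# Scalar discrete-time LQR for the linearized model (Riccati fixed point).
Q, R, P = 1.0, 0.1, 1.0
for _ in range(300):
    P = Q + A * P * A - (A * P * B) ** 2 / (R + B * P * B)
K = (B * P * A) / (R + B * P * B)
print(f"linearized A={A:.3f}, B={B:.3f} (true 0.9, 0.5); LQR gain K={K:.3f}")
```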
- Stein Variational Model Predictive Control [130.60527864489168]
Decision making under uncertainty is critical to real-world, autonomous systems.
Model Predictive Control (MPC) methods have demonstrated favorable performance in practice, but remain limited when dealing with complex distributions.
We show that this framework leads to successful planning in challenging, non-convex optimal control problems.
arXiv Detail & Related papers (2020-11-15T22:36:59Z)
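A minimal sketch of Stein variational MPC's core step: treat candidate control sequences as particles and update them with Stein variational gradient descent toward a distribution proportional to exp(-beta * cost), so multimodal solution sets can be represented. The toy single-integrator model, temperature, and step size are illustrative assumptions.

```python
# Illustrative sketch only; model, temperature, and step size are assumptions.
import numpy as np

rng = np.random.default_rng(6)
H, N, goal, lam, x0 = 5, 30, 1.0, 0.1, 0.0

def cost_grad(U):
    """Cost and gradient for x_{t+1} = x_t + u_t tracking `goal`."""
    X = x0 + np.cumsum(U, axis=1)                 # states x_1..x_H
    err = X - goal
    c = (err**2).sum(axis=1) + lam * (U**2).sum(axis=1)
    # dc/du_s = 2 * sum_{t>=s} err_t + 2*lam*u_s (reverse cumulative sum).
    g = 2 * np.cumsum(err[:, ::-1], axis=1)[:, ::-1] + 2 * lam * U
    return c, g

U = rng.normal(0.0, 0.5, size=(N, H))             # particle control sequences
beta, step = 2.0, 0.02                            # inverse temperature, step
for _ in range(500):
    _, g = cost_grad(U)
    score = -beta * g                             # grad log p, p ~ exp(-beta*cost)
    d2 = ((U[:, None, :] - U[None, :, :]) ** 2).sum(-1)
    h = np.median(d2) / np.log(N) + 1e-8          # median bandwidth heuristic
    K = np.exp(-d2 / h)
    gradK = (-2.0 / h) * (U[:, None, :] - U[None, :, :]) * K[:, :, None]
    U = U + step * (K @ score + gradK.sum(axis=0)) / N   # SVGD update

best = U[np.argmin(cost_grad(U)[0])]
print("best particle's controls:", np.round(best, 3))
```

The kernel term pushes particles apart, which is what lets the particle set cover several modes of a complex cost landscape instead of collapsing to one local optimum.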
- Anticipating the Long-Term Effect of Online Learning in Control [75.6527644813815]
AntLer is a design algorithm for learning-based control laws that anticipates learning.
We show that AntLer approximates an optimal solution arbitrarily accurately with probability one.
arXiv Detail & Related papers (2020-07-24T07:00:14Z)
- Adaptive Control and Regret Minimization in Linear Quadratic Gaussian (LQG) Setting [91.43582419264763]
We propose LqgOpt, a novel reinforcement learning algorithm based on the principle of optimism in the face of uncertainty.
LqgOpt efficiently explores the system dynamics, estimates the model parameters up to their confidence interval, and deploys the controller of the most optimistic model.
arXiv Detail & Related papers (2020-03-12T19:56:38Z)
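A minimal scalar sketch of optimism in the face of uncertainty as described above: estimate the parameters by least squares, sample candidate models from a shrinking confidence region, and deploy the controller of the candidate with the lowest predicted cost. The confidence radius, candidate count, and warm-up are illustrative assumptions, not LqgOpt's actual confidence sets.

```python
# Illustrative sketch only; confidence radius and candidates are assumptions.
import numpy as np

rng = np.random.default_rng(7)
a, b, noise = 1.1, 0.6, 0.1                 # unknown true parameters
Q, R = 1.0, 0.5

def lqr(am, bm):
    """Scalar discrete-time LQR cost P and gain K for a candidate model."""
    P = 1.0
    for _ in range(100):
        P = Q + am * P * am - (am * P * bm) ** 2 / (R + bm * P * bm)
    return P, (bm * P * am) / (R + bm * P * bm)

x, X, U, Xn = 1.0, [], [], []
for t in range(1, 200):
    if len(X) < 5:
        u = rng.standard_normal()           # brief random warm-up
    else:
        ah, bh = np.linalg.lstsq(np.c_[X, U], Xn, rcond=None)[0]
        radius = 2.0 / np.sqrt(t)           # shrinking confidence region
        cands = [(ah + radius * rng.uniform(-1, 1),
                  bh + radius * rng.uniform(-1, 1)) for _ in range(50)]
        # Optimism: deploy the controller of the most favorable candidate.
        P, K = min((lqr(am, bm) for am, bm in cands if abs(bm) > 0.05),
                   key=lambda pk: pk[0])
        u = -K * x
    xn = a * x + b * u + noise * rng.standard_normal()
    X.append(x); U.append(u); Xn.append(xn)
    x = xn

print(f"final |x| = {abs(x):.3f} (regulated near the noise floor)")
```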
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the content above (including all information) and is not responsible for any consequences of its use.