Approximate Information States for Worst-Case Control and Learning in Uncertain Systems
- URL: http://arxiv.org/abs/2301.05089v2
- Date: Sat, 6 Apr 2024 00:50:16 GMT
- Title: Approximate Information States for Worst-Case Control and Learning in Uncertain Systems
- Authors: Aditya Dave, Nishanth Venkatesh, Andreas A. Malikopoulos
- Abstract summary: We consider a non-stochastic model, where disturbances acting on the system take values in bounded sets with unknown distributions.
We present a general framework for decision-making in such problems by using the notion of the information state and approximate information state.
We illustrate the application of our results in control and reinforcement learning using numerical examples.
- Score: 2.7282382992043885
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we investigate discrete-time decision-making problems in uncertain systems with partially observed states. We consider a non-stochastic model, where uncontrolled disturbances acting on the system take values in bounded sets with unknown distributions. We present a general framework for decision-making in such problems by using the notion of the information state and approximate information state, and introduce conditions to identify an uncertain variable that can be used to compute an optimal strategy through a dynamic program (DP). Next, we relax these conditions and define approximate information states that can be learned from output data without knowledge of system dynamics. We use approximate information states to formulate a DP that yields a strategy with a bounded performance loss. Finally, we illustrate the application of our results in control and reinforcement learning using numerical examples.
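To make the worst-case DP concrete, below is a minimal sketch of minimax value iteration over a non-stochastic information state, taking the set of states consistent with the observation history as the information state. The toy dynamics, observation map, costs, and horizon are illustrative assumptions, not the system from the paper.

```python
# Minimax dynamic program over a non-stochastic information state.
# Everything below (dynamics, observation map, cost, horizon) is an
# illustrative toy assumption, not the system studied in the paper.
from functools import lru_cache

X = range(4)           # state space
A = range(2)           # action space
W = (-1, 0, 1)         # bounded disturbance set (unknown distribution)
T = 3                  # horizon

def f(x, a, w):        # dynamics: next state, clipped to X
    return max(0, min(3, x + (1 if a else -1) + w))

def h(x):              # observation: coarse, two-valued sensor
    return x >= 2

def c(x, a):           # per-stage cost
    return abs(x - 2) + 0.1 * a

def update(I, a, y):   # states consistent with taking action a and seeing y
    return frozenset(f(x, a, w) for x in I for w in W if h(f(x, a, w)) == y)

@lru_cache(maxsize=None)
def V(t, I):
    """Worst-case cost-to-go from information state I (a set of states)."""
    if t == T:
        return max(abs(x - 2) for x in I)   # terminal cost
    best = float("inf")
    for a in A:
        worst = max(c(x, a) + V(t + 1, update(I, a, h(f(x, a, w))))
                    for x in I for w in W)
        best = min(best, worst)
    return best

print(V(0, frozenset(X)))   # worst-case value with no prior state knowledge
```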
Related papers
- Information-Theoretic State Variable Selection for Reinforcement Learning [4.2050490361120465]
We introduce the Transfer Entropy Redundancy Criterion (TERC), an information-theoretic criterion.
TERC determines whether there is entropy transferred from state variables to actions during training.
We define an algorithm based on TERC that provably excludes variables from the state that have no effect on the final performance of the agent.
arXiv Detail & Related papers (2024-01-21T14:51:09Z)
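For the TERC entry above, a hedged sketch of the underlying idea: estimate whether information flows from a state variable to the chosen actions, and flag variables whose estimated contribution is near zero. TERC itself uses transfer entropy; the plug-in mutual information below is a deliberate simplification, and all data are synthetic assumptions.

```python
# Minimal, hypothetical illustration of the idea behind TERC: estimate
# whether information flows from a state variable to the actions, and
# drop variables whose estimated contribution is (near) zero. TERC uses
# transfer entropy; plug-in mutual information is a simplification.
import numpy as np

def mutual_information(s, a):
    """Plug-in estimate of I(S; A) for discrete samples s, a."""
    mi = 0.0
    for sv in np.unique(s):
        for av in np.unique(a):
            p_sa = np.mean((s == sv) & (a == av))
            if p_sa > 0:
                p_s, p_a = np.mean(s == sv), np.mean(a == av)
                mi += p_sa * np.log(p_sa / (p_s * p_a))
    return mi

rng = np.random.default_rng(0)
relevant = rng.integers(0, 2, 1000)            # variable the policy uses
irrelevant = rng.integers(0, 2, 1000)          # variable the policy ignores
actions = relevant ^ (rng.random(1000) < 0.1)  # noisy function of `relevant`

for name, var in [("relevant", relevant), ("irrelevant", irrelevant)]:
    print(name, round(mutual_information(var, actions), 4))
# The irrelevant variable's estimate is near zero, suggesting exclusion.
```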
- Worst-Case Control and Learning Using Partial Observations Over an Infinite Time-Horizon [2.456909016197174]
Safety-critical cyber-physical systems require robust control strategies against adversarial disturbances and modeling uncertainties.
We present a framework for approximate control and learning in partially observed systems to minimize the worst-case discounted cost over an infinite time horizon.
arXiv Detail & Related papers (2023-03-28T21:40:06Z)
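For the worst-case entry above, a sketch of minimax value iteration for a worst-case discounted cost on a toy fully observed system; the paper treats the harder partially observed setting, and every quantity below is an assumption.

```python
# Sketch of minimax value iteration for a worst-case discounted cost on
# a toy *fully observed* system; the paper handles the harder partially
# observed setting. All quantities here are assumptions.
import numpy as np

n_states, n_actions, n_disturbances = 5, 2, 3
gamma = 0.9
rng = np.random.default_rng(1)
# next_state[x, a, w] and cost[x, a]: a random toy model
next_state = rng.integers(0, n_states, (n_states, n_actions, n_disturbances))
cost = rng.random((n_states, n_actions))

V = np.zeros(n_states)
for _ in range(500):                        # iterate to the fixed point
    Q = cost[:, :, None] + gamma * V[next_state]   # shape (x, a, w)
    V_new = Q.max(axis=2).min(axis=1)       # adversary maximizes, controller minimizes
    if np.max(np.abs(V_new - V)) < 1e-8:
        break
    V = V_new

policy = (cost[:, :, None] + gamma * V[next_state]).max(axis=2).argmin(axis=1)
print("worst-case values:", np.round(V, 3), "policy:", policy)
```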
- Explainable Data-Driven Optimization: From Context to Decision and Back Again [76.84947521482631]
Data-driven optimization uses contextual information and machine learning algorithms to find solutions to decision problems with uncertain parameters.
We introduce a counterfactual explanation methodology tailored to explain solutions to data-driven problems.
We demonstrate our approach by explaining key problems in operations management such as inventory management and routing.
arXiv Detail & Related papers (2023-01-24T15:25:16Z)
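For the explainability entry above, a generic nearest-counterfactual search: find the smallest change to the context that flips a prescribed decision. The threshold decision rule and grid search below are illustrative assumptions, not the paper's methodology.

```python
# Generic counterfactual-explanation sketch: find the smallest change to
# the context that flips the prescribed decision. The threshold decision
# rule and grid search below are illustrative assumptions only.
import numpy as np

def decide(context):
    """Toy data-driven rule: order extra stock when predicted demand is high."""
    predicted_demand = 2.0 * context[0] + 0.5 * context[1]   # stand-in ML model
    return "order" if predicted_demand > 10.0 else "hold"

def counterfactual(context, step=0.05, max_radius=5.0):
    """Smallest L2 perturbation (on a grid) that changes the decision."""
    base = decide(context)
    for r in np.arange(step, max_radius, step):
        for theta in np.linspace(0, 2 * np.pi, 64, endpoint=False):
            candidate = context + r * np.array([np.cos(theta), np.sin(theta)])
            if decide(candidate) != base:
                return candidate, base, decide(candidate)
    return None  # no counterfactual found within max_radius

ctx = np.array([4.0, 2.0])                  # e.g., [recent demand, seasonality]
cf, old, new = counterfactual(ctx)
print(f"decision flips from {old!r} to {new!r} at context {np.round(cf, 2)}")
```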
- On Leave-One-Out Conditional Mutual Information For Generalization [122.2734338600665]
We derive information-theoretic generalization bounds for supervised learning algorithms based on a new measure of leave-one-out conditional mutual information (loo-CMI).
Contrary to other CMI bounds, our loo-CMI bounds can be computed easily and can be interpreted in connection to other notions such as classical leave-one-out cross-validation.
We empirically validate the quality of the bound by evaluating its predicted generalization gap in scenarios for deep learning.
arXiv Detail & Related papers (2022-07-01T17:58:29Z)
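The loo-CMI entry above interprets the bound in connection with classical leave-one-out cross-validation, so the sketch below computes that familiar quantity for a 1-nearest-neighbor classifier. It is not the loo-CMI bound itself; data and model are assumptions.

```python
# Plain leave-one-out cross-validation error for a 1-nearest-neighbor
# classifier: the familiar quantity loo-CMI bounds are interpreted
# against, not the bound itself. Data and model are assumptions.
import numpy as np

rng = np.random.default_rng(2)
X = rng.normal(size=(40, 2))
y = (X[:, 0] + 0.3 * rng.normal(size=40) > 0).astype(int)

def loo_error(X, y):
    """Leave-one-out error of a 1-nearest-neighbor classifier."""
    errors = 0
    for i in range(len(X)):
        dists = np.linalg.norm(X - X[i], axis=1)
        dists[i] = np.inf                 # leave sample i out
        errors += y[dists.argmin()] != y[i]
    return errors / len(X)

print("LOO-CV error:", loo_error(X, y))
```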
- A Priori Denoising Strategies for Sparse Identification of Nonlinear Dynamical Systems: A Comparative Study [68.8204255655161]
We investigate and compare the performance of several local and global smoothing techniques to a priori denoise the state measurements.
We show that, in general, global methods, which use the entire measurement data set, outperform local methods, which employ a neighboring data subset around a local point.
arXiv Detail & Related papers (2022-01-29T23:31:25Z)
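For the denoising entry above, a sketch of the local-versus-global comparison: Savitzky-Golay filtering (local, windowed) against a smoothing spline (global, fit to the whole record) on a noisy state measurement. The signal and noise level are assumptions for illustration.

```python
# Compare a local smoother (Savitzky-Golay, windowed) with a global one
# (smoothing spline, whole record) on a noisy state measurement.
import numpy as np
from scipy.signal import savgol_filter
from scipy.interpolate import UnivariateSpline

t = np.linspace(0, 4 * np.pi, 400)
clean = np.sin(t)
noisy = clean + 0.2 * np.random.default_rng(3).normal(size=t.size)

local = savgol_filter(noisy, window_length=31, polyorder=3)   # local method
spline = UnivariateSpline(t, noisy, s=t.size * 0.2**2)        # global method
global_fit = spline(t)

for name, est in [("local (Savitzky-Golay)", local), ("global (spline)", global_fit)]:
    print(name, "RMSE:", round(float(np.sqrt(np.mean((est - clean) ** 2))), 4))
```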
- Formal Verification of Unknown Dynamical Systems via Gaussian Process Regression [11.729744197698718]
Leveraging autonomous systems in safety-critical scenarios requires verifying their behaviors in the presence of uncertainties.
We develop a framework for verifying discrete-time dynamical systems with unmodelled dynamics and noisy measurements.
arXiv Detail & Related papers (2021-12-31T05:10:05Z)
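For the verification entry above, a sketch of the core idea: fit a Gaussian process to one-step transition data, then check a safety condition against the posterior mean plus a confidence margin. The toy dynamics, kernel choice, and 2-sigma margin are assumptions, not the paper's exact procedure.

```python
# Fit a GP to one-step transition data, then check a safety condition
# against the posterior mean plus a 2-sigma margin (an assumed proxy
# for the paper's verification procedure).
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

rng = np.random.default_rng(4)
x = rng.uniform(-2, 2, (30, 1))                   # sampled states
x_next = 0.8 * x + 0.1 * np.sin(3 * x) + 0.02 * rng.normal(size=x.shape)

gp = GaussianProcessRegressor(RBF() + WhiteKernel(), normalize_y=True)
gp.fit(x, x_next.ravel())

grid = np.linspace(-2, 2, 50)[:, None]
mean, std = gp.predict(grid, return_std=True)
# Verify the next state stays inside the safe set [-2, 2] with margin.
safe = np.all(np.abs(mean) + 2 * std <= 2.0)
print("one-step safety certificate holds:", bool(safe))
```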
- The Impact of Data on the Stability of Learning-Based Control - Extended Version [63.97366815968177]
We propose a Lyapunov-based measure for quantifying the impact of data on the certifiable control performance.
By modeling unknown system dynamics through Gaussian processes, we can determine the interrelation between model uncertainty and satisfaction of stability conditions.
arXiv Detail & Related papers (2020-11-20T19:10:01Z)
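For the stability entry above, a sketch of the Lyapunov-flavored check: with a GP model of unknown dynamics, ask whether V(x) = x^2 decreases once posterior uncertainty is accounted for. Using the 2-sigma band as the uncertainty proxy is an assumption, not the paper's measure.

```python
# Check a Lyapunov decrease condition V(f(x)) - V(x) < 0 robustified by
# a GP posterior's 2-sigma band; the band-as-data-impact proxy is an
# assumption, not the paper's measure.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

rng = np.random.default_rng(5)
x = rng.uniform(-1, 1, (20, 1))
x_next = 0.7 * x + 0.01 * rng.normal(size=x.shape)   # stable unknown system

gp = GaussianProcessRegressor(RBF() + WhiteKernel(), normalize_y=True)
gp.fit(x, x_next.ravel())

grid = np.linspace(-1, 1, 41)[:, None]
mean, std = gp.predict(grid, return_std=True)
worst_next_V = (np.abs(mean) + 2 * std) ** 2         # worst-case V(f(x))
mask = np.abs(grid.ravel()) > 0.3                    # check away from origin
certified = np.all(worst_next_V[mask] < grid.ravel()[mask] ** 2)
print("stability condition certified away from origin:", bool(certified))
```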
- Stein Variational Model Predictive Control [130.60527864489168]
Decision making under uncertainty is critical to real-world, autonomous systems.
Model Predictive Control (MPC) methods have demonstrated favorable performance in practice, but remain limited when dealing with complex distributions.
We show that this framework leads to successful planning in challenging, non-convex optimal control problems.
arXiv Detail & Related papers (2020-11-15T22:36:59Z)
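For the Stein Variational MPC entry above, a minimal Stein variational gradient descent (SVGD) update, the inference engine the method builds on, applied to particles over control sequences. The quadratic stand-in posterior and RBF bandwidth are assumptions.

```python
# Minimal SVGD update: particles over control sequences are pushed toward
# a posterior over low-cost plans. The quadratic "cost posterior" and
# fixed RBF bandwidth are illustrative assumptions.
import numpy as np

def grad_log_p(u):
    """Gradient of log-posterior over controls; here a toy quadratic cost."""
    return -(u - 1.0)          # log p(u) is proportional to -0.5*(u - 1)^2

def svgd_step(particles, step=0.1, bandwidth=0.5):
    diffs = particles[:, None, :] - particles[None, :, :]          # (n, n, d)
    sq = (diffs ** 2).sum(-1)
    k = np.exp(-sq / (2 * bandwidth ** 2))                         # RBF kernel
    grad_k = -diffs / bandwidth ** 2 * k[:, :, None]               # kernel gradient
    phi = (k[:, :, None] * grad_log_p(particles)[:, None, :] + grad_k).mean(0)
    return particles + step * phi

rng = np.random.default_rng(6)
particles = rng.normal(0.0, 2.0, (50, 3))   # 50 candidate 3-step control plans
for _ in range(200):
    particles = svgd_step(particles)
print("particle mean per control step:", np.round(particles.mean(0), 2))
```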
- Approximate information state for approximate planning and reinforcement learning in partially observed systems [0.7646713951724009]
We show that if a function of the history (called approximate information state (AIS)) approximately satisfies the properties of the information state, then there is a corresponding approximate dynamic program.
We show that several approximations in state, observation and action spaces in the literature can be viewed as instances of AIS.
A salient feature of AIS is that it can be learnt from data.
arXiv Detail & Related papers (2020-10-17T18:30:30Z)
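For the AIS entry above, a sketch of learning an approximate information state from output data: a recurrent encoder compresses the observation-action history and is trained so the AIS predicts the stage reward and its own next value. Architecture sizes and the placeholder data are assumptions.

```python
# Learn an AIS from output data: a GRU encoder compresses the history;
# the AIS is trained to predict the stage reward and its own next value.
# Sizes and the random placeholder "environment" data are assumptions.
import torch
import torch.nn as nn

obs_dim, act_dim, ais_dim = 3, 2, 8

class AISModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.encoder = nn.GRU(obs_dim + act_dim, ais_dim, batch_first=True)
        self.reward_head = nn.Linear(ais_dim + act_dim, 1)       # predict r_t
        self.next_head = nn.Linear(ais_dim + act_dim, ais_dim)   # predict z_{t+1}

    def forward(self, obs, act):
        z, _ = self.encoder(torch.cat([obs, act], -1))           # AIS sequence z_t
        za = torch.cat([z, act], -1)
        return z, self.reward_head(za).squeeze(-1), self.next_head(za)

model = AISModel()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
obs = torch.randn(16, 20, obs_dim)      # batch of histories (placeholder data)
act = torch.randn(16, 20, act_dim)
rew = torch.randn(16, 20)

for _ in range(100):
    z, r_hat, z_next_hat = model(obs, act)
    loss = ((r_hat - rew) ** 2).mean() \
         + ((z_next_hat[:, :-1] - z[:, 1:].detach()) ** 2).mean()
    opt.zero_grad(); loss.backward(); opt.step()
print("final AIS training loss:", round(loss.item(), 4))
```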
- Learning Robust Decision Policies from Observational Data [21.05564340986074]
It is of interest to learn robust policies that reduce the risk of outcomes with high costs.
We develop a method for learning policies that reduce tails of the cost distribution at a specified level.
arXiv Detail & Related papers (2020-06-03T16:02:57Z)
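For the robust-policies entry above: reducing the tail of a cost distribution at a specified level is naturally expressed with conditional value-at-risk (CVaR), so the sketch selects among candidate policies by empirical CVaR. The two toy cost distributions are assumptions.

```python
# Select a policy by empirical CVaR, i.e., the mean of the worst
# (1 - alpha) fraction of costs. The two toy cost distributions are
# assumptions for illustration.
import numpy as np

def cvar(costs, alpha=0.9):
    """Empirical CVaR: mean of the worst (1 - alpha) fraction of costs."""
    tail = np.sort(costs)[int(np.ceil(alpha * len(costs))):]
    return tail.mean()

rng = np.random.default_rng(7)
policy_a = rng.normal(1.0, 0.2, 10_000)                  # steady, low variance
policy_b = np.where(rng.random(10_000) < 0.05,
                    rng.normal(8.0, 1.0, 10_000),        # rare but heavy tail
                    rng.normal(0.6, 0.2, 10_000))

for name, costs in [("A", policy_a), ("B", policy_b)]:
    print(f"policy {name}: mean={costs.mean():.2f}  CVaR_0.9={cvar(costs):.2f}")
# Policy B has the lower mean cost but the heavier tail; a CVaR criterion
# at level 0.9 prefers policy A.
```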
- Confounding-Robust Policy Evaluation in Infinite-Horizon Reinforcement Learning [70.01650994156797]
Off-policy evaluation of sequential decision policies from observational data is necessary in applications of batch reinforcement learning such as education and healthcare.
We develop an approach that estimates bounds on the value of a given policy.
We prove convergence to the sharp bounds as we collect more confounded data.
arXiv Detail & Related papers (2020-02-11T16:18:14Z)
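For the confounding-robust entry above, a sketch in the style of sensitivity analysis: importance weights are only known up to a factor Gamma, and the policy value is bounded by optimizing the weighted mean over that uncertainty set. The data, nominal weights, and Gamma values are assumptions.

```python
# Bound a weighted-mean policy value when each importance weight may vary
# in [w/Gamma, w*Gamma]. The optimum of this linear-fractional program has
# a threshold form, found by scanning split points over sorted outcomes.
import numpy as np

def weighted_mean_bounds(y, w, gamma):
    """Bounds on sum(w_i y_i)/sum(w_i) with each w_i in [w_i/gamma, w_i*gamma]."""
    lo_w, hi_w = w / gamma, w * gamma
    order = np.argsort(y)                       # ascending outcomes
    y, lo_w, hi_w = y[order], lo_w[order], hi_w[order]
    best_hi, best_lo = -np.inf, np.inf
    for k in range(len(y) + 1):                 # threshold split point
        w_hi = np.concatenate([lo_w[:k], hi_w[k:]])   # inflate high outcomes
        w_lo = np.concatenate([hi_w[:k], lo_w[k:]])   # inflate low outcomes
        best_hi = max(best_hi, np.dot(w_hi, y) / w_hi.sum())
        best_lo = min(best_lo, np.dot(w_lo, y) / w_lo.sum())
    return best_lo, best_hi

rng = np.random.default_rng(8)
outcomes = rng.normal(1.0, 1.0, 500)            # observed returns
weights = rng.uniform(0.5, 2.0, 500)            # nominal importance weights
for gamma in (1.0, 1.5, 2.0):
    lo, hi = weighted_mean_bounds(outcomes, weights, gamma)
    print(f"Gamma={gamma}: value in [{lo:.3f}, {hi:.3f}]")
# The interval collapses to a point at Gamma=1 (no confounding) and
# widens as more unobserved confounding is allowed.
```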