On Convex Data-Driven Inverse Optimal Control for Nonlinear, Non-stationary and Stochastic Systems
- URL: http://arxiv.org/abs/2306.13928v2
- Date: Wed, 26 Jun 2024 17:59:37 GMT
- Title: On Convex Data-Driven Inverse Optimal Control for Nonlinear, Non-stationary and Stochastic Systems
- Authors: Emiland Garrabe, Hozefa Jesawada, Carmen Del Vecchio, Giovanni Russo,
- Abstract summary: This paper is concerned with a finite-horizon inverse control problem, which has the goal of reconstructing, from observations, the possibly non-nonstationary cost driving the actions of an agent.
We present a result enabling cost optimization by solving an problem that is convex non-stationary agent cost.
All experiments confirm the effectiveness of our approach.
- Score: 0.7240153598817866
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: This paper is concerned with a finite-horizon inverse control problem, which has the goal of reconstructing, from observations, the possibly non-convex and non-stationary cost driving the actions of an agent. In this context, we present a result enabling cost reconstruction by solving an optimization problem that is convex even when the agent cost is not and when the underlying dynamics is nonlinear, non-stationary and stochastic. To obtain this result, we also study a finite-horizon forward control problem that has randomized policies as decision variables. We turn our findings into algorithmic procedures and show the effectiveness of our approach via in-silico and hardware validations. All experiments confirm the effectiveness of our approach.
Related papers
- Data-Driven Optimal Feedback Laws via Kernel Mean Embeddings [3.007066256364399]
We introduce kernel mean embeddings (KMEs) to identify the Markov transition operators associated with controlled diffusion processes.
Unlike traditional dynamic programming methods, our approach exploits the Kernel trick'' to break the curse of dimensionality.
We demonstrate the effectiveness of our method through numerical examples, highlighting its ability to solve a large class of nonlinear optimal control problems.
arXiv Detail & Related papers (2024-07-23T11:53:03Z) - OTClean: Data Cleaning for Conditional Independence Violations using
Optimal Transport [51.6416022358349]
sys is a framework that harnesses optimal transport theory for data repair under Conditional Independence (CI) constraints.
We develop an iterative algorithm inspired by Sinkhorn's matrix scaling algorithm, which efficiently addresses high-dimensional and large-scale data.
arXiv Detail & Related papers (2024-03-04T18:23:55Z) - Robust Fitted-Q-Evaluation and Iteration under Sequentially Exogenous
Unobserved Confounders [16.193776814471768]
We study robust policy evaluation and policy optimization in the presence of sequentially-exogenous unobserved confounders.
We provide sample complexity bounds, insights, and show effectiveness both in simulations and on real-world longitudinal healthcare data of treating sepsis.
arXiv Detail & Related papers (2023-02-01T18:40:53Z) - Learning to Optimize with Stochastic Dominance Constraints [103.26714928625582]
In this paper, we develop a simple yet efficient approach for the problem of comparing uncertain quantities.
We recast inner optimization in the Lagrangian as a learning problem for surrogate approximation, which bypasses apparent intractability.
The proposed light-SD demonstrates superior performance on several representative problems ranging from finance to supply chain management.
arXiv Detail & Related papers (2022-11-14T21:54:31Z) - Log Barriers for Safe Black-box Optimization with Application to Safe
Reinforcement Learning [72.97229770329214]
We introduce a general approach for seeking high dimensional non-linear optimization problems in which maintaining safety during learning is crucial.
Our approach called LBSGD is based on applying a logarithmic barrier approximation with a carefully chosen step size.
We demonstrate the effectiveness of our approach on minimizing violation in policy tasks in safe reinforcement learning.
arXiv Detail & Related papers (2022-07-21T11:14:47Z) - Rate-Optimal Online Convex Optimization in Adaptive Linear Control [0.0]
We consider the problem of controlling an unknown convex linear system under adversarially changing costs.
We present the first computationally-gret that attains an optimal linear hindsight function.
arXiv Detail & Related papers (2022-06-03T07:32:11Z) - Efficient Online Linear Control with Stochastic Convex Costs and Unknown
Dynamics [0.0]
We present a computationally efficient algorithm that attains an optimal $sqrtT$ regret-rate against the best stabilizing linear controller.
In contrast to previous work, our algorithm is based on the Optimism in the Face of Uncertainty paradigm.
arXiv Detail & Related papers (2022-03-02T15:19:20Z) - Distributionally Robust Optimization with Markovian Data [8.126833795693699]
We study a program where the probability distribution of the uncertain problem parameters is unknown.
We propose a data-driven distributionally to estimate the problem's objective function and optimal solution.
arXiv Detail & Related papers (2021-06-12T10:59:02Z) - Stein Variational Model Predictive Control [130.60527864489168]
Decision making under uncertainty is critical to real-world, autonomous systems.
Model Predictive Control (MPC) methods have demonstrated favorable performance in practice, but remain limited when dealing with complex distributions.
We show that this framework leads to successful planning in challenging, non optimal control problems.
arXiv Detail & Related papers (2020-11-15T22:36:59Z) - High-Dimensional Robust Mean Estimation via Gradient Descent [73.61354272612752]
We show that the problem of robust mean estimation in the presence of a constant adversarial fraction can be solved by gradient descent.
Our work establishes an intriguing connection between the near non-lemma estimation and robust statistics.
arXiv Detail & Related papers (2020-05-04T10:48:04Z) - Adaptive Control and Regret Minimization in Linear Quadratic Gaussian
(LQG) Setting [91.43582419264763]
We propose LqgOpt, a novel reinforcement learning algorithm based on the principle of optimism in the face of uncertainty.
LqgOpt efficiently explores the system dynamics, estimates the model parameters up to their confidence interval, and deploys the controller of the most optimistic model.
arXiv Detail & Related papers (2020-03-12T19:56:38Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.