Information Theoretic Regret Bounds for Online Nonlinear Control
- URL: http://arxiv.org/abs/2006.12466v1
- Date: Mon, 22 Jun 2020 17:46:48 GMT
- Title: Information Theoretic Regret Bounds for Online Nonlinear Control
- Authors: Sham Kakade, Akshay Krishnamurthy, Kendall Lowrey, Motoya Ohnishi, Wen Sun
- Abstract summary: We study the problem of sequential control in an unknown, nonlinear dynamical system.
This framework yields a general setting that permits discrete and continuous control inputs as well as non-smooth, non-differentiable dynamics.
We empirically show its application to a number of nonlinear control tasks and demonstrate the benefit of exploration for learning model dynamics.
- Score: 35.534829914047336
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This work studies the problem of sequential control in an unknown, nonlinear
dynamical system, where we model the underlying system dynamics as an unknown
function in a known Reproducing Kernel Hilbert Space. This framework yields a
general setting that permits discrete and continuous control inputs as well as
non-smooth, non-differentiable dynamics. Our main result, the Lower
Confidence-based Continuous Control ($LC^3$) algorithm, enjoys a near-optimal
$O(\sqrt{T})$ regret bound against the optimal controller in episodic settings,
where $T$ is the number of episodes. The bound has no explicit dependence on the
dimension of the system dynamics, which could be infinite, but instead only
depends on information theoretic quantities. We empirically show its
application to a number of nonlinear control tasks and demonstrate the benefit
of exploration for learning model dynamics.
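The abstract describes a confidence-based episodic recipe: model the unknown dynamics with a kernel (RKHS) regressor, and plan each episode optimistically so that exploration is directed at parts of the state-action space where the model is still uncertain. The sketch below illustrates that general recipe in Python; it is a minimal conceptual sketch, not the paper's $LC^3$ algorithm, and the RBF kernel, the variance-based bonus, the random-shooting planner, and the `env.reset()` / `env.step()` / `cost()` interface are all illustrative assumptions.

    # Minimal sketch of a confidence-based episodic control loop with a kernel
    # (RKHS) dynamics model. Illustration only -- NOT the paper's exact LC^3
    # procedure; kernel, bonus form, planner, and env API are assumptions.
    import numpy as np

    def rbf_kernel(A, B, bandwidth=1.0):
        # Squared-exponential kernel between the rows of A and the rows of B.
        d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
        return np.exp(-d2 / (2.0 * bandwidth ** 2))

    class KernelDynamicsModel:
        # Kernel ridge regression for s' = f(s, a), with a GP-style variance
        # used as an exploration bonus.
        def __init__(self, reg=1e-3):
            self.reg, self.Z, self.alpha, self.K_inv = reg, None, None, None

        def fit(self, Z, S_next):
            # Z: (n, ds+da) stacked (state, action) pairs; S_next: (n, ds) next states.
            self.Z = Z
            K = rbf_kernel(Z, Z) + self.reg * np.eye(len(Z))
            self.K_inv = np.linalg.inv(K)
            self.alpha = self.K_inv @ S_next

        def predict(self, z):
            # Returns the predicted next state and a scalar uncertainty estimate.
            k = rbf_kernel(z[None, :], self.Z)                      # shape (1, n)
            mean = (k @ self.alpha)[0]
            var = (rbf_kernel(z[None, :], z[None, :]) - k @ self.K_inv @ k.T)[0, 0]
            return mean, max(var, 0.0)

    def plan(model, s0, cost, horizon, action_dim, bonus_weight=1.0, n_samples=256):
        # Random-shooting planner: choose the action sequence minimizing an
        # optimistic (lower-confidence) cost, i.e. predicted cost minus a bonus
        # that is large where the model is uncertain.
        best_seq, best_val = None, np.inf
        for _ in range(n_samples):
            seq = np.random.uniform(-1.0, 1.0, size=(horizon, action_dim))
            s, total = s0, 0.0
            for a in seq:
                s, var = model.predict(np.concatenate([s, a]))
                total += cost(s, a) - bonus_weight * np.sqrt(var)
            if total < best_val:
                best_seq, best_val = seq, total
        return best_seq

    def episodic_loop(env, cost, episodes, horizon, action_dim):
        # Assumed interfaces: env.reset() -> initial state, env.step(a) -> next
        # state, cost(s, a) -> float.
        model, data_Z, data_S = KernelDynamicsModel(), [], []
        for _ in range(episodes):
            s = env.reset()
            if data_Z:   # refit on all transitions observed so far, then plan
                model.fit(np.array(data_Z), np.array(data_S))
                actions = plan(model, s, cost, horizon, action_dim)
            else:        # first episode: no data yet, act randomly
                actions = np.random.uniform(-1.0, 1.0, size=(horizon, action_dim))
            for a in actions:
                s_next = env.step(a)
                data_Z.append(np.concatenate([s, a]))
                data_S.append(s_next)
                s = s_next

The refit-then-plan structure is what makes exploration pay off: episodes spent in uncertain regions shrink the model's variance there, which in turn sharpens later plans.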
Related papers
- Iterative Learning Control of Fast, Nonlinear, Oscillatory Dynamics (Preprint) [0.0]
The dynamics are nonlinear, chaotic, and often too fast for active control schemes.
We develop an alternative active controls system using an iterative, trajectory-optimization and parameter-tuning approach.
We demonstrate that the controller is robust to missing information and uncontrollable parameters as long as certain requirements are met.
arXiv Detail & Related papers (2024-05-30T13:27:17Z)
- Learning Control-Oriented Dynamical Structure from Data [25.316358215670274]
We discuss a state-dependent nonlinear tracking controller formulation for general nonlinear control-affine systems.
We empirically demonstrate the efficacy of learned versions of this controller in stable trajectory tracking.
arXiv Detail & Related papers (2023-02-06T02:01:38Z)
- LQGNet: Hybrid Model-Based and Data-Driven Linear Quadratic Stochastic Control [24.413595920205907]
Linear quadratic control deals with finding an optimal control signal for a dynamical system in a setting with uncertainty.
LQGNet is a controller that leverages data to operate under partially known dynamics.
We show that LQGNet outperforms classic control by overcoming mismatched SS models.
arXiv Detail & Related papers (2022-10-23T17:59:51Z)
- Finite-time System Identification and Adaptive Control in Autoregressive Exogenous Systems [79.67879934935661]
We study the problem of system identification and adaptive control of unknown ARX systems.
We provide finite-time learning guarantees for the ARX systems under both open-loop and closed-loop data collection.
arXiv Detail & Related papers (2021-08-26T18:00:00Z)
- Robust Online Control with Model Misspecification [96.23493624553998]
We study online control of an unknown nonlinear dynamical system with model misspecification.
Our study focuses on robustness, which measures how much deviation from the assumed linear approximation can be tolerated.
arXiv Detail & Related papers (2021-07-16T07:04:35Z)
- Learning the Linear Quadratic Regulator from Nonlinear Observations [135.66883119468707]
We introduce a new problem setting for continuous control called the LQR with Rich Observations, or RichLQR.
In our setting, the environment is summarized by a low-dimensional continuous latent state with linear dynamics and quadratic costs.
Our results constitute the first provable sample complexity guarantee for continuous control with an unknown nonlinearity in the system model and general function approximation.
arXiv Detail & Related papers (2020-10-08T07:02:47Z)
- Adaptive Control and Regret Minimization in Linear Quadratic Gaussian (LQG) Setting [91.43582419264763]
We propose LqgOpt, a novel reinforcement learning algorithm based on the principle of optimism in the face of uncertainty.
LqgOpt efficiently explores the system dynamics, estimates the model parameters up to their confidence interval, and deploys the controller of the most optimistic model; a generic sketch of this optimism principle is given after this list.
arXiv Detail & Related papers (2020-03-12T19:56:38Z)
- Improper Learning for Non-Stochastic Control [78.65807250350755]
We consider the problem of controlling a possibly unknown linear dynamical system with adversarial perturbations, adversarially chosen convex loss functions, and partially observed states.
Applying online descent to this parametrization yields a new controller which attains sublinear regret vs. a large class of closed-loop policies.
Our bounds are the first in the non-stochastic control setting that compete with all stabilizing linear dynamical controllers.
arXiv Detail & Related papers (2020-01-25T02:12:48Z)
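Both the main paper's $LC^3$ and the LqgOpt entry above follow the optimism-in-the-face-of-uncertainty template: maintain a confidence set of dynamics models consistent with the data collected so far, and act according to the member of that set that promises the lowest cost. A generic, hedged formulation of that selection rule (the notation is illustrative and not taken verbatim from any of the listed papers) is
\[
  (\tilde{\theta}_t, \tilde{\pi}_t)
  \;=\;
  \operatorname*{arg\,min}_{\theta \in \mathcal{C}_t,\ \pi \in \Pi} J(\pi; \theta),
\]
where $\mathcal{C}_t$ is a confidence set that contains the true dynamics with high probability, $\Pi$ is the policy class, and $J(\pi; \theta)$ is the expected episodic cost of policy $\pi$ under model $\theta$. The agent then executes $\tilde{\pi}_t$ for the next episode, and regret is measured against the optimal cost $\min_{\pi \in \Pi} J(\pi; \theta^{*})$ under the true model $\theta^{*}$.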
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.