The Power of Learned Locally Linear Models for Nonlinear Policy
Optimization
- URL: http://arxiv.org/abs/2305.09619v1
- Date: Tue, 16 May 2023 17:13:00 GMT
- Title: The Power of Learned Locally Linear Models for Nonlinear Policy
Optimization
- Authors: Daniel Pfrommer, Max Simchowitz, Tyler Westenbroek, Nikolai Matni,
Stephen Tu
- Abstract summary: This paper conducts a rigorous analysis of a simplified variant of this strategy for general nonlinear systems.
We analyze an algorithm which iterates between estimating local linear models of nonlinear system dynamics and performing $\mathtt{iLQR}$-like policy updates.
- Score: 26.45568696453259
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: A common pipeline in learning-based control is to iteratively estimate a
model of system dynamics, and apply a trajectory optimization algorithm -
e.g.~$\mathtt{iLQR}$ - on the learned model to minimize a target cost. This
paper conducts a rigorous analysis of a simplified variant of this strategy for
general nonlinear systems. We analyze an algorithm which iterates between
estimating local linear models of nonlinear system dynamics and performing
$\mathtt{iLQR}$-like policy updates. We demonstrate that this algorithm attains
sample complexity polynomial in relevant problem parameters, and, by
synthesizing locally stabilizing gains, overcomes exponential dependence in
problem horizon. Experimental results validate the performance of our
algorithm and compare it to natural deep-learning baselines.
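The following is a minimal, self-contained Python sketch (not the authors' released code) of the pipeline the abstract describes: roll out the current policy with exploration noise, fit time-varying local linear models of the unknown dynamics by least squares, run an iLQR-like backward pass on those learned models, and update the policy with the resulting feedforward terms and locally stabilizing gains. The pendulum dynamics, cost weights, damping step, and helper names (`rollout`, `fit_local_linear`, `ilqr_update`) are illustrative assumptions.
```python
import numpy as np

dt, T, n, m = 0.05, 50, 2, 1                      # step size, horizon, state/input dims
Q, R, Qf = np.diag([1.0, 0.1]), 0.01 * np.eye(m), np.diag([100.0, 10.0])
x_goal = np.array([np.pi, 0.0])                   # swing a pendulum to the upright position

def f(x, u):
    """True nonlinear dynamics; the algorithm only queries them through rollouts."""
    th, om = x
    return x + dt * np.array([om, -9.81 * np.sin(th) + u[0]])

def rollout(x0, u_nom, x_ref, K, noise=0.0):
    """Run the time-varying affine policy u_t = u_nom_t + K_t (x_t - x_ref_t) (+ noise)."""
    xs, us = [x0], []
    for t in range(T):
        u = u_nom[t] + K[t] @ (xs[-1] - x_ref[t]) + noise * np.random.randn(m)
        us.append(u)
        xs.append(f(xs[-1], u))
    return np.array(xs), np.array(us)

def fit_local_linear(batch, x_nom, u_nom):
    """Least-squares fit of the deviation dynamics dx_{t+1} ~ A_t dx_t + B_t du_t."""
    A, B = np.zeros((T, n, n)), np.zeros((T, n, m))
    for t in range(T):
        Z = np.hstack([np.vstack([xs[t] - x_nom[t] for xs, _ in batch]),
                       np.vstack([us[t] - u_nom[t] for _, us in batch])])
        Y = np.vstack([xs[t + 1] - x_nom[t + 1] for xs, _ in batch])
        W, *_ = np.linalg.lstsq(Z, Y, rcond=None)
        A[t], B[t] = W[:n].T, W[n:].T
    return A, B

def ilqr_update(x_nom, u_nom, A, B):
    """iLQR-like backward pass on the learned models: feedforward k_t and gains K_t."""
    K, k = np.zeros((T, m, n)), np.zeros((T, m))
    Vx, Vxx = Qf @ (x_nom[T] - x_goal), Qf
    for t in reversed(range(T)):
        Qx = Q @ (x_nom[t] - x_goal) + A[t].T @ Vx
        Qu = R @ u_nom[t] + B[t].T @ Vx
        Qxx = Q + A[t].T @ Vxx @ A[t]
        Quu = R + B[t].T @ Vxx @ B[t]
        Qux = B[t].T @ Vxx @ A[t]
        K[t] = -np.linalg.solve(Quu, Qux)
        k[t] = -np.linalg.solve(Quu, Qu)
        Vx = Qx + K[t].T @ Quu @ k[t] + K[t].T @ Qu + Qux.T @ k[t]
        Vxx = Qxx + K[t].T @ Quu @ K[t] + K[t].T @ Qux + Qux.T @ K[t]
    return K, k

x0 = np.zeros(n)
u_nom, K = np.zeros((T, m)), np.zeros((T, m, n))
x_nom, _ = rollout(x0, u_nom, np.zeros((T, n)), K)
for it in range(20):                               # outer loop: estimate models, then update
    batch = [rollout(x0, u_nom, x_nom, K, noise=0.05) for _ in range(10)]
    A, B = fit_local_linear(batch, x_nom, u_nom)
    K, k = ilqr_update(x_nom, u_nom, A, B)
    u_nom = u_nom + 0.5 * k                        # damped feedforward step
    x_nom, u_nom = rollout(x0, u_nom, x_nom, K)    # new nominal trajectory (no noise)
```
The synthesized gains K_t are reused both to stabilize the exploratory rollouts and as part of the returned policy, which is the mechanism the abstract credits for avoiding exponential dependence on the horizon.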
Related papers
- Model-Agnostic Zeroth-Order Policy Optimization for Meta-Learning of Ergodic Linear Quadratic Regulators [13.343937277604892]
We study the problem of using meta-learning to deal with uncertainty and heterogeneity in ergodic linear quadratic regulators.
We propose an algorithm that omits the estimation of the policy Hessian, which applies to tasks of learning a set of heterogeneous but similar linear dynamical systems.
We provide a convergence result for the exact gradient descent process by analyzing the boundedness and smoothness of the gradient for the meta-objective.
arXiv Detail & Related papers (2024-05-27T17:26:36Z)
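The entry above centers on estimating policy gradients for LQR from cost evaluations alone, without ever forming a policy Hessian. A toy single-task sketch of that zeroth-order ingredient follows; the meta-learning across heterogeneous tasks and the ergodic, infinite-horizon cost are omitted, and the system matrices, horizon, smoothing radius, and step size are assumptions.
```python
import numpy as np

n, m, T, r = 3, 1, 50, 0.05                           # dims, rollout horizon, smoothing radius
rng = np.random.default_rng(0)
A = np.eye(n) + 0.02 * rng.standard_normal((n, n))    # one sampled task (assumption)
B = 0.1 * rng.standard_normal((n, m))
Q, R = np.eye(n), 0.1 * np.eye(m)

def cost(K, x0):
    """Finite-horizon LQR cost of the policy u = -K x, evaluated from a rollout only."""
    x, c = x0, 0.0
    for _ in range(T):
        u = -K @ x
        c += x @ Q @ x + u @ R @ u
        x = A @ x + B @ u
    return c

def zo_grad(K, x0, n_dirs=20):
    """Two-point zeroth-order gradient estimate; no policy Hessian is ever formed."""
    g = np.zeros_like(K)
    for _ in range(n_dirs):
        U = rng.standard_normal(K.shape)              # random perturbation direction
        g += (cost(K + r * U, x0) - cost(K - r * U, x0)) / (2.0 * r) * U
    return g / n_dirs

K = np.zeros((m, n))
for step in range(500):                               # plain, Hessian-free gradient descent
    x0 = rng.standard_normal(n)
    K -= 1e-4 * zo_grad(K, x0)
```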
- Neural ODEs as Feedback Policies for Nonlinear Optimal Control [1.8514606155611764]
We use neural ordinary differential equations (Neural ODEs) to model continuous-time dynamics as differential equations parametrized with neural networks.
We propose the use of a neural control policy posed as a Neural ODE to solve general nonlinear optimal control problems.
arXiv Detail & Related papers (2022-10-20T13:19:26Z)
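The entry above poses the feedback policy as part of a Neural ODE and optimizes it by differentiating through the integration. A minimal sketch of that idea, under assumptions (a double-integrator plant, explicit Euler in place of a proper ODE solver, an arbitrary quadratic cost, none of it taken from the paper):
```python
import torch
import torch.nn as nn

dt, steps = 0.05, 100
policy = nn.Sequential(nn.Linear(2, 32), nn.Tanh(), nn.Linear(32, 1))   # u = pi_phi(x)

def f(x, u):
    """Continuous-time dynamics of a double integrator: dx1/dt = x2, dx2/dt = u."""
    return torch.stack([x[1], u[0]])

def rollout_cost(x0):
    """Euler-discretized closed-loop rollout of dx/dt = f(x, pi_phi(x)) with a quadratic cost."""
    x, cost = x0, torch.zeros(())
    for _ in range(steps):
        u = policy(x)
        cost = cost + dt * (x @ x + 0.1 * (u @ u))
        x = x + dt * f(x, u)                       # one explicit Euler step of the ODE rollout
    return cost + 10.0 * (x @ x)                   # terminal penalty

opt = torch.optim.Adam(policy.parameters(), lr=1e-2)
for it in range(300):                              # train by backprop through the integrator
    opt.zero_grad()
    loss = rollout_cost(torch.tensor([1.0, 0.0]))
    loss.backward()
    opt.step()
```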
- A Priori Denoising Strategies for Sparse Identification of Nonlinear Dynamical Systems: A Comparative Study [68.8204255655161]
We investigate and compare the performance of several local and global smoothing techniques to a priori denoise the state measurements.
We show that, in general, global methods, which use the entire measurement data set, outperform local methods, which employ a neighboring data subset around a local point.
arXiv Detail & Related papers (2022-01-29T23:31:25Z)
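To make the local-versus-global distinction in the entry above concrete, here is a small illustration: a local moving-average smoother (each estimate uses only a short window of neighboring samples) against a global low-pass smoother (every estimate depends on the full record), applied to a noisy synthetic signal before any model identification. The signal, noise level, window length, and frequency cutoff are arbitrary stand-ins, not the methods compared in the paper.
```python
import numpy as np

t = np.linspace(0, 10, 1000)
clean = np.sin(t) * np.exp(-0.1 * t)                   # stand-in "state measurement"
noisy = clean + 0.05 * np.random.randn(t.size)

# Local method: each smoothed value uses only a short neighborhood of samples.
window = 21
local = np.convolve(noisy, np.ones(window) / window, mode="same")

# Global method: the entire record determines every smoothed value
# (here, keep only the lowest-frequency Fourier modes).
coeffs = np.fft.rfft(noisy)
coeffs[30:] = 0.0
global_ = np.fft.irfft(coeffs, n=t.size)

for name, est in [("local moving average", local), ("global low-pass", global_)]:
    print(f"{name:>22s}: RMSE = {np.sqrt(np.mean((est - clean) ** 2)):.4f}")
```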
- Fractal Structure and Generalization Properties of Stochastic Optimization Algorithms [71.62575565990502]
We prove that the generalization error of an optimization algorithm can be bounded in terms of the 'complexity' of the fractal structure that underlies its generalization measure.
We further specialize our results to specific problems (e.g., linear/logistic regression, one-hidden-layer neural networks) and algorithms.
arXiv Detail & Related papers (2021-06-09T08:05:36Z)
- COMBO: Conservative Offline Model-Based Policy Optimization [120.55713363569845]
Uncertainty estimation with complex models, such as deep neural networks, can be difficult and unreliable.
We develop a new model-based offline RL algorithm, COMBO, that regularizes the value function on out-of-support state-actions.
We find that COMBO consistently performs as well as or better than prior offline model-free and model-based methods.
arXiv Detail & Related papers (2021-02-16T18:50:32Z)
- Learning Fast Approximations of Sparse Nonlinear Regression [50.00693981886832]
In this work, we bridge the gap by introducing the Nonlinear Learned Iterative Shrinkage Thresholding Algorithm (NLISTA).
Experiments on synthetic data corroborate our theoretical results and show our method outperforms state-of-the-art methods.
arXiv Detail & Related papers (2020-10-26T11:31:08Z)
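The learned iterative shrinkage-thresholding idea behind NLISTA can be illustrated with the classic linear LISTA template: ISTA iterations are unrolled into a fixed number of layers whose step sizes and soft-thresholds are trained end to end. The sketch below is that linear template on synthetic data, an assumption-level illustration only; the paper's NLISTA extends the construction to nonlinear measurement models.
```python
import torch
import torch.nn as nn

n, d, layers = 50, 100, 8                  # measurements, signal dimension, unrolled iterations
A = torch.randn(n, d) / n ** 0.5           # measurement matrix, assumed known

class LISTA(nn.Module):
    """ISTA unrolled into `layers` steps with learned step sizes and soft-thresholds."""
    def __init__(self):
        super().__init__()
        self.step = nn.Parameter(torch.full((layers,), 0.5))
        self.theta = nn.Parameter(torch.full((layers,), 0.05))

    def forward(self, y):
        x = torch.zeros(y.shape[0], d)
        for k in range(layers):
            grad = (x @ A.T - y) @ A                                 # gradient of 0.5*||Ax - y||^2
            z = x - self.step[k] * grad
            x = torch.sign(z) * torch.relu(z.abs() - self.theta[k])  # soft-thresholding
        return x

model = LISTA()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
for it in range(500):                      # train on synthetic sparse regression problems
    x_true = torch.randn(32, d) * (torch.rand(32, d) < 0.1).float()
    y = x_true @ A.T
    loss = ((model(y) - x_true) ** 2).mean()
    opt.zero_grad(); loss.backward(); opt.step()
```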
- A polynomial-time algorithm for learning nonparametric causal graphs [18.739085486953698]
The analysis is model-free and does not assume linearity, additivity, independent noise, or faithfulness.
We impose a condition on the residual variances that is closely related to previous work on linear models with equal variances.
arXiv Detail & Related papers (2020-06-22T02:21:53Z)
- The role of optimization geometry in single neuron learning [12.891722496444036]
Recent experiments have demonstrated that the choice of optimization geometry can impact generalization performance when learning expressive neural network models.
We show how the interplay between the optimization geometry and the feature geometry shapes out-of-sample performance.
arXiv Detail & Related papers (2020-06-15T17:39:44Z)
- Is Temporal Difference Learning Optimal? An Instance-Dependent Analysis [102.29671176698373]
We address the problem of policy evaluation in discounted Markov decision processes, and provide instance-dependent guarantees on the $\ell_\infty$ error under a generative model.
We establish both asymptotic and non-asymptotic versions of local minimax lower bounds for policy evaluation, thereby providing an instance-dependent baseline by which to compare algorithms.
arXiv Detail & Related papers (2020-03-16T17:15:28Z)
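As a toy version of the setting in the entry above (everything below is an assumption, not the paper's construction): TD(0) policy evaluation on a small discounted Markov reward process with samples drawn from a generative model, reporting the sup-norm error against the exact value function.
```python
import numpy as np

S, gamma, rng = 6, 0.9, np.random.default_rng(0)
P = rng.dirichlet(np.ones(S), size=S)              # transition matrix under a fixed policy
r = rng.uniform(0, 1, size=S)                      # expected rewards
v_star = np.linalg.solve(np.eye(S) - gamma * P, r) # exact value function

v = np.zeros(S)
for k in range(1, 100001):
    s = rng.integers(S)                            # generative model: query any state
    s_next = rng.choice(S, p=P[s])
    td_target = r[s] + gamma * v[s_next]
    v[s] += (1.0 / (1 + k / 1000)) * (td_target - v[s])   # decaying step size

print("ell_inf error:", np.max(np.abs(v - v_star)))
```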
- Adaptive Control and Regret Minimization in Linear Quadratic Gaussian (LQG) Setting [91.43582419264763]
We propose LqgOpt, a novel reinforcement learning algorithm based on the principle of optimism in the face of uncertainty.
LqgOpt efficiently explores the system dynamics, estimates the model parameters up to their confidence interval, and deploys the controller of the most optimistic model.
arXiv Detail & Related papers (2020-03-12T19:56:38Z)
- Local Policy Optimization for Trajectory-Centric Reinforcement Learning [31.495672846638346]
Many robotic manipulation tasks are trajectory-centric and thus do not require a global model or policy.
We present a method for simultaneous trajectory and local stabilizing policy optimization to generate local policies for trajectory-centric model-based reinforcement learning.
arXiv Detail & Related papers (2020-01-22T15:56:00Z)