Model Predictive Control via On-Policy Imitation Learning
- URL: http://arxiv.org/abs/2210.09206v1
- Date: Mon, 17 Oct 2022 16:06:06 GMT
- Title: Model Predictive Control via On-Policy Imitation Learning
- Authors: Kwangjun Ahn, Zakaria Mhammedi, Horia Mania, Zhang-Wei Hong, Ali
Jadbabaie
- Abstract summary: We develop new sample complexity results and performance guarantees for data-driven Model Predictive Control.
Our algorithm uses the structure of constrained linear MPC, and our analysis uses the properties of the explicit MPC solution to theoretically bound the number of online MPC trajectories needed to achieve optimal performance.
- Score: 28.96122879515294
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In this paper, we leverage the rapid advances in imitation learning, a topic
of intense recent focus in the Reinforcement Learning (RL) literature, to
develop new sample complexity results and performance guarantees for
data-driven Model Predictive Control (MPC) for constrained linear systems. In
its simplest form, imitation learning is an approach that tries to learn an
expert policy by querying samples from an expert. Recent approaches to
data-driven MPC have used the simplest form of imitation learning known as
behavior cloning to learn controllers that mimic the performance of MPC by
online sampling of the trajectories of the closed-loop MPC system. Behavior
cloning, however, is known to be data-inefficient and to suffer from
distribution shift. As an alternative, we develop a variant of the
forward training algorithm which is an on-policy imitation learning method
proposed by Ross et al. (2010). Our algorithm uses the structure of constrained
linear MPC, and our analysis uses the properties of the explicit MPC solution
to theoretically bound the number of online MPC trajectories needed to achieve
optimal performance. We validate our results through simulations and show that
the forward training algorithm is indeed superior to behavior cloning when
applied to MPC.
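To make the contrast concrete, below is a minimal sketch of the forward training loop applied to a constrained linear MPC expert. Every specific choice in it is an illustrative assumption rather than the paper's setup: a toy double-integrator model, an MPC expert solved with cvxpy, a linear least-squares learner, and arbitrary horizon and sample-size values. The structure is what matters: the policy for time step t is fit on states reached by rolling out the policies already learned for steps 0 through t-1, and the MPC expert is queried online only at those states.

```python
import numpy as np
import cvxpy as cp

# Toy constrained double integrator (illustrative, not the paper's benchmark).
A = np.array([[1.0, 0.1], [0.0, 1.0]])
B = np.array([[0.0], [0.1]])
Q, R = np.eye(2), 0.1 * np.eye(1)
u_max, H = 1.0, 10                              # input bound and MPC horizon

def mpc_expert(x0):
    """Solve the constrained finite-horizon problem and return the first input."""
    x = cp.Variable((2, H + 1))
    u = cp.Variable((1, H))
    cost, cons = 0, [x[:, 0] == x0]
    for k in range(H):
        cost += cp.quad_form(x[:, k], Q) + cp.quad_form(u[:, k], R)
        cons += [x[:, k + 1] == A @ x[:, k] + B @ u[:, k],
                 cp.abs(u[:, k]) <= u_max]
    cp.Problem(cp.Minimize(cost + cp.quad_form(x[:, H], Q)), cons).solve()
    return u[:, 0].value

# Forward training: one policy per time step, trained on the state distribution
# that the already-learned policies actually induce.
T, n_rollouts = 15, 30                          # task horizon and rollouts per step
rng = np.random.default_rng(0)
policies = []                                   # policies[t]: state -> input at time t

for t in range(T):
    states, labels = [], []
    for _ in range(n_rollouts):
        x = rng.uniform(-2.0, 2.0, size=2)      # sample an initial condition
        for s in range(t):                      # roll forward with the learned policies
            x = A @ x + B @ policies[s](x)
        states.append(x)
        labels.append(mpc_expert(x))            # query the MPC expert only at time t
    S, U = np.array(states), np.array(labels)
    W, *_ = np.linalg.lstsq(S, U, rcond=None)   # fit a simple linear policy u = W^T x
    policies.append(lambda x, W=W: np.clip(x @ W, -u_max, u_max))
```

Behavior cloning would instead fit a single policy to state-action pairs collected along the expert's own closed-loop trajectories, so its training data never reflects the learner's mistakes; the on-policy loop above is precisely what removes that distribution shift.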
Related papers
- Statistically Efficient Variance Reduction with Double Policy Estimation for Off-Policy Evaluation in Sequence-Modeled Reinforcement Learning [53.97273491846883]
We propose DPE: an RL algorithm that blends offline sequence modeling and offline reinforcement learning with Double Policy Estimation.
We validate our method in multiple tasks of OpenAI Gym with D4RL benchmarks.
arXiv Detail & Related papers (2023-08-28T20:46:07Z)
- Model-based adaptation for sample efficient transfer in reinforcement learning control of parameter-varying systems [1.8799681615947088]
We leverage ideas from model-based control to address the sample efficiency problem of reinforcement learning algorithms.
We demonstrate that our approach is more sample-efficient than fine-tuning with reinforcement learning alone.
arXiv Detail & Related papers (2023-05-20T10:11:09Z)
- Learning-based MPC from Big Data Using Reinforcement Learning [1.3124513975412255]
This paper presents an approach for learning Model Predictive Control (MPC) schemes directly from data using Reinforcement Learning (RL) methods.
We propose to tackle this issue by using tools from RL to learn a parameterized MPC scheme directly from data in an offline fashion.
Our approach derives an MPC scheme without having to solve it over the collected dataset, thereby eliminating the computational complexity of existing techniques for big data.
arXiv Detail & Related papers (2023-01-04T15:39:34Z)
- Jump-Start Reinforcement Learning [68.82380421479675]
We present a meta algorithm that can use offline data, demonstrations, or a pre-existing policy to initialize an RL policy.
In particular, we propose Jump-Start Reinforcement Learning (JSRL), an algorithm that employs two policies to solve tasks.
We show via experiments that JSRL is able to significantly outperform existing imitation and reinforcement learning algorithms.
arXiv Detail & Related papers (2022-04-05T17:25:22Z)
- On Effective Scheduling of Model-based Reinforcement Learning [53.027698625496015]
We propose a framework named AutoMBPO to automatically schedule the real data ratio.
In this paper, we first theoretically analyze the role of real data in policy training, which suggests that gradually increasing the ratio of real data yields better performance.
arXiv Detail & Related papers (2021-11-16T15:24:59Z)
- Demonstration-Efficient Guided Policy Search via Imitation of Robust Tube MPC [36.3065978427856]
We propose a strategy to compress a computationally expensive Model Predictive Controller (MPC) into a more computationally efficient representation based on a deep neural network and Imitation Learning (IL).
By generating a Robust Tube variant (RTMPC) of the MPC and leveraging properties from the tube, we introduce a data augmentation method that enables high demonstration-efficiency.
Our method outperforms strategies commonly employed in IL, such as DAgger and Domain Randomization, in terms of demonstration-efficiency and robustness to perturbations unseen during training.
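The summary above does not spell out the augmentation mechanism, but the tube structure suggests one: states sampled inside the tube around a demonstrated state can be labeled with the ancillary feedback law instead of re-solving the MPC. The snippet below is only a rough illustration of that idea under assumed details (a box-shaped tube cross-section and a fixed linear ancillary gain K_fb), not the paper's exact procedure.

```python
import numpy as np

def tube_augment(x_nom, u_nom, K_fb, half_width, n_samples, rng):
    """Sample states inside a box-shaped tube cross-section around a nominal
    state and label them with the ancillary law u = u_nom + K_fb (x - x_nom)."""
    dx = rng.uniform(-half_width, half_width, size=(n_samples, x_nom.size))
    return x_nom + dx, u_nom + dx @ K_fb.T

rng = np.random.default_rng(0)
x_nom = np.array([1.0, 0.5])        # one nominal state from an RTMPC demonstration
u_nom = np.array([-0.2])            # its nominal input
K_fb = np.array([[-0.8, -1.2]])     # illustrative ancillary (tube) feedback gain
extra_x, extra_u = tube_augment(x_nom, u_nom, K_fb, half_width=0.1, n_samples=32, rng=rng)
```

Each demonstrated state then yields many labeled pairs for imitation learning, which is where the demonstration-efficiency comes from.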
arXiv Detail & Related papers (2021-09-21T01:50:19Z)
- DEALIO: Data-Efficient Adversarial Learning for Imitation from Observation [57.358212277226315]
In imitation learning from observation (IfO), a learning agent seeks to imitate a demonstrating agent using only observations of the demonstrated behavior, without access to the control signals generated by the demonstrator.
Recent methods based on adversarial imitation learning have led to state-of-the-art performance on IfO problems, but they typically suffer from high sample complexity due to a reliance on data-inefficient, model-free reinforcement learning algorithms.
This issue makes them impractical to deploy in real-world settings, where gathering samples can incur high costs in terms of time, energy, and risk.
We propose a more data-efficient IfO algorithm.
arXiv Detail & Related papers (2021-03-31T23:46:32Z)
- Imitation Learning from MPC for Quadrupedal Multi-Gait Control [63.617157490920505]
We present a learning algorithm for training a single policy that imitates multiple gaits of a walking robot.
We use and extend MPC-Net, which is an Imitation Learning approach guided by Model Predictive Control.
We validate our approach on hardware and show that a single learned policy can replace its teacher to control multiple gaits.
arXiv Detail & Related papers (2021-03-26T08:48:53Z)
- On Training and Evaluation of Neural Network Approaches for Model Predictive Control [9.8918553325509]
This paper presents a framework for training and evaluating Model Predictive Control (MPC) implemented using constrained neural networks.
The motivation is to replace real-time optimization in safety critical feedback control systems with learnt mappings in the form of neural networks with optimization layers.
arXiv Detail & Related papers (2020-05-08T15:37:55Z)
- Information Theoretic Model Predictive Q-Learning [64.74041985237105]
We present a novel theoretical connection between information theoretic MPC and entropy regularized RL.
We develop a Q-learning algorithm that can leverage biased models.
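One way to read this combination is MPPI-style sampling-based MPC whose finite rollouts are closed with a learned terminal value, so that a biased model only needs to be accurate over a short horizon. The sketch below is merely an illustration of that structure; the toy dynamics, the quadratic stand-in for the learned value, and all hyperparameters are assumptions, not details taken from the paper.

```python
import numpy as np

def dynamics(x, u):
    # Toy double integrator, used only to make the sketch runnable.
    return np.array([x[0] + 0.1 * x[1], x[1] + 0.1 * u[0]])

def running_cost(x, u):
    return x @ x + 0.1 * (u @ u)

def terminal_value(x):
    # Stand-in for a learned value/Q function at the end of the rollout.
    return 10.0 * (x @ x)

def mppi_action(x0, H=20, K=256, lam=1.0, sigma=0.5, seed=0):
    """One MPPI step: sample K control sequences, weight their costs by
    exp(-cost / lam), and return the weighted first control."""
    rng = np.random.default_rng(seed)
    noise = rng.normal(0.0, sigma, size=(K, H, 1))
    costs = np.zeros(K)
    for k in range(K):
        x = x0.copy()
        for t in range(H):
            costs[k] += running_cost(x, noise[k, t])
            x = dynamics(x, noise[k, t])
        costs[k] += terminal_value(x)       # learned value closes the short horizon
    w = np.exp(-(costs - costs.min()) / lam)
    w /= w.sum()
    return np.tensordot(w, noise[:, 0, :], axes=1)

print(mppi_action(np.array([1.0, 0.0])))
```

The temperature lam is the knob that, in the information-theoretic view, roughly plays the role of the entropy-regularization strength connecting this update to soft-RL objectives.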
arXiv Detail & Related papers (2019-12-31T00:29:22Z)