Learning to Optimize in Model Predictive Control
- URL: http://arxiv.org/abs/2212.02603v1
- Date: Mon, 5 Dec 2022 21:20:10 GMT
- Title: Learning to Optimize in Model Predictive Control
- Authors: Jacob Sacks, Byron Boots
- Abstract summary: Sampling-based Model Predictive Control (MPC) is a flexible control framework that can reason about non-smooth dynamics and cost functions.
We show that learning the update rule itself can be particularly useful in sampling-based MPC, where we often wish to minimize the number of samples.
We show that we can contend with this noise by learning how to update the control distribution more effectively and make better use of the few samples that we have.
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Sampling-based Model Predictive Control (MPC) is a flexible control framework
that can reason about non-smooth dynamics and cost functions. Recently,
significant work has focused on the use of machine learning to improve the
performance of MPC, often through learning or fine-tuning the dynamics or cost
function. In contrast, we focus on learning to optimize more effectively; in
other words, on improving the update rule within MPC. We show that this can be
particularly useful in sampling-based MPC, where we often wish to minimize the
number of samples for computational reasons. Unfortunately, the cost of
computational efficiency is a reduction in performance; fewer samples result
in noisier updates. We show that we can contend with this noise by learning how
to update the control distribution more effectively and make better use of the
few samples that we have. Our learned controllers are trained via imitation
learning to mimic an expert that has access to substantially more samples. We
test the efficacy of our approach on multiple simulated robotics tasks in
sample-constrained regimes and demonstrate that our approach can outperform an
MPC controller with the same number of samples.
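To make the baseline concrete, here is a minimal MPPI-style sampling-based MPC update in Python. The toy dynamics, cost, and hyperparameters are illustrative assumptions, not the authors' setup; what matters is that with a small `n_samples` the exponentially weighted update below becomes noisy, which is exactly the regime where the paper proposes to learn the update step by imitating an expert that uses many more samples.

```python
import numpy as np

def mppi_update(mean_seq, dynamics, cost, n_samples=8, sigma=0.5, temperature=1.0):
    """One sampling-based MPC update (MPPI-style exponential weighting).

    With few samples, the weighted average below is a noisy estimate --
    the sample-constrained regime the paper targets with a learned update.
    """
    horizon, action_dim = mean_seq.shape
    noise = sigma * np.random.randn(n_samples, horizon, action_dim)
    candidates = mean_seq[None] + noise           # perturbed control sequences

    # Roll out each candidate sequence and accumulate its trajectory cost.
    costs = np.zeros(n_samples)
    for k in range(n_samples):
        x = np.zeros(2)                           # toy point-mass state [pos, vel]
        for t in range(horizon):
            x = dynamics(x, candidates[k, t])
            costs[k] += cost(x, candidates[k, t])

    # Exponentially weight low-cost samples and re-average the mean sequence.
    weights = np.exp(-(costs - costs.min()) / temperature)
    weights /= weights.sum()
    return (weights[:, None, None] * candidates).sum(axis=0)

# Toy problem: drive a 1-D point mass to the origin.
dynamics = lambda x, u: np.array([x[0] + 0.1 * x[1], x[1] + 0.1 * u[0]])
cost = lambda x, u: x[0] ** 2 + 0.1 * u[0] ** 2
mean = np.zeros((10, 1))
for _ in range(20):
    mean = mppi_update(mean, dynamics, cost)
```

In the paper's setting, a learned module trained by imitation would replace the fixed exponential-weighting rule in the last two lines of the function, making better use of the same few samples.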
Related papers
- Unlearning as multi-task optimization: A normalized gradient difference approach with an adaptive learning rate [105.86576388991713]
We introduce a normalized gradient difference (NGDiff) algorithm, which gives us better control over the trade-off between the objectives.
We provide a theoretical analysis and empirically demonstrate the superior performance of NGDiff among state-of-the-art unlearning methods on the TOFU and MUSE datasets.
arXiv Detail & Related papers (2024-10-29T14:41:44Z)
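As a rough, hedged sketch of the NGDiff idea as stated in the summary (the adaptive learning rate and exact formulation are in the paper), one can normalize the retain and forget gradients before differencing them, so neither objective dominates the trade-off. The losses and step size below are placeholders:

```python
import torch

def ngdiff_direction(params, retain_loss, forget_loss):
    """Sketch of a normalized-gradient-difference update direction:
    descend the retain loss while ascending the forget loss, with both
    gradients normalized so neither objective dominates."""
    g_retain = torch.autograd.grad(retain_loss, params, retain_graph=True)
    g_forget = torch.autograd.grad(forget_loss, params)
    direction = []
    for gr, gf in zip(g_retain, g_forget):
        direction.append(gr / (gr.norm() + 1e-8) - gf / (gf.norm() + 1e-8))
    return direction

# Toy usage with a single weight matrix as the parameter list.
w = torch.randn(4, 4, requires_grad=True)
retain = (w.sum() - 1.0) ** 2        # placeholder retain objective
forget = (w ** 2).sum()              # placeholder forget objective
for g, p in zip(ngdiff_direction([w], retain, forget), [w]):
    p.data -= 0.1 * g                # fixed step; the paper adapts this rate
```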
- Towards an Adaptable and Generalizable Optimization Engine in Decision and Control: A Meta Reinforcement Learning Approach [6.302621910090619]
We propose to learn an MPC controller based on meta-reinforcement learning (RL) to update controllers.
This requires no expert demonstrations and enables fast adaptation when deployed on unseen control tasks.
arXiv Detail & Related papers (2024-01-04T19:41:33Z)
- Value function estimation using conditional diffusion models for control [62.27184818047923]
We propose a simple algorithm called Diffused Value Function (DVF).
It learns a joint multi-step model of the environment-robot interaction dynamics using a diffusion model.
We show how DVF can be used to efficiently capture the state visitation measure for multiple controllers.
arXiv Detail & Related papers (2023-06-09T18:40:55Z)
- Learning Sampling Distributions for Model Predictive Control [36.82905770866734]
Sampling-based approaches have become a cornerstone of contemporary Model Predictive Control (MPC).
We propose to carry out all operations in the latent space, allowing us to take full advantage of the learned distribution.
Specifically, we frame the learning problem as bi-level optimization and show how to train the controller with backpropagation-through-time.
arXiv Detail & Related papers (2022-12-05T20:35:36Z)
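The bi-level, backpropagation-through-time training pattern mentioned in this summary can be sketched as follows: an inner loop runs a few differentiable MPPI-style updates seeded by a learnable proposal distribution (a stand-in for the paper's latent-space model, whose details are not given here), and an outer loop backpropagates the final task loss through the unrolled updates:

```python
import torch

# Outer-level parameters: a learnable proposal (mean, log-std) over control
# sequences -- a stand-in for the paper's learned latent-space distribution.
horizon = 10
proposal_mean = torch.zeros(horizon, requires_grad=True)
proposal_log_std = torch.zeros(horizon, requires_grad=True)
outer_opt = torch.optim.Adam([proposal_mean, proposal_log_std], lr=1e-2)

def rollout_cost(controls):
    """Differentiable toy rollout: drive a 1-D point mass to the origin."""
    pos, vel, cost = torch.tensor(1.0), torch.tensor(0.0), 0.0
    for u in controls:
        vel = vel + 0.1 * u
        pos = pos + 0.1 * vel
        cost = cost + pos ** 2 + 0.01 * u ** 2
    return cost

for outer_step in range(100):
    # Inner level: a few differentiable MPPI-style updates starting from the
    # learned proposal (reparameterized sampling keeps gradients flowing).
    mean = proposal_mean
    for _ in range(3):
        eps = torch.randn(8, horizon)
        candidates = mean + proposal_log_std.exp() * eps
        costs = torch.stack([rollout_cost(c) for c in candidates])
        weights = torch.softmax(-costs, dim=0)
        mean = (weights[:, None] * candidates).sum(dim=0)

    # Outer level: task loss of the final plan, backpropagated through all
    # unrolled inner updates (backpropagation-through-time).
    loss = rollout_cost(mean)
    outer_opt.zero_grad()
    loss.backward()
    outer_opt.step()
```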
- Model Predictive Control via On-Policy Imitation Learning [28.96122879515294]
We develop new sample complexity results and performance guarantees for data-driven Model Predictive Control.
Our algorithm uses the structure of constrained linear MPC, and our analysis uses the properties of the explicit MPC solution to theoretically bound the number of online MPC trajectories needed to achieve optimal performance.
arXiv Detail & Related papers (2022-10-17T16:06:06Z)
- Improving the Performance of Robust Control through Event-Triggered Learning [74.57758188038375]
We propose an event-triggered learning algorithm that decides when to learn in the face of uncertainty in the LQR problem.
We demonstrate improved performance over a robust controller baseline in a numerical example.
arXiv Detail & Related papers (2022-07-28T17:36:37Z)
- Evaluating model-based planning and planner amortization for continuous control [79.49319308600228]
We take a hybrid approach, combining model predictive control (MPC) with a learned model and model-free policy learning.
We find that well-tuned model-free agents are strong baselines even for high DoF control problems.
We show that it is possible to distil a model-based planner into a policy that amortizes the planning without any loss of performance.
arXiv Detail & Related papers (2021-10-07T12:00:40Z)
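Distilling a planner into a policy, as described in the entry above, is commonly done by behavior cloning: regress a policy network onto the planner's actions on visited states. A minimal sketch with a placeholder planner (not the paper's MPC or agent architecture):

```python
import torch
import torch.nn as nn

# Stand-in "planner": any expensive state -> action mapping, e.g. an MPC solve.
def planner(state: torch.Tensor) -> torch.Tensor:
    return -0.5 * state                          # placeholder for an MPC solution

# Small policy network that will amortize the planner.
policy = nn.Sequential(nn.Linear(4, 64), nn.Tanh(), nn.Linear(64, 4))
opt = torch.optim.Adam(policy.parameters(), lr=1e-3)

for step in range(1000):
    states = torch.randn(256, 4)                 # states visited during rollouts
    with torch.no_grad():
        expert_actions = planner(states)         # query the planner offline
    loss = ((policy(states) - expert_actions) ** 2).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()
# After training, `policy` replaces the planner at deployment time.
```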
- Demonstration-Efficient Guided Policy Search via Imitation of Robust Tube MPC [36.3065978427856]
We propose a strategy to compress a computationally expensive Model Predictive Controller (MPC) into a more computationally efficient representation based on a deep neural network and Imitation Learning (IL).
By generating a Robust Tube variant (RTMPC) of the MPC and leveraging properties from the tube, we introduce a data augmentation method that enables high demonstration-efficiency.
Our method outperforms strategies commonly employed in IL, such as DAgger and Domain Randomization, in terms of demonstration-efficiency and robustness to perturbations unseen during training.
arXiv Detail & Related papers (2021-09-21T01:50:19Z)
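A hedged sketch of tube-guided data augmentation consistent with the entry above: each demonstrated state is perturbed within the robust tube and labeled with the ancillary tube controller u = u_nom + K(x - x_nom), so one expensive MPC demonstration yields many supervised pairs. The gain matrix and tube radius below are placeholders:

```python
import numpy as np

def tube_augment(x_nom, u_nom, K, tube_radius, n_aug=16):
    """Sketch of tube-guided data augmentation: states sampled inside the
    robust tube around a nominal state are labeled with the ancillary
    controller u = u_nom + K (x - x_nom), which keeps them in the tube."""
    pairs = []
    for _ in range(n_aug):
        dx = np.random.uniform(-tube_radius, tube_radius, size=x_nom.shape)
        pairs.append((x_nom + dx, u_nom + K @ dx))
    return pairs

# Toy usage: one nominal (state, action) pair expands into 16 labeled pairs.
K = np.array([[-1.0, -0.5]])                     # placeholder feedback gain
augmented = tube_augment(np.zeros(2), np.zeros(1), K, tube_radius=0.2)
```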
- Blending MPC & Value Function Approximation for Efficient Reinforcement Learning [42.429730406277315]
Model-Predictive Control (MPC) is a powerful tool for controlling complex, real-world systems.
We present a framework for improving on MPC with model-free reinforcement learning (RL).
We show that our approach can obtain performance comparable with MPC with access to true dynamics.
arXiv Detail & Related papers (2020-12-10T11:32:01Z)
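One standard way to blend MPC with a learned value function, consistent with the entry above, is to score short model rollouts with the value estimate as a terminal cost. A minimal sketch with placeholder dynamics and a stand-in value function:

```python
import numpy as np

def rollout_return(x0, controls, dynamics, cost, value_fn):
    """Score a candidate control sequence with a short model rollout plus a
    learned terminal value, so a truncated horizon still reflects long-term
    return -- one way to blend MPC with value function approximation."""
    x, total = x0, 0.0
    for u in controls:
        total += cost(x, u)
        x = dynamics(x, u)
    return total + value_fn(x)                   # learned estimate of cost-to-go

# Toy usage with a quadratic placeholder value function.
dynamics = lambda x, u: x + 0.1 * u
cost = lambda x, u: float(x @ x + 0.1 * u @ u)
value_fn = lambda x: float(5.0 * x @ x)          # stand-in for a learned V
score = rollout_return(np.ones(2), np.zeros((5, 2)), dynamics, cost, value_fn)
```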
- Information Theoretic Model Predictive Q-Learning [64.74041985237105]
We present a novel theoretical connection between information theoretic MPC and entropy regularized RL.
We develop a Q-learning algorithm that can leverage biased models.
arXiv Detail & Related papers (2019-12-31T00:29:22Z)
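The information-theoretic MPC / entropy-regularized RL connection referenced above centers on the soft value function, which replaces the hard max over actions with a temperature-weighted log-sum-exp. A generic sketch of that quantity estimated from sampled action values (notation assumed, not taken from the paper):

```python
import numpy as np

def soft_value(q_values, temperature):
    """Entropy-regularized (soft) value from sampled action values:
    V(s) = temperature * log E_a[ exp(Q(s, a) / temperature) ].
    As temperature -> 0 this approaches max_a Q; larger temperatures keep
    the soft weighting shared by MPPI-style MPC updates."""
    q = np.asarray(q_values)
    # Subtract the max for numerical stability before exponentiating.
    shifted = (q - q.max()) / temperature
    return q.max() + temperature * np.log(np.mean(np.exp(shifted)))

# Toy usage: soft value of five sampled action-value estimates.
print(soft_value([1.0, 0.5, -0.2, 0.8, 1.1], temperature=0.5))
```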