CoVO-MPC: Theoretical Analysis of Sampling-based MPC and Optimal
Covariance Design
- URL: http://arxiv.org/abs/2401.07369v1
- Date: Sun, 14 Jan 2024 21:10:59 GMT
- Title: CoVO-MPC: Theoretical Analysis of Sampling-based MPC and Optimal
Covariance Design
- Authors: Zeji Yi, Chaoyi Pan, Guanqi He, Guannan Qu, Guanya Shi
- Abstract summary: We characterize the convergence property of a widely used sampling-based MPC method, Model Predictive Path Integral Control (MPPI).
We show that MPPI enjoys at least linear convergence rates when the optimization is quadratic, which covers time-varying LQR systems.
Our theoretical analysis directly leads to a novel sampling-based MPC algorithm, CoVO-MPC.
Empirically, CoVO-MPC significantly outperforms standard MPPI by 43-54% in both simulations and real-world quadrotor agile control tasks.
- Score: 8.943418808959494
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Sampling-based Model Predictive Control (MPC) has been a practical and
effective approach in many domains, notably model-based reinforcement learning,
thanks to its flexibility and parallelizability. Despite its appealing
empirical performance, the theoretical understanding, particularly in terms of
convergence analysis and hyperparameter tuning, remains absent. In this paper,
we characterize the convergence property of a widely used sampling-based MPC
method, Model Predictive Path Integral Control (MPPI). We show that MPPI enjoys
at least linear convergence rates when the optimization is quadratic, which
covers time-varying LQR systems. We then extend the analysis to more general
nonlinear systems. Our theoretical analysis directly leads to a novel
sampling-based MPC algorithm, CoVariance-Optimal MPC (CoVO-MPC), which
optimally schedules the sampling covariance to optimize the convergence rate.
Empirically, CoVO-MPC significantly outperforms standard MPPI by 43-54% in
both simulations and real-world quadrotor agile control tasks. Videos and
appendices are available at https://lecar-lab.github.io/CoVO-MPC/.
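To make the object of the analysis concrete, the following is a minimal numpy sketch of one MPPI iteration with the sampling covariance Sigma exposed as the design knob that CoVO-MPC schedules (the paper derives the optimal schedule from the cost Hessian). The toy dynamics, cost, and all names are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def mppi_step(x0, U_mean, Sigma, dynamics, cost, n_samples=256, lam=1.0):
    """One MPPI iteration: sample control sequences around U_mean with
    per-step covariance Sigma, weight them by exponentiated negative cost,
    and return the weighted-average update of the nominal sequence."""
    H, n_u = U_mean.shape
    L = np.linalg.cholesky(Sigma)
    eps = np.random.randn(n_samples, H, n_u) @ L.T   # eps ~ N(0, Sigma)
    U = U_mean[None] + eps                           # (n_samples, H, n_u)

    costs = np.zeros(n_samples)
    for k in range(n_samples):                       # roll out each sample
        x = x0
        for t in range(H):
            costs[k] += cost(x, U[k, t])
            x = dynamics(x, U[k, t])

    w = np.exp(-(costs - costs.min()) / lam)         # softmin weights
    w /= w.sum()
    return U_mean + np.einsum('k,khu->hu', w, eps)

# Toy double integrator (illustrative only).
dyn  = lambda x, u: np.array([x[0] + 0.1 * x[1], x[1] + 0.1 * u[0]])
cost = lambda x, u: float(x @ x + 0.01 * u @ u)
U = np.zeros((20, 1))
for _ in range(10):
    # The choice of Sigma governs the convergence rate; CoVO-MPC schedules
    # it from the cost Hessian instead of fixing it as done here.
    U = mppi_step(np.array([1.0, 0.0]), U, 0.2 * np.eye(1), dyn, cost)
```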
Related papers
- Transformer-based Model Predictive Control: Trajectory Optimization via Sequence Modeling [16.112708478263745]
We present a unified framework combining the main strengths of optimization-based and learning-based methods.
Our approach entails embedding high-capacity, transformer-based neural network models within the optimization process.
Compared to purely optimization-based approaches, results show that our approach can improve performance by up to 75%.
arXiv Detail & Related papers (2024-10-31T13:23:10Z)
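A hedged sketch of the idea summarized above: a sequence model proposes an initial control sequence, which a downstream optimizer refines. Here `transformer_warm_start` is a hypothetical stub standing in for the trained transformer, and finite-difference gradient descent stands in for the paper's optimizer; everything else is assumed for illustration.

```python
import numpy as np

def transformer_warm_start(x0, horizon):
    """Hypothetical stub for a trained sequence model mapping the current
    state to an initial control sequence; a zero guess keeps this runnable."""
    return np.zeros((horizon, 1))

def refine(x0, U, dyn, cost, steps=50, lr=0.05):
    """Finite-difference gradient descent on the trajectory cost, standing
    in for the optimizer that refines the model's proposal."""
    def total_cost(V):
        x, c = x0, 0.0
        for u in V:
            c += cost(x, u)
            x = dyn(x, u)
        return c
    for _ in range(steps):
        g = np.zeros_like(U)
        for i in range(U.shape[0]):
            dU = U.copy()
            dU[i, 0] += 1e-4
            g[i, 0] = (total_cost(dU) - total_cost(U)) / 1e-4
        U = U - lr * g
    return U

dyn  = lambda x, u: np.array([x[0] + 0.1 * x[1], x[1] + 0.1 * u[0]])
cost = lambda x, u: float(x @ x + 0.01 * u @ u)
x0 = np.array([1.0, 0.0])
U = refine(x0, transformer_warm_start(x0, horizon=20), dyn, cost)
```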
- Provably Efficient Information-Directed Sampling Algorithms for Multi-Agent Reinforcement Learning [50.92957910121088]
This work designs and analyzes a novel set of algorithms for multi-agent reinforcement learning (MARL) based on the principle of information-directed sampling (IDS).
For episodic two-player zero-sum MGs, we present three sample-efficient algorithms for learning the Nash equilibrium.
We extend one of them, Reg-MAIDS, to multi-player general-sum MGs and prove that it can learn either the Nash equilibrium or the coarse correlated equilibrium in a sample-efficient manner.
arXiv Detail & Related papers (2024-04-30T06:48:56Z)
- Parameter-Adaptive Approximate MPC: Tuning Neural-Network Controllers without Retraining [50.00291020618743]
This work introduces a novel, parameter-adaptive approximate MPC (AMPC) architecture capable of online tuning without recomputing large datasets and retraining.
We showcase the effectiveness of parameter-adaptive AMPC by controlling the swing-ups of two different real cartpole systems with a severely resource-constrained microcontroller (MCU).
Taken together, these contributions represent a marked step toward the practical application of AMPC in real-world systems.
arXiv Detail & Related papers (2024-04-08T20:02:19Z)
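A minimal sketch of the parameter-adaptive mechanism described above, under the assumption that it amounts to conditioning the approximate-MPC network on the physical parameters, so that changed parameters at inference time require no retraining. The network below uses untrained placeholder weights; shapes and parameter choices are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

class ParamAdaptiveAMPC:
    """Sketch: the approximate-MPC network takes the physical parameters p
    as extra inputs, so a new cartpole (different mass or pole length) only
    changes p at inference time. Weights are random placeholders here."""
    def __init__(self, n_x, n_p, n_h=32):
        self.W1 = rng.normal(0.0, 0.1, (n_h, n_x + n_p))
        self.b1 = np.zeros(n_h)
        self.W2 = rng.normal(0.0, 0.1, (1, n_h))

    def __call__(self, x, p):
        h = np.tanh(self.W1 @ np.concatenate([x, p]) + self.b1)
        return self.W2 @ h                       # control command

ctrl = ParamAdaptiveAMPC(n_x=4, n_p=2)
x = np.array([0.0, 0.0, 0.1, 0.0])               # cart pos/vel, pole angle/rate
u_rig_a = ctrl(x, np.array([0.5, 0.3]))          # rig A: (mass, length)
u_rig_b = ctrl(x, np.array([0.7, 0.4]))          # rig B: new params, same weights
```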
- Online Variational Sequential Monte Carlo [49.97673761305336]
We build upon the variational sequential Monte Carlo (VSMC) method, which provides computationally efficient and accurate model parameter estimation and Bayesian latent-state inference.
Online VSMC performs both parameter estimation and particle proposal adaptation efficiently and entirely on the fly.
arXiv Detail & Related papers (2023-12-19T21:45:38Z)
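For context, here is a minimal bootstrap particle filter on a linear-Gaussian model, the SMC recursion that VSMC builds on; VSMC replaces the prior proposal used below with a learned variational proposal and, in the online variant, adapts parameters and proposal on the fly. The model and constants are assumed for illustration.

```python
import numpy as np

def bootstrap_pf(ys, n=500, phi=0.9, sig_x=1.0, sig_y=0.5, seed=0):
    """Bootstrap particle filter for x_t = phi*x_{t-1} + N(0, sig_x^2),
    y_t = x_t + N(0, sig_y^2), returning the log-likelihood estimate."""
    rng = np.random.default_rng(seed)
    x = rng.normal(0.0, sig_x, n)
    log_Z = 0.0
    for y in ys:
        x = phi * x + rng.normal(0.0, sig_x, n)         # propose from the prior
        logw = -0.5 * ((y - x) / sig_y) ** 2 - 0.5 * np.log(2 * np.pi * sig_y**2)
        m = logw.max()
        log_Z += np.log(np.mean(np.exp(logw - m))) + m  # log of mean weight
        w = np.exp(logw - m)
        w /= w.sum()
        x = rng.choice(x, size=n, p=w)                  # multinomial resampling
    return log_Z

ys = np.cumsum(np.random.default_rng(1).normal(size=50))  # synthetic observations
print(bootstrap_pf(ys))
```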
- Deep Model Predictive Optimization [21.22047409735362]
A major challenge in robotics is to design robust policies that enable complex and agile behaviors in the real world.
We propose Deep Model Predictive Optimization (DMPO), which learns the inner loop of an MPC optimization algorithm directly via experience.
DMPO can outperform the best prior MPC algorithm by up to 27% with fewer samples, and an end-to-end policy trained with model-free RL (MFRL) by 19%.
arXiv Detail & Related papers (2023-10-06T21:11:52Z)
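A speculative sketch of the structure this suggests: the inner-loop update of a sampling-based MPC solver becomes a parameterized, trainable function instead of a fixed rule. The weighting below uses untrained placeholder parameters, so it illustrates only the shape of the interface, not DMPO's learned behavior.

```python
import numpy as np

rng = np.random.default_rng(0)

class LearnedInnerLoop:
    """MPPI hard-codes softmax(-cost/lambda); here the weighting over the
    sampled perturbations has free parameters theta that DMPO would train
    from experience (random placeholders below)."""
    def __init__(self, n_samples):
        self.theta = rng.normal(0.0, 0.1, n_samples)   # trainable parameters

    def update(self, mean, eps, costs):
        logits = self.theta - costs / (costs.std() + 1e-8)
        w = np.exp(logits - logits.max())
        w /= w.sum()
        return mean + np.einsum('k,khu->hu', w, eps)

# Usage with the shapes a sampling-based MPC loop would produce.
K, H, n_u = 64, 10, 1
inner = LearnedInnerLoop(K)
eps = rng.normal(size=(K, H, n_u))                     # sampled perturbations
costs = rng.normal(size=K)                             # stand-in rollout costs
mean = inner.update(np.zeros((H, n_u)), eps, costs)
```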
- Maximize to Explore: One Objective Function Fusing Estimation, Planning, and Exploration [87.53543137162488]
We propose an easy-to-implement online reinforcement learning (online RL) framework called MEX.
MEX integrates estimation and planning components while automatically balancing exploration and exploitation.
It can outperform baselines by a stable margin in various MuJoCo environments with sparse rewards.
arXiv Detail & Related papers (2023-05-29T17:25:26Z)
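A toy rendering of the single-objective idea, assuming it takes the form of an optimal value under a hypothesis minus a scaled fit loss; a two-armed Gaussian bandit stands in for the MDP, and all names and details are illustrative rather than the paper's construction.

```python
import numpy as np

def mex_select(hypotheses, data, eta=1.0):
    """Score each model hypothesis with one objective, the optimal value
    under that model minus eta times its fit loss on the data, and return
    the maximizer. Optimistic but poorly fit models win only until data
    rules them out, giving the implicit exploration described above."""
    def score(h):
        value = max(h["means"])                       # planner's value if h holds
        nll = sum(0.5 * (r - h["means"][a]) ** 2      # Gaussian fit loss
                  for a, r in data)
        return value - eta * nll
    return max(hypotheses, key=score)

hyps = [{"means": [0.2, 0.9]}, {"means": [0.2, 0.1]}]  # two candidate bandits
data = [(1, 0.15), (1, 0.05)]                          # arm 1 has looked bad so far
best = mex_select(hyps, data)
action = int(np.argmax(best["means"]))                 # act greedily in that model
```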
- Regularization and Variance-Weighted Regression Achieves Minimax Optimality in Linear MDPs: Theory and Practice [79.48432795639403]
Mirror descent value iteration (MDVI) is an abstraction of Kullback-Leibler (KL)- and entropy-regularized reinforcement learning (RL).
We study MDVI with linear function approximation through the sample complexity required to identify an $\varepsilon$-optimal policy.
We present Variance-Weighted Least-Squares MDVI, the first theoretical algorithm that achieves nearly minimax optimal sample complexity for infinite-horizon linear MDPs.
arXiv Detail & Related papers (2023-05-22T16:13:05Z)
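A tabular sketch of the kind of KL- and entropy-regularized recursion that MDVI abstracts, using the standard mirror-descent policy update; the coefficients and toy MDP are illustrative assumptions, not the paper's linear-function-approximation algorithm.

```python
import numpy as np

def mdvi(P, R, gamma=0.9, tau=0.1, lam=0.1, iters=200):
    """Mirror-descent value iteration sketch: each step solves an
    entropy-regularized (tau) problem with a KL penalty (lam) to the
    previous policy, giving the closed-form update
        pi_{k+1}(a|s) propto pi_k(a|s)^(lam/(lam+tau)) * exp(Q_k(s,a)/(lam+tau)).
    P: transitions (S, A, S'); R: rewards (S, A)."""
    S, A = R.shape
    beta = lam / (lam + tau)
    pi = np.full((S, A), 1.0 / A)
    Q = np.zeros((S, A))
    for _ in range(iters):
        logits = beta * np.log(pi) + Q / (lam + tau)
        logits -= logits.max(axis=1, keepdims=True)     # numerical stability
        pi = np.exp(logits)
        pi /= pi.sum(axis=1, keepdims=True)
        V = (pi * (Q - tau * np.log(pi + 1e-12))).sum(axis=1)  # regularized value
        Q = R + gamma * P @ V                           # Bellman backup
    return pi, Q

# Tiny two-state, two-action MDP.
P = np.array([[[0.9, 0.1], [0.2, 0.8]], [[0.7, 0.3], [0.1, 0.9]]])
R = np.array([[1.0, 0.0], [0.0, 1.0]])
pi, Q = mdvi(P, R)
```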
- Learning Sampling Distributions for Model Predictive Control [36.82905770866734]
Sampling-based methods have become a cornerstone of contemporary approaches to Model Predictive Control (MPC).
We propose to carry out all operations in the latent space, allowing us to take full advantage of the learned distribution.
Specifically, we frame the learning problem as bi-level optimization and show how to train the controller with backpropagation-through-time.
arXiv Detail & Related papers (2022-12-05T20:35:36Z)
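A sketch of operating entirely in the latent space, as described above: sample latents, decode them to control sequences, evaluate costs, and update the latent distribution. A frozen random linear map stands in for the learned decoder, and a simple elite average stands in for the paper's bi-level/BPTT training; only the structure is the point here.

```python
import numpy as np

rng = np.random.default_rng(0)
H, N_U, N_Z = 20, 1, 4
W_DEC = rng.normal(0.0, 0.3, (H * N_U, N_Z))  # frozen stand-in for a trained decoder

def decode(z):
    """Hypothetical decoder: latent z -> control sequence. In the paper this
    is a learned generative model; a fixed linear map keeps the sketch runnable."""
    return (W_DEC @ z).reshape(H, N_U)

def latent_mpc_step(x0, mu, dyn, cost, n_samples=128, elite=16):
    """All sampling and updating act on the low-dimensional latent mean mu,
    never on raw control sequences, which is the point of the latent-space
    approach; the elite average replaces the learned update rule."""
    zs = mu + rng.normal(size=(n_samples, N_Z))
    costs = []
    for z in zs:
        x, c = x0, 0.0
        for u in decode(z):
            c += cost(x, u)
            x = dyn(x, u)
        costs.append(c)
    best = zs[np.argsort(costs)[:elite]]
    return best.mean(axis=0)                  # updated latent mean

dyn  = lambda x, u: np.array([x[0] + 0.1 * x[1], x[1] + 0.1 * u[0]])
cost = lambda x, u: float(x @ x + 0.01 * u @ u)
mu = np.zeros(N_Z)
for _ in range(5):
    mu = latent_mpc_step(np.array([1.0, 0.0]), mu, dyn, cost)
```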
- Optimization of Annealed Importance Sampling Hyperparameters [77.34726150561087]
Annealed Importance Sampling (AIS) is a popular algorithm used to estimate the intractable marginal likelihood of deep generative models.
We present a parametric AIS process with flexible intermediary distributions and optimize the bridging distributions to use fewer sampling steps.
We assess the performance of our optimized AIS for marginal likelihood estimation of deep generative models and compare it to other estimators.
arXiv Detail & Related papers (2022-09-27T07:58:25Z)
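For reference, a minimal AIS estimator with fixed geometric bridging distributions and one Metropolis move per step; the paper's contribution is to parameterize and optimize these bridges, which this sketch leaves hand-picked. The example target and schedule are illustrative.

```python
import numpy as np

def ais_log_Z(log_p0, sample_p0, log_p1, betas, n_chains=200, mh_step=0.5):
    """Annealed importance sampling along geometric bridges
    log p_b(x) = (1 - b) log p0(x) + b log p1(x), accumulating the usual
    AIS log-weights and applying one random-walk Metropolis move per bridge."""
    rng = np.random.default_rng(0)
    x = sample_p0(n_chains, rng)
    logw = np.zeros(n_chains)
    for b_prev, b in zip(betas[:-1], betas[1:]):
        # Weight increment from annealing p_{b_prev} -> p_b at the current x.
        logw += (b - b_prev) * (log_p1(x) - log_p0(x))
        # MH transition targeting the current bridge distribution.
        prop = x + mh_step * rng.normal(size=x.shape)
        log_t = lambda y: (1 - b) * log_p0(y) + b * log_p1(y)
        accept = np.log(rng.uniform(size=n_chains)) < log_t(prop) - log_t(x)
        x = np.where(accept, prop, x)
    m = logw.max()
    return np.log(np.mean(np.exp(logw - m))) + m   # log-estimate of Z1/Z0

# Example: p0 = N(0,1) (normalized), unnormalized target p1 = exp(-(x-2)^2/2),
# whose true log normalizer is 0.5*log(2*pi).
log_p0 = lambda x: -0.5 * x**2 - 0.5 * np.log(2 * np.pi)
log_p1 = lambda x: -0.5 * (x - 2.0) ** 2
est = ais_log_Z(log_p0, lambda n, r: r.normal(size=n), log_p1,
                betas=np.linspace(0.0, 1.0, 50))
print(est, 0.5 * np.log(2 * np.pi))                # estimate vs. truth
```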
- Variational Inference MPC using Normalizing Flows and Out-of-Distribution Projection [7.195824023358536]
We propose a Model Predictive Control (MPC) method for collision-free navigation.
We learn a distribution that accounts for both the dynamics of the robot and complex obstacle geometries.
We show that FlowMPPI with projection outperforms state-of-the-art MPC baselines in both in-distribution and out-of-distribution (OOD) environments.
arXiv Detail & Related papers (2022-05-10T04:43:15Z)
- ABC-LMPC: Safe Sample-Based Learning MPC for Stochastic Nonlinear Dynamical Systems with Adjustable Boundary Conditions [34.44010424789202]
We present a novel LMPC algorithm, Adjustable Boundary Condition LMPC (ABC-LMPC), which enables rapid adaptation to novel start and goal configurations.
We experimentally demonstrate that the resulting controller adapts to a variety of initial and terminal conditions on 3 continuous control tasks.
arXiv Detail & Related papers (2020-03-03T09:48:22Z)
This list is automatically generated from the titles and abstracts of the papers on this site.