Policy Search for Model Predictive Control with Application to Agile
Drone Flight
- URL: http://arxiv.org/abs/2112.03850v1
- Date: Tue, 7 Dec 2021 17:39:24 GMT
- Title: Policy Search for Model Predictive Control with Application to Agile
Drone Flight
- Authors: Yunlong Song, Davide Scaramuzza
- Abstract summary: We propose a policy-search-for-model-predictive-control framework for robot control.
Specifically, we formulate the MPC as a parameterized controller, where the hard-to-optimize decision variables are represented as high-level policies.
Experiments show that our controller achieves robust and real-time control performance in both simulation and the real world.
- Score: 56.24908013905407
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Policy Search and Model Predictive Control~(MPC) are two different paradigms
for robot control: policy search has the strength of automatically learning
complex policies from experience data, while MPC can offer optimal control
performance using models and trajectory optimization. An open research question
is how to leverage and combine the advantages of both approaches. In this work,
we provide an answer by using policy search for automatically choosing
high-level decision variables for MPC, which leads to a novel
policy-search-for-model-predictive-control framework. Specifically, we
formulate the MPC as a parameterized controller, where the hard-to-optimize
decision variables are represented as high-level policies. Such a formulation
allows optimizing policies in a self-supervised fashion. We validate this
framework by focusing on a challenging problem in agile drone flight: flying a
quadrotor through fast-moving gates. Experiments show that our controller
achieves robust and real-time control performance in both simulation and the
real world. The proposed framework offers a new perspective for merging
learning and control.
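
To make the framework concrete, here is a minimal sketch of the idea in Python, assuming a reward-weighted policy-search update over the MPC's hard-to-optimize decision variables. The names (HighLevelPolicy, run_mpc_episode) and the toy quadratic reward are illustrative stand-ins, not the authors' implementation; in the paper's gate task the decision variable could be, for instance, the time of gate traversal, and the episode would roll out the MPC in simulation.

```python
# A minimal sketch of policy search for MPC: a Gaussian "high-level policy"
# over the MPC's decision variables is improved from episode rewards.
import numpy as np

class HighLevelPolicy:
    """Gaussian policy over the MPC's hard-to-optimize decision variables."""
    def __init__(self, dim, init_std=1.0):
        self.mean = np.zeros(dim)
        self.std = np.full(dim, init_std)

    def sample(self, n):
        return self.mean + self.std * np.random.randn(n, self.mean.size)

    def weighted_ml_update(self, samples, weights):
        # Reward-weighted maximum-likelihood update: samples that earned
        # higher episode reward pull the Gaussian toward themselves.
        w = weights / weights.sum()
        self.mean = w @ samples
        self.std = np.sqrt(w @ (samples - self.mean) ** 2) + 1e-6

def run_mpc_episode(decision_vars):
    # Stand-in for the real pipeline: parameterize the MPC with
    # decision_vars, roll out the closed loop, and score the run.
    # A toy quadratic reward keeps this sketch executable end to end.
    target = np.array([2.0, -1.0])
    return -np.sum((decision_vars - target) ** 2)

def policy_search_for_mpc(policy, iters=50, pop=32, beta=5.0):
    for _ in range(iters):
        thetas = policy.sample(pop)            # candidate decision variables
        rewards = np.array([run_mpc_episode(t) for t in thetas])
        # Exponential transformation turns rewards into positive weights.
        weights = np.exp(beta * (rewards - rewards.max()))
        policy.weighted_ml_update(thetas, weights)
    return policy.mean                         # deployed inside the MPC

print(policy_search_for_mpc(HighLevelPolicy(dim=2)))  # converges near target
```

Because the update only needs episode rewards, no labels or expert demonstrations are required, which is what makes the optimization self-supervised.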
Related papers
- Comparison of Model Predictive Control and Proximal Policy Optimization for a 1-DOF Helicopter System [0.7499722271664147]
This study conducts a comparative analysis of Model Predictive Control (MPC) and Proximal Policy Optimization (PPO), a Deep Reinforcement Learning (DRL) algorithm, applied to a Quanser Aero 2 system.
PPO excels in rise time and adaptability, making it a promising approach for applications requiring rapid response.
arXiv Detail & Related papers (2024-08-28T08:35:34Z)
- Tuning Legged Locomotion Controllers via Safe Bayesian Optimization [47.87675010450171]
This paper presents a data-driven strategy to streamline the deployment of model-based controllers on legged robotic hardware platforms.
We leverage a model-free safe learning algorithm to automate the tuning of control gains, addressing the mismatch between the simplified model used in the control formulation and the real system.
arXiv Detail & Related papers (2023-06-12T13:10:14Z)
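
As a rough illustration of this kind of gain tuning, the sketch below runs a safe-set-restricted Bayesian optimization loop over two controller gains. The plant stand-in evaluate_gains, the thresholds, and the use of scikit-learn's Gaussian process are assumptions for illustration, not the paper's method.

```python
# A minimal safe-BO-style loop: only evaluate gains whose pessimistic
# (lower-confidence-bound) prediction is still above a safety threshold.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

def evaluate_gains(kp, kd):
    # Stand-in for a hardware rollout: returns negative tracking cost.
    return -((kp - 3.0) ** 2 + (kd - 0.8) ** 2) + 0.05 * np.random.randn()

safety_threshold = -6.0          # rollouts scoring below this count as unsafe
candidates = np.stack(np.meshgrid(np.linspace(0, 6, 40),
                                  np.linspace(0, 2, 40)), -1).reshape(-1, 2)

X = [np.array([1.0, 0.5])]       # known-safe initial gains
y = [evaluate_gains(*X[0])]
gp = GaussianProcessRegressor(kernel=RBF(length_scale=1.0), alpha=1e-2)

for _ in range(25):
    gp.fit(np.array(X), np.array(y))
    mu, sigma = gp.predict(candidates, return_std=True)
    safe = mu - 2.0 * sigma > safety_threshold   # pessimistic safety check
    if not safe.any():
        break
    ucb = mu + 2.0 * sigma       # optimistic value guides exploration
    ucb[~safe] = -np.inf         # but only within the predicted-safe set
    nxt = candidates[np.argmax(ucb)]
    X.append(nxt)
    y.append(evaluate_gains(*nxt))

print("best gains found:", X[int(np.argmax(y))])
```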
- Efficient Domain Coverage for Vehicles with Second-Order Dynamics via Multi-Agent Reinforcement Learning [9.939081691797858]
We present a reinforcement learning (RL) approach for the multi-agent efficient domain coverage problem involving agents with second-order dynamics.
Our proposed network architecture incorporates an LSTM and self-attention, which allows the trained policy to adapt to a variable number of agents.
arXiv Detail & Related papers (2022-11-11T01:59:12Z)
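
A minimal PyTorch sketch of such an architecture: self-attention mixes information across the agent axis, so the same weights handle any number of agents, while an LSTM provides per-agent memory over time. All dimensions and the module layout are assumptions, not the paper's exact network.

```python
import torch
import torch.nn as nn

class CoveragePolicy(nn.Module):
    def __init__(self, obs_dim=6, hidden=64, act_dim=2):
        super().__init__()
        self.encoder = nn.Linear(obs_dim, hidden)
        # Self-attention over the agent axis: each agent attends to all
        # others, making the policy indifferent to the number of agents.
        self.attn = nn.MultiheadAttention(hidden, num_heads=4, batch_first=True)
        # LSTM over the time axis gives each agent memory of past observations.
        self.lstm = nn.LSTM(hidden, hidden, batch_first=True)
        self.head = nn.Linear(hidden, act_dim)

    def forward(self, obs):
        # obs: (batch, time, n_agents, obs_dim); n_agents may vary per batch.
        b, t, n, d = obs.shape
        h = torch.relu(self.encoder(obs))
        h = h.reshape(b * t, n, -1)
        h, _ = self.attn(h, h, h)                       # mix across agents
        h = h.reshape(b, t, n, -1).permute(0, 2, 1, 3)  # (b, n, t, hidden)
        h, _ = self.lstm(h.reshape(b * n, t, -1))       # per-agent recurrence
        h = h.reshape(b, n, t, -1).permute(0, 2, 1, 3)  # back to (b, t, n, .)
        return self.head(h)                             # (b, t, n, act_dim)

policy = CoveragePolicy()
actions = policy(torch.randn(2, 5, 3, 6))  # works unchanged for any n_agents
print(actions.shape)                        # torch.Size([2, 5, 3, 2])
```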
- Learning Model Predictive Controllers with Real-Time Attention for Real-World Navigation [34.86856430694435]
We present a new class of implicit control policies combining the benefits of imitation learning with the robust handling of system constraints.
Our approach, called Performer-MPC, uses a learned cost function parameterized by vision context embeddings provided by Performers.
Compared with a standard MPC policy, Performer-MPC achieves a >40% higher goal-reached rate in cluttered environments and >65% better performance on social metrics when navigating around humans.
arXiv Detail & Related papers (2022-09-22T04:57:58Z)
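
The core mechanism, a cost function whose parameters come from a learned model conditioned on a context embedding rather than being hand-tuned, can be sketched as follows. The tiny linear "network" and all shapes here are hypothetical placeholders, not Performer-MPC itself.

```python
# A minimal sketch of a learned MPC cost: an embedding (in Performer-MPC,
# produced by a Performer from vision) is mapped to the weights of a
# quadratic tracking cost that the MPC then optimizes.
import numpy as np

def learned_cost_weights(context_embedding, W, b):
    # Tiny stand-in for the cost-parameter network: embedding -> weights.
    return np.exp(W @ context_embedding + b)   # exp keeps weights positive

def mpc_stage_cost(x, u, x_goal, q, r):
    # Standard quadratic stage cost, but with state weights q supplied by
    # the learned model instead of being hand-tuned.
    return np.sum(q * (x - x_goal) ** 2) + r * np.sum(u ** 2)

rng = np.random.default_rng(0)
emb = rng.normal(size=16)                       # context embedding (assumed dim)
W, b = rng.normal(size=(4, 16)) * 0.1, np.zeros(4)
q = learned_cost_weights(emb, W, b)             # context-dependent weighting
cost = mpc_stage_cost(np.ones(4), np.array([0.1, -0.2]), np.zeros(4), q, r=0.01)
print(q, cost)
```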
- Evaluating model-based planning and planner amortization for continuous control [79.49319308600228]
We take a hybrid approach, combining model predictive control (MPC) with a learned model and model-free policy learning.
We find that well-tuned model-free agents are strong baselines even for high DoF control problems.
We show that it is possible to distil a model-based planner into a policy that amortizes the planning without any loss of performance.
arXiv Detail & Related papers (2021-10-07T12:00:40Z)
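
A minimal sketch of planner amortization under toy assumptions: collect state-action pairs from a planner (a linear stand-in here, where the paper runs MPC) and regress a cheap policy onto them, so inference no longer needs online planning.

```python
import numpy as np

def planner_action(state):
    # Stand-in for a full MPC solve; a real system would run trajectory
    # optimization here at every visited state.
    return -0.5 * state

# 1) Collect a dataset of the planner's decisions.
states = np.random.randn(2048, 4)
actions = np.array([planner_action(s) for s in states])

# 2) Distil: fit a simple linear policy by least squares
#    (a neural network in practice).
K, *_ = np.linalg.lstsq(states, actions, rcond=None)

# 3) The distilled policy amortizes planning: one matrix multiply per step.
test_state = np.random.randn(4)
print(planner_action(test_state), test_state @ K)   # should closely agree
```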
- Imitation Learning from MPC for Quadrupedal Multi-Gait Control [63.617157490920505]
We present a learning algorithm for training a single policy that imitates multiple gaits of a walking robot.
We use and extend MPC-Net, which is an Imitation Learning approach guided by Model Predictive Control.
We validate our approach on hardware and show that a single learned policy can replace its teacher to control multiple gaits.
arXiv Detail & Related papers (2021-03-26T08:48:53Z)
- Learning High-Level Policies for Model Predictive Control [54.00297896763184]
Model Predictive Control (MPC) provides robust solutions to robot control tasks.
We propose a self-supervised learning algorithm for learning a neural network high-level policy.
We show that our approach can handle situations that are difficult for standard MPC.
arXiv Detail & Related papers (2020-07-20T17:12:34Z)
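
One way to read "self-supervised" here, sketched under toy assumptions below: labels for the high-level policy are produced by offline search over the decision variable rather than by human annotation, and a function approximator is then fit to those labels. The mpc_cost stand-in and the linear fit are illustrative, not the paper's implementation.

```python
import numpy as np

def mpc_cost(state, decision_var):
    # Stand-in for "run the MPC with this decision variable and score it";
    # here the best decision variable depends linearly on the state.
    return (decision_var - (0.7 * state[0] - 0.2 * state[1])) ** 2

# 1) Self-supervised label generation: offline search, no human labels.
candidates = np.linspace(-3, 3, 301)
states = np.random.randn(512, 2)
labels = np.array([candidates[np.argmin([mpc_cost(s, c) for c in candidates])]
                   for s in states])

# 2) Fit the high-level policy (a neural network in the paper; linear here).
w, *_ = np.linalg.lstsq(states, labels, rcond=None)
print("recovered mapping:", w)   # approx [0.7, -0.2]
```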
- Guided Constrained Policy Optimization for Dynamic Quadrupedal Robot Locomotion [78.46388769788405]
We introduce guided constrained policy optimization (GCPO), an RL framework based upon our implementation of constrained proximal policy optimization (CPPO).
We show that guided constrained RL offers faster convergence close to the desired optimum, resulting in optimal yet physically feasible robotic control behavior without the need for precise reward function tuning.
arXiv Detail & Related papers (2020-02-22T10:15:53Z)