Efficient Deep Learning of Robust, Adaptive Policies using Tube MPC-Guided Data Augmentation
- URL: http://arxiv.org/abs/2303.15688v2
- Date: Mon, 2 Oct 2023 17:34:48 GMT
- Title: Efficient Deep Learning of Robust, Adaptive Policies using Tube MPC-Guided Data Augmentation
- Authors: Tong Zhao, Andrea Tagliabue, Jonathan P. How
- Abstract summary: Existing robust and adaptive controllers can achieve impressive performance at the cost of heavy online onboard computations.
We extend an existing efficient Imitation Learning (IL) algorithm for robust policy learning from MPC with the ability to learn policies that adapt to challenging model/environment uncertainties.
- Score: 42.66792060626531
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The deployment of agile autonomous systems in challenging, unstructured
environments requires adaptation capabilities and robustness to uncertainties.
Existing robust and adaptive controllers, such as those based on model
predictive control (MPC), can achieve impressive performance at the cost of
heavy online onboard computations. Strategies that efficiently learn robust and
onboard-deployable policies from MPC have emerged, but they still lack
fundamental adaptation capabilities. In this work, we extend an existing
efficient Imitation Learning (IL) algorithm for robust policy learning from MPC
with the ability to learn policies that adapt to challenging model/environment
uncertainties. The key idea of our approach is to modify the IL procedure by
conditioning the policy on a learned lower-dimensional model/environment
representation that can be efficiently estimated online. We
tailor our approach to the task of learning an adaptive position and attitude
control policy to track trajectories under challenging disturbances on a
multirotor. Evaluations in simulation show that a high-quality adaptive policy
can be obtained in about $1.3$ hours. We additionally empirically demonstrate
rapid adaptation to in- and out-of-training-distribution uncertainties,
achieving a $6.1$ cm average position error under wind disturbances that
correspond to about $50\%$ of the weight of the robot, and that are $36\%$
larger than the maximum wind seen during training.
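To make the key idea concrete, here is a minimal PyTorch sketch of a policy conditioned on a learned low-dimensional representation z that a separate estimator infers online from recent state-action history. All module names, layer sizes, and the history length are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch of the abstract's key idea: an IL-trained policy conditioned
# on a low-dimensional environment representation z estimated online from a
# window of (state, action) pairs. Dimensions are assumed, not the paper's.
import torch
import torch.nn as nn

STATE_DIM, ACT_DIM, Z_DIM, HIST_LEN = 13, 4, 8, 50  # assumed multirotor sizes

class AdaptivePolicy(nn.Module):
    def __init__(self):
        super().__init__()
        # Online estimator: compresses recent state-action history into z.
        self.estimator = nn.Sequential(
            nn.Linear(HIST_LEN * (STATE_DIM + ACT_DIM), 128), nn.ReLU(),
            nn.Linear(128, Z_DIM),
        )
        # Control policy conditioned on the current state and the estimate z.
        self.policy = nn.Sequential(
            nn.Linear(STATE_DIM + Z_DIM, 128), nn.ReLU(),
            nn.Linear(128, ACT_DIM),
        )

    def forward(self, state, history):
        z = self.estimator(history.flatten(start_dim=1))  # cheap online estimate
        return self.policy(torch.cat([state, z], dim=-1))

# One IL step: regress onto expert-MPC action labels, conditioning on z.
policy = AdaptivePolicy()
opt = torch.optim.Adam(policy.parameters(), lr=1e-3)
state = torch.randn(32, STATE_DIM)
history = torch.randn(32, HIST_LEN, STATE_DIM + ACT_DIM)
expert_action = torch.randn(32, ACT_DIM)       # stand-in for tube-MPC labels
loss = nn.functional.mse_loss(policy(state, history), expert_action)
opt.zero_grad(); loss.backward(); opt.step()
```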
Related papers
- Robust Deep Reinforcement Learning with Adaptive Adversarial Perturbations in Action Space [3.639580365066386]
We propose an adaptive adversarial coefficient framework to adjust the effect of the adversarial perturbation during training.
The appealing feature of our method is that it is simple to deploy in real-world applications and does not require accessing the simulator in advance.
The experiments in MuJoCo show that our method can improve the training stability and learn a robust policy when migrated to different test environments.
arXiv Detail & Related papers (2024-05-20T12:31:11Z)
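A minimal sketch of the mechanism this entry describes: actions are perturbed adversarially during training, with the perturbation coefficient adapted rather than fixed. The specific adaptation rule below (scaling with recent returns) is an assumption for illustration, not the paper's exact schedule.

```python
# Hedged sketch: adversarial action perturbation with an adaptive coefficient.
import numpy as np

def adversarial_action(action, value_grad, coeff):
    # Perturb the action along the direction that most decreases the critic's
    # value estimate, with strength set by the adaptive coefficient.
    return action - coeff * value_grad / (np.linalg.norm(value_grad) + 1e-8)

coeff, target_return = 0.05, 100.0
for episode_return in [80.0, 95.0, 120.0]:       # stand-in per-episode returns
    # Adapt the coefficient: attack harder when the agent is doing well,
    # ease off when the adversary already pushes returns below the target.
    coeff *= 1.1 if episode_return > target_return else 0.9
    coeff = float(np.clip(coeff, 0.01, 0.5))

print(adversarial_action(np.array([0.2, -0.1]), np.array([1.0, 0.5]), coeff))
```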
- Enabling Efficient, Reliable Real-World Reinforcement Learning with Approximate Physics-Based Models [10.472792899267365]
We focus on developing efficient and reliable policy optimization strategies for robot learning with real-world data.
In this paper we introduce a novel policy gradient-based policy optimization framework.
We show that our approach can learn precise control strategies reliably and with only minutes of real-world data.
arXiv Detail & Related papers (2023-07-16T22:36:36Z)
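The summary above gives few specifics, so the following is only a generic, hedged sketch of one way a policy gradient can be taken through an approximate differentiable physics model; the paper's actual framework may differ substantially.

```python
# Generic sketch: policy gradient through an approximate differentiable
# physics model (a crude point mass here), not the paper's exact method.
import torch

def approx_physics(x, u, dt=0.05):
    # State = [position, velocity]; input = force on a unit point mass.
    pos, vel = x[..., 0], x[..., 1]
    return torch.stack([pos + dt * vel, vel + dt * u.squeeze(-1)], dim=-1)

policy = torch.nn.Linear(2, 1)
opt = torch.optim.Adam(policy.parameters(), lr=1e-2)
x = torch.tensor([[1.0, 0.0]])
cost = 0.0
for _ in range(30):                      # rollout through the model
    u = policy(x)
    x = approx_physics(x, u)
    cost = cost + (x ** 2).sum() + 0.01 * (u ** 2).sum()
opt.zero_grad(); cost.backward(); opt.step()   # gradient through the model
```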
- Safety Correction from Baseline: Towards the Risk-aware Policy in Robotics via Dual-agent Reinforcement Learning [64.11013095004786]
We propose a dual-agent safe reinforcement learning strategy consisting of a baseline and a safe agent.
Such a decoupled framework enables high flexibility, data efficiency and risk-awareness for RL-based control.
The proposed method outperforms the state-of-the-art safe RL algorithms on difficult robot locomotion and manipulation tasks.
arXiv Detail & Related papers (2022-12-14T03:11:25Z)
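A hedged sketch of the decoupled dual-agent structure: a baseline agent optimizes task reward, and a separate safe agent overrides it when a risk critic flags likely constraint violations. The gating rule and all dimensions are illustrative assumptions.

```python
# Hedged sketch: dual-agent control with a risk-gated safety override.
import torch
import torch.nn as nn

OBS_DIM, ACT_DIM = 8, 2
baseline_agent = nn.Linear(OBS_DIM, ACT_DIM)   # trained on task reward
safe_agent = nn.Linear(OBS_DIM, ACT_DIM)       # trained on safety cost
risk_critic = nn.Linear(OBS_DIM, 1)            # predicts violation risk

def act(obs, risk_threshold=0.5):
    a_task = baseline_agent(obs)
    risk = torch.sigmoid(risk_critic(obs))
    # Override the baseline only when predicted risk is high, preserving
    # task performance elsewhere.
    return torch.where(risk > risk_threshold, safe_agent(obs), a_task)

print(act(torch.randn(1, OBS_DIM)))
```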
- Learning Model Predictive Controllers with Real-Time Attention for Real-World Navigation [34.86856430694435]
We present a new class of implicit control policies combining the benefits of imitation learning with the robust handling of system constraints.
Our approach, called Performer-MPC, uses a learned cost function parameterized by vision context embeddings provided by Performers.
Compared with a standard MPC policy, Performer-MPC achieves a >40% higher goal-reaching rate in cluttered environments and >65% better performance on social metrics when navigating around humans.
arXiv Detail & Related papers (2022-09-22T04:57:58Z)
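A hedged sketch of the core mechanism above: an MPC whose cost weights are produced from learned vision context embeddings (a Performer backbone in the paper; a plain linear encoder stands in here). The toy dynamics, sizes, and gradient-based solver are assumptions.

```python
# Hedged sketch: MPC with a learned, context-dependent cost function.
import torch
import torch.nn as nn

EMB_DIM, HORIZON = 16, 10
vision_encoder = nn.Linear(64, EMB_DIM)   # stand-in for a Performer backbone
cost_weights = nn.Linear(EMB_DIM, 2)      # context embedding -> cost weights

def mpc_plan(x0, goal, context_emb, iters=50, lr=0.1):
    with torch.no_grad():
        w = torch.relu(cost_weights(context_emb)) + 1e-3  # learned weights
    u = torch.zeros(HORIZON, 2, requires_grad=True)
    opt = torch.optim.Adam([u], lr=lr)
    for _ in range(iters):                # simple gradient-based MPC solve
        x, cost = x0, 0.0
        for t in range(HORIZON):
            x = x + 0.1 * u[t]            # toy integrator dynamics
            cost = (cost + w[0] * ((x - goal) ** 2).sum()
                         + w[1] * (u[t] ** 2).sum())
        opt.zero_grad(); cost.backward(); opt.step()
    return u.detach()[0]                  # apply the first planned control

emb = vision_encoder(torch.randn(64))
print(mpc_plan(torch.zeros(2), torch.ones(2), emb))
```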
- Policy Search for Model Predictive Control with Application to Agile Drone Flight [56.24908013905407]
We propose a policy-search-for-model-predictive-control framework for MPC.
Specifically, we formulate the MPC as a parameterized controller, where the hard-to-optimize decision variables are represented as high-level policies.
Experiments show that our controller achieves robust and real-time control performance in both simulation and the real world.
arXiv Detail & Related papers (2021-12-07T17:39:24Z)
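A hedged sketch of this framework: the MPC is treated as a parameterized controller, a high-level policy supplies its hard-to-optimize decision variables (here a single traversal time), and policy search updates that policy from closed-loop episode returns. All details are assumptions.

```python
# Hedged sketch: policy search over an MPC's high-level decision variable.
import numpy as np

rng = np.random.default_rng(0)
theta = np.array([2.0])                  # policy parameter: traversal time

def episode_return(traversal_time):
    # Stand-in for a full closed-loop rollout of the MPC tracking a
    # trajectory parameterized by the high-level decision variable.
    return -(traversal_time - 1.2) ** 2  # unknown optimum at 1.2 s

for _ in range(200):                     # simple stochastic policy search
    noise = rng.normal(0.0, 0.1)
    ret_plus = episode_return(theta[0] + noise)
    ret_minus = episode_return(theta[0] - noise)
    theta += 0.05 * (ret_plus - ret_minus) * noise  # finite-diff estimate
print(theta)                             # approaches the best traversal time
```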
- Evaluating model-based planning and planner amortization for continuous control [79.49319308600228]
We take a hybrid approach, combining model predictive control (MPC) with a learned model and model-free policy learning.
We find that well-tuned model-free agents are strong baselines even for high DoF control problems.
We show that it is possible to distil a model-based planner into a policy that amortizes the planning without any loss of performance.
arXiv Detail & Related papers (2021-10-07T12:00:40Z)
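A hedged sketch of the distillation result above: roll out a planner, then amortize it into a feedforward policy by behavior cloning, so deployment no longer pays the planning cost. The placeholder feedback law stands in for the paper's learned-model MPC.

```python
# Hedged sketch: distilling a (stand-in) planner into a cheap policy.
import torch
import torch.nn as nn

def planner(state):
    return -0.5 * state        # placeholder for an expensive MPC/planner call

policy = nn.Sequential(nn.Linear(4, 64), nn.ReLU(), nn.Linear(64, 4))
opt = torch.optim.Adam(policy.parameters(), lr=1e-3)
for _ in range(500):           # distillation: regress onto planner actions
    s = torch.randn(256, 4)
    loss = nn.functional.mse_loss(policy(s), planner(s))
    opt.zero_grad(); loss.backward(); opt.step()
```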
- Robustifying Reinforcement Learning Policies with $\mathcal{L}_1$ Adaptive Control [7.025818894763949]
A reinforcement learning (RL) policy could fail in a new/perturbed environment due to the existence of dynamic variations.
We propose an approach to robustifying a pre-trained non-robust RL policy with $\mathcal{L}_1$ adaptive control.
Our approach can significantly improve the robustness of an RL policy trained in a standard (i.e., non-robust) way, either in a simulator or in the real world.
arXiv Detail & Related papers (2021-06-04T04:28:46Z)
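A hedged sketch of the augmentation loop above: a state predictor and a fast, piecewise-constant adaptation law estimate the disturbance, and a low-pass-filtered compensation term is added to the pre-trained RL action. Scalar dynamics and all gains are illustrative assumptions.

```python
# Hedged sketch: L1 adaptive augmentation around a fixed RL policy.
import numpy as np

dt, a_s, omega_c = 0.01, -5.0, 20.0   # step, predictor pole, filter bandwidth
x, x_hat, sigma_hat, u_l1 = 0.0, 0.0, 0.0, 0.0

def rl_policy(x):
    return -2.0 * x                   # stand-in for the pre-trained RL policy

for _ in range(1000):
    disturbance = 1.0                 # unknown matched disturbance
    u = rl_policy(x) + u_l1           # RL action plus adaptive compensation
    x += dt * (u + disturbance)       # true plant: x_dot = u + sigma
    x_hat += dt * (u + sigma_hat + a_s * (x_hat - x))   # state predictor
    # Piecewise-constant adaptation law driven by the prediction error.
    sigma_hat = -(x_hat - x) * (a_s / (np.exp(a_s * dt) - 1.0))
    u_l1 += dt * omega_c * (-sigma_hat - u_l1)  # low-pass-filtered cancellation
print(round(x, 3))                    # regulated near zero despite disturbance
```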
- Learning High-Level Policies for Model Predictive Control [54.00297896763184]
Model Predictive Control (MPC) provides robust solutions to robot control tasks.
We propose a self-supervised learning algorithm for learning a neural network high-level policy.
We show that our approach can handle situations that are difficult for standard MPC.
arXiv Detail & Related papers (2020-07-20T17:12:34Z)
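A hedged sketch of one plausible reading of the entry above: a network proposes MPC decision variables, and self-supervised labels come from sampling candidates, scoring each with the MPC cost, and regressing onto the best one. The scoring placeholder and dimensions are assumptions.

```python
# Hedged sketch: self-supervised training of a high-level policy for MPC.
import torch
import torch.nn as nn

def mpc_cost(obs, decision_var):
    # Placeholder for solving the MPC with the proposed decision variable
    # and returning its optimal cost.
    return (decision_var - obs.sum(dim=-1, keepdim=True)) ** 2

high_level = nn.Linear(4, 1)
opt = torch.optim.Adam(high_level.parameters(), lr=1e-3)
for _ in range(300):
    obs = torch.randn(64, 4)
    candidates = torch.randn(64, 16)            # sampled decision variables
    costs = mpc_cost(obs.unsqueeze(1).expand(-1, 16, -1).reshape(-1, 4),
                     candidates.reshape(-1, 1)).reshape(64, 16)
    best = candidates.gather(1, costs.argmin(dim=1, keepdim=True))  # label
    loss = nn.functional.mse_loss(high_level(obs), best)
    opt.zero_grad(); loss.backward(); opt.step()
```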
- Learning Constrained Adaptive Differentiable Predictive Control Policies With Guarantees [1.1086440815804224]
We present differentiable predictive control (DPC), a method for learning constrained neural control policies for linear systems.
We employ automatic differentiation to obtain direct policy gradients by backpropagating the model predictive control (MPC) loss function and constraints penalties through a differentiable closed-loop system dynamics model.
arXiv Detail & Related papers (2020-04-23T14:24:44Z)
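A hedged sketch of the DPC recipe above: unroll a neural control policy through a differentiable linear system model, penalize an MPC-style loss plus constraint violations, and obtain direct policy gradients by backpropagation. The system matrices and penalty weights are illustrative assumptions.

```python
# Hedged sketch: differentiable predictive control for a linear system.
import torch
import torch.nn as nn

A = torch.tensor([[1.0, 0.1], [0.0, 1.0]])  # linear dynamics x+ = Ax + Bu
B = torch.tensor([[0.0], [0.1]])
policy = nn.Sequential(nn.Linear(2, 32), nn.ReLU(), nn.Linear(32, 1))
opt = torch.optim.Adam(policy.parameters(), lr=1e-3)

for _ in range(300):
    x = 2.0 * torch.rand(64, 2) - 1.0        # batch of initial states
    loss = 0.0
    for _ in range(20):                      # differentiable closed loop
        u = policy(x)
        x = x @ A.T + u @ B.T
        # MPC-style stage cost plus soft penalty for constraint |u| <= 0.5.
        loss = (loss + (x ** 2).sum() + 0.1 * (u ** 2).sum()
                     + 10.0 * torch.relu(u.abs() - 0.5).sum())
    (loss / 64).backward()                   # direct policy gradients
    opt.step(); opt.zero_grad()
```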
- Information Theoretic Model Predictive Q-Learning [64.74041985237105]
We present a novel theoretical connection between information theoretic MPC and entropy regularized RL.
We develop a Q-learning algorithm that can leverage biased models.
arXiv Detail & Related papers (2019-12-31T00:29:22Z)
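A hedged sketch connecting the two ideas in the entry above: an information-theoretic (MPPI-style) MPC update whose rollout cost ends in a terminal Q-value, which is where a learned Q-function could correct a biased model. The dynamics, temperature, and Q stand-in are assumptions.

```python
# Hedged sketch: MPPI-style information-theoretic MPC with a terminal Q.
import numpy as np

rng = np.random.default_rng(0)
K, H, lam = 256, 15, 1.0                 # samples, horizon, temperature

def rollout_cost(x0, controls, terminal_q=lambda x: (x ** 2).sum(axis=-1)):
    x = np.repeat(x0[None, :], controls.shape[0], axis=0)
    cost = np.zeros(controls.shape[0])
    for t in range(controls.shape[1]):
        x = x + 0.1 * controls[:, t, :]  # simple model (possibly biased)
        cost += (x ** 2).sum(axis=-1)
    return cost + terminal_q(x)          # a learned Q would enter here

u_nominal = np.zeros((H, 2))
x0 = np.array([1.0, -1.0])
eps = rng.normal(0.0, 0.3, size=(K, H, 2))
costs = rollout_cost(x0, u_nominal[None] + eps)
weights = np.exp(-(costs - costs.min()) / lam)  # entropy-regularized weights
weights /= weights.sum()
u_nominal += (weights[:, None, None] * eps).sum(axis=0)
print(u_nominal[0])                      # first control of the updated plan
```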
This list is automatically generated from the titles and abstracts of the papers on this site.