Tube-NeRF: Efficient Imitation Learning of Visuomotor Policies from MPC
using Tube-Guided Data Augmentation and NeRFs
- URL: http://arxiv.org/abs/2311.14153v2
- Date: Mon, 26 Feb 2024 16:10:00 GMT
- Title: Tube-NeRF: Efficient Imitation Learning of Visuomotor Policies from MPC
using Tube-Guided Data Augmentation and NeRFs
- Authors: Andrea Tagliabue, Jonathan P. How
- Abstract summary: Imitation learning (IL) can train computationally-efficient sensorimotor policies from a resource-intensive Model Predictive Controller (MPC)
We propose a data augmentation (DA) strategy that enables efficient learning of vision-based policies.
We show an 80-fold increase in demonstration efficiency and a 50% reduction in training time over current IL methods.
- Score: 42.220568722735095
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Imitation learning (IL) can train computationally-efficient sensorimotor
policies from a resource-intensive Model Predictive Controller (MPC), but it
often requires many samples, leading to long training times or limited
robustness. To address these issues, we combine IL with a variant of robust MPC
that accounts for process and sensing uncertainties, and we design a data
augmentation (DA) strategy that enables efficient learning of vision-based
policies. The proposed DA method, named Tube-NeRF, leverages Neural Radiance
Fields (NeRFs) to generate novel synthetic images, and uses properties of the
robust MPC (the tube) to select relevant views and to efficiently compute the
corresponding actions. We tailor our approach to the task of localization and
trajectory tracking on a multirotor, by learning a visuomotor policy that
generates control actions using images from the onboard camera as the only
source of horizontal position. Numerical evaluations show an 80-fold increase in
demonstration efficiency and a 50% reduction in training time over current IL
methods. Additionally, our policies successfully transfer to a real multirotor,
achieving low tracking errors despite large disturbances, with an onboard
inference time of only 1.5 ms.
Video: https://youtu.be/_W5z33ZK1m4
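The abstract outlines how the tube of the robust MPC guides data augmentation: states are sampled inside the tube around each point of a demonstration, a NeRF renders the onboard view each sampled state would produce, and the corresponding action comes from the ancillary (tube) controller rather than from re-solving the MPC. A minimal Python sketch of that loop is below; the renderer, camera model, gain matrix, and tube geometry are hypothetical placeholders, not the paper's implementation.

```python
"""Minimal sketch of tube-guided data augmentation with a NeRF renderer.
All quantities below are illustrative stand-ins, not the paper's code."""
import numpy as np

rng = np.random.default_rng(0)

def render_nerf(camera_pose):
    """Placeholder for a NeRF render call; returns a synthetic onboard image."""
    return rng.random((64, 64, 3))  # stand-in for a rendered RGB frame

def camera_pose_from_state(x):
    """Map a multirotor state to an onboard-camera pose (fixed mount assumed)."""
    return x[:3]  # position only, for illustration

# One demonstration from the robust tube MPC: nominal states and inputs,
# the ancillary feedback gain K, and the tube half-widths around the nominal.
x_bar = np.zeros((100, 12))          # nominal state trajectory (placeholder)
u_bar = np.zeros((100, 4))           # nominal inputs (placeholder)
K = np.zeros((4, 12))                # ancillary controller gain (placeholder)
tube_half_width = 0.2 * np.ones(12)  # axis-aligned tube cross-section (assumed)

dataset = []
samples_per_step = 8  # extra (image, action) pairs per demonstrated timestep
for t in range(len(x_bar)):
    for _ in range(samples_per_step):
        # Sample a perturbed state inside the tube cross-section at time t.
        x = x_bar[t] + rng.uniform(-tube_half_width, tube_half_width)
        # Render the view the onboard camera would see from that state.
        img = render_nerf(camera_pose_from_state(x))
        # The ancillary control law gives the action without re-solving the MPC.
        u = u_bar[t] + K @ (x - x_bar[t])
        dataset.append((img, x, u))

print(f"augmented dataset size: {len(dataset)} pairs from one demonstration")
```

The key design point the abstract describes is that each augmented action is obtained in closed form from the tube controller, which is what makes the augmentation cheap compared to querying the MPC for every synthetic view.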
Related papers
- SHIRE: Enhancing Sample Efficiency using Human Intuition in REinforcement Learning [11.304750795377657]
We propose SHIRE, a framework for encoding human intuition using Probabilistic Graphical Models (PGMs)
SHIRE achieves 25-78% sample efficiency gains across the environments we evaluate at negligible overhead cost.
arXiv Detail & Related papers (2024-09-16T04:46:22Z) - Distilling Reinforcement Learning Policies for Interpretable Robot Locomotion: Gradient Boosting Machines and Symbolic Regression [53.33734159983431]
This paper introduces a novel approach to distill neural RL policies into more interpretable forms.
We train expert neural network policies using RL and distill them into (i) GBMs, (ii) EBMs, and (iii) symbolic policies.
arXiv Detail & Related papers (2024-03-21T11:54:45Z) - Output Feedback Tube MPC-Guided Data Augmentation for Robust, Efficient
Sensorimotor Policy Learning [49.05174527668836]
Imitation learning (IL) can generate computationally efficient sensorimotor policies from demonstrations provided by computationally expensive model-based sensing and control algorithms.
In this work, we combine IL with an output feedback robust tube model predictive controller to co-generate demonstrations and a data augmentation strategy to efficiently learn neural network-based sensorimotor policies.
We numerically demonstrate that our method can learn a robust visuomotor policy from a single demonstration, a two-order-of-magnitude improvement in demonstration efficiency compared to existing IL methods.
arXiv Detail & Related papers (2022-10-18T19:59:17Z) - Robust, High-Rate Trajectory Tracking on Insect-Scale Soft-Actuated
Aerial Robots with Deep-Learned Tube MPC [0.0]
We present an approach for agile and computationally efficient trajectory tracking on the MIT SoftFly, a sub-gram MAV (0.7 grams)
Our strategy employs a cascaded control scheme, where an adaptive attitude controller is combined with a neural network policy trained to imitate a trajectory tracking robust tube model predictive controller (RTMPC)
We experimentally evaluate our approach, achieving position Root Mean Square Errors lower than 1.8 cm even in the more challenging maneuvers, obtaining a 60% reduction in maximum position error compared to our previous work, and demonstrating robustness to large external disturbances.
arXiv Detail & Related papers (2022-09-20T21:30:16Z) - Evaluating model-based planning and planner amortization for continuous
control [79.49319308600228]
We take a hybrid approach, combining model predictive control (MPC) with a learned model and model-free policy learning.
We find that well-tuned model-free agents are strong baselines even for high DoF control problems.
We show that it is possible to distill a model-based planner into a policy that amortizes the planning without any loss of performance.
arXiv Detail & Related papers (2021-10-07T12:00:40Z) - Demonstration-Efficient Guided Policy Search via Imitation of Robust
Tube MPC [36.3065978427856]
We propose a strategy to compress a computationally expensive Model Predictive Controller (MPC) into a more computationally efficient representation based on a deep neural network and Imitation Learning (IL)
By generating a Robust Tube variant (RTMPC) of the MPC and leveraging properties from the tube, we introduce a data augmentation method that enables high demonstration-efficiency.
Our method outperforms strategies commonly employed in IL, such as DAgger and Domain Randomization, in terms of demonstration-efficiency and robustness to perturbations unseen during training.
arXiv Detail & Related papers (2021-09-21T01:50:19Z) - A Framework for Efficient Robotic Manipulation [79.10407063260473]
We show that, given only 10 demonstrations, a single robotic arm can learn sparse-reward manipulation policies from pixels.
arXiv Detail & Related papers (2020-12-14T22:18:39Z) - Hyperparameter Auto-tuning in Self-Supervised Robotic Learning [12.193817049957733]
Insufficient learning (due to convergence to local optima) results in under-performing policies whilst redundant learning wastes time and resources.
We propose an auto-tuning technique based on the Evidence Lower Bound (ELBO) for self-supervised reinforcement learning.
Our method can auto-tune online and yields the best performance at a fraction of the time and computational resources.
arXiv Detail & Related papers (2020-10-16T08:58:24Z) - AWAC: Accelerating Online Reinforcement Learning with Offline Datasets [84.94748183816547]
We show that our method, advantage weighted actor critic (AWAC), enables rapid learning of skills with a combination of prior demonstration data and online experience.
Our results show that incorporating prior data can reduce the time required to learn a range of robotic skills to practical time-scales.
arXiv Detail & Related papers (2020-06-16T17:54:41Z)