Pontryagin Differentiable Programming: An End-to-End Learning and
Control Framework
- URL: http://arxiv.org/abs/1912.12970v5
- Date: Tue, 12 Jan 2021 14:01:47 GMT
- Title: Pontryagin Differentiable Programming: An End-to-End Learning and
Control Framework
- Authors: Wanxin Jin, Zhaoran Wang, Zhuoran Yang, Shaoshuai Mou
- Abstract summary: The Pontryagin Differentiable Programming methodology establishes a unified framework to solve a broad class of learning and control tasks.
We investigate three learning modes of the PDP: inverse reinforcement learning, system identification, and control/planning.
- We demonstrate the capability of the PDP in each learning mode on different high-dimensional systems, including a multi-link robot arm, a 6-DoF maneuvering quadrotor, and a 6-DoF rocket performing a powered landing.
- Score: 108.4560749465701
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper develops a Pontryagin Differentiable Programming (PDP)
methodology, which establishes a unified framework to solve a broad class of
learning and control tasks. The PDP is distinguished from existing methods by
two novel techniques: first, we differentiate through Pontryagin's Maximum
Principle, which allows us to obtain the analytical derivative of a trajectory
with respect to tunable parameters within an optimal control system, enabling
end-to-end learning of dynamics, policies, and/or control objective functions;
and second, we propose an auxiliary control system in the backward pass of the
PDP framework, and the output of this auxiliary control system is the
analytical derivative of the original system's trajectory with respect to the
parameters, which can be iteratively solved using standard control tools. We
investigate three learning modes of the PDP: inverse reinforcement learning,
system identification, and control/planning. We demonstrate the capability of
the PDP in each learning mode on different high-dimensional systems, including
a multi-link robot arm, a 6-DoF maneuvering quadrotor, and a 6-DoF rocket
performing a powered landing.
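As a concrete, minimal illustration of the first technique, the sketch below propagates the analytical derivative of a trajectory with respect to a dynamics parameter through an auxiliary linear recursion, loosely mirroring the role of PDP's auxiliary control system in the backward pass. The scalar dynamics, control sequence, and loss are illustrative assumptions, not the paper's setup.

```python
import numpy as np

# Illustrative sketch (not the authors' implementation).
# Dynamics: x_{t+1} = f(x_t, u_t; theta) = x_t + dt * (-theta * x_t + u_t).
# The sensitivity s_t = dx_t/dtheta obeys the auxiliary linear recursion
#   s_{t+1} = (df/dx) * s_t + df/dtheta,
# loosely mirroring how PDP's auxiliary control system outputs analytical
# trajectory derivatives in its backward pass.
dt, T, theta = 0.1, 50, 0.8
u = np.sin(0.2 * np.arange(T))        # an arbitrary open-loop control sequence
x, s = 1.0, 0.0                       # state and its sensitivity dx/dtheta
dL_dtheta = 0.0                       # gradient of L = sum_t x_{t+1}^2
for t in range(T):
    dfdx = 1.0 - dt * theta           # df/dx at (x_t, u_t)
    dfdth = -dt * x                   # df/dtheta at (x_t, u_t)
    x = x + dt * (-theta * x + u[t])  # forward rollout
    s = dfdx * s + dfdth              # auxiliary sensitivity recursion
    dL_dtheta += 2.0 * x * s          # chain rule into the trajectory loss

print("analytical dL/dtheta:", dL_dtheta)
```

A gradient step on theta with this derivative would fit the dynamics to data, i.e. the system-identification mode; the same machinery applies to cost or policy parameters.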
Related papers
- Modelling, Positioning, and Deep Reinforcement Learning Path Tracking
Control of Scaled Robotic Vehicles: Design and Experimental Validation [3.807917169053206]
Scaled robotic cars are commonly equipped with a hierarchical control architecture that includes tasks dedicated to vehicle state estimation and control.
This paper covers both aspects by proposing (i) a federated extended Kalman filter (FEKF) and (ii) a novel deep reinforcement learning (DRL) path tracking controller trained via an expert demonstrator.
The experimentally validated model is used for (i) supporting the design of the FEKF and (ii) serving as a digital twin for training the proposed DRL-based path tracking algorithm.
arXiv Detail & Related papers (2024-01-10T14:40:53Z)
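Since the entry above builds on an extended Kalman filter, a textbook EKF predict/update step is sketched below for reference; the paper's federated FEKF fuses several such filters, which this generic sketch does not reproduce. All model and noise arguments are placeholders.

```python
import numpy as np

def ekf_step(x, P, z, f, F_jac, h, H_jac, Q, R):
    """One textbook EKF predict/update step (generic sketch, not the paper's
    federated FEKF). f/h are the process and measurement models, F_jac/H_jac
    their Jacobians, and Q/R the process and measurement noise covariances."""
    # Predict
    x_pred = f(x)
    F = F_jac(x)
    P_pred = F @ P @ F.T + Q
    # Update with measurement z
    H = H_jac(x_pred)
    y = z - h(x_pred)                          # innovation
    S = H @ P_pred @ H.T + R                   # innovation covariance
    K = P_pred @ H.T @ np.linalg.inv(S)        # Kalman gain
    x_new = x_pred + K @ y
    P_new = (np.eye(len(x)) - K @ H) @ P_pred
    return x_new, P_new
```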
- In-Distribution Barrier Functions: Self-Supervised Policy Filters that Avoid Out-of-Distribution States [84.24300005271185]
We propose a control filter that wraps any reference policy and effectively encourages the system to stay in-distribution with respect to offline-collected safe demonstrations.
Our method is effective for two different visuomotor control tasks in simulation environments, including both top-down and egocentric view settings.
arXiv Detail & Related papers (2023-01-27T22:28:19Z)
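A conceptual version of such a filter is sketched below, using nearest-neighbor distance to offline demonstrations as a stand-in for the paper's learned in-distribution barrier function; the dynamics model, threshold, and candidate sampling are all assumptions.

```python
import numpy as np

def in_distribution_filter(x, u_ref, dynamics, demo_states, eps=0.5, n_cand=64):
    """Conceptual sketch: keep the reference action if the predicted next
    state stays near offline-collected demonstrations (nearest-neighbor
    distance stands in for the paper's learned barrier function); otherwise
    choose the candidate action whose prediction is closest to the data."""
    def dist_to_data(x_next):
        return np.min(np.linalg.norm(demo_states - x_next, axis=1))

    if dist_to_data(dynamics(x, u_ref)) <= eps:
        return u_ref                     # reference action is in-distribution
    rng = np.random.default_rng(0)
    candidates = u_ref + 0.2 * rng.standard_normal((n_cand, u_ref.size))
    return min(candidates, key=lambda u: dist_to_data(dynamics(x, u)))
```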
- Convex Programs and Lyapunov Functions for Reinforcement Learning: A Unified Perspective on the Analysis of Value-Based Methods [3.9391112596932243]
Value-based methods play a fundamental role in Markov decision processes (MDPs) and reinforcement learning (RL).
We present a unified control-theoretic framework for analyzing value-based methods such as value computation (VC), value iteration (VI), and temporal difference (TD) learning.
arXiv Detail & Related papers (2022-02-14T18:32:57Z)
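For concreteness, a plain tabular value-iteration loop, one of the methods the paper analyzes, is sketched below on a small random MDP; the MDP itself is placeholder data, and the paper studies such iterations through convex programs and Lyapunov functions rather than implementing them.

```python
import numpy as np

# Standard tabular value iteration on a small random MDP (placeholder data).
rng = np.random.default_rng(0)
nS, nA, gamma = 5, 3, 0.9
P = rng.dirichlet(np.ones(nS), size=(nS, nA))  # P[s, a] is a distribution over s'
R = rng.standard_normal((nS, nA))              # rewards r(s, a)

V = np.zeros(nS)
for _ in range(500):
    Q = R + gamma * P @ V                      # Q[s, a] = r(s, a) + gamma * E[V(s')]
    V_new = Q.max(axis=1)                      # Bellman optimality backup
    done = np.max(np.abs(V_new - V)) < 1e-8    # the backup is a sup-norm contraction
    V = V_new
    if done:
        break
print("optimal values:", V)
```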
- Policy Search for Model Predictive Control with Application to Agile Drone Flight [56.24908013905407]
We propose a policy-search framework for model predictive control (MPC).
Specifically, we formulate the MPC as a parameterized controller, where the hard-to-optimize decision variables are represented as high-level policies.
Experiments show that our controller achieves robust and real-time control performance in both simulation and the real world.
arXiv Detail & Related papers (2021-12-07T17:39:24Z)
- Deep Learning Approximation of Diffeomorphisms via Linear-Control Systems [91.3755431537592]
We consider a control system of the form $\dot{x} = \sum_{i=1}^{l} F_i(x)\, u_i$, with linear dependence on the controls.
We use the corresponding flow to approximate the action of a diffeomorphism on a compact ensemble of points.
arXiv Detail & Related papers (2021-10-24T08:57:46Z)
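The flow of such a system is easy to simulate; the sketch below pushes an ensemble of points through $\dot{x} = \sum_{i=1}^{l} F_i(x) u_i$ with forward-Euler steps, the basic mechanism behind approximating a diffeomorphism's action on the ensemble. The specific vector fields and controls are illustrative assumptions.

```python
import numpy as np

# Flow an ensemble of 2-D points along dot(x) = F1(x)*u1 + F2(x)*u2 with
# forward Euler (illustrative fields and piecewise-constant controls; the
# paper learns the controls so the flow matches a target diffeomorphism).
def F1(x):
    return np.stack([-x[:, 1], x[:, 0]], axis=1)    # rotation vector field

def F2(x):
    return x                                         # dilation vector field

def flow(points, controls, dt=0.01):
    x = points.copy()
    for u1, u2 in controls:                          # piecewise-constant controls
        x = x + dt * (u1 * F1(x) + u2 * F2(x))       # Euler step of the flow
    return x

ensemble = np.random.default_rng(0).standard_normal((100, 2))
controls = [(1.0, 0.1)] * 200                        # l = 2 control channels
print(flow(ensemble, controls)[:3])                  # images of the first points
```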
- Deep Learning Explicit Differentiable Predictive Control Laws for Buildings [1.4121977037543585]
We present a differentiable predictive control (DPC) methodology for learning constrained control laws for unknown nonlinear systems.
DPC provides an approximate solution to the multiparametric programming problems emerging from explicit nonlinear model predictive control (MPC).
arXiv Detail & Related papers (2021-07-25T16:47:57Z)
- Imitation Learning from MPC for Quadrupedal Multi-Gait Control [63.617157490920505]
We present a learning algorithm for training a single policy that imitates multiple gaits of a walking robot.
We use and extend MPC-Net, which is an Imitation Learning approach guided by Model Predictive Control.
We validate our approach on hardware and show that a single learned policy can replace its teacher to control multiple gaits.
arXiv Detail & Related papers (2021-03-26T08:48:53Z)
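A heavily stripped-down version of the imitation step is sketched below: fit a linear state-feedback policy by least squares to state-action pairs from a synthetic "MPC teacher". MPC-Net itself trains a neural policy with a Hamiltonian-based loss rather than plain behavior cloning, so this only conveys the flavor of learning from an MPC demonstrator.

```python
import numpy as np

# Behavior-cloning sketch: fit u = K x to (state, action) pairs from an MPC
# teacher (synthetic here; MPC-Net uses a Hamiltonian-based loss instead).
rng = np.random.default_rng(0)
K_teacher = np.array([[-1.2, -0.4]])            # pretend the MPC acts like this gain
X = rng.standard_normal((500, 2))               # states visited by the teacher
U = X @ K_teacher.T + 0.01 * rng.standard_normal((500, 1))  # teacher actions

K_fit, *_ = np.linalg.lstsq(X, U, rcond=None)   # least-squares imitation
print("recovered gain:", K_fit.T)               # approximately K_teacher
```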
- Extended Radial Basis Function Controller for Reinforcement Learning [3.42658286826597]
This paper proposes a hybrid reinforcement learning controller which dynamically interpolates a model-based linear controller and an arbitrary differentiable policy.
The linear controller is designed based on local linearised model knowledge, and stabilises the system in a neighbourhood about an operating point.
Learning has been done on both model-based (PILCO) and model-free (DDPG) frameworks.
arXiv Detail & Related papers (2020-09-12T20:56:48Z)
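The interpolation idea admits a compact sketch: a Gaussian RBF weight centered at the operating point blends the stabilizing local gain with a learned policy, so the linear controller dominates near the operating point. The weighting, gain, and policy below are assumptions, not the paper's exact construction.

```python
import numpy as np

def hybrid_control(x, x_op, K_lin, policy, width=1.0):
    """Blend a local linear controller with a learned policy via a Gaussian
    RBF weight (illustrative; not the paper's exact scheme). Near the
    operating point x_op the stabilizing linear feedback dominates."""
    w = np.exp(-np.sum((x - x_op) ** 2) / (2.0 * width ** 2))
    u_lin = -K_lin @ (x - x_op)              # local stabilizing feedback
    return w * u_lin + (1.0 - w) * policy(x)

# Usage with a placeholder differentiable policy:
x_op = np.zeros(2)
K_lin = np.array([[2.0, 1.0]])
policy = lambda x: np.tanh(np.array([0.5 * x.sum()]))
print(hybrid_control(np.array([0.3, -0.1]), x_op, K_lin, policy))
```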
- Reinforcement Learning based Design of Linear Fixed Structure Controllers [3.131740922192114]
We present a simple finite-difference approach, based on random search, to tuning linear fixed-structure controllers.
Our algorithm operates on the entire closed-loop step response of the system and iteratively improves the PID gains towards a desired closed-loop response.
arXiv Detail & Related papers (2020-05-10T00:53:11Z)
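The loop described above is simple enough to sketch end to end: simulate the closed-loop step response for candidate PID gains, score the deviation from the reference, and keep random perturbations that improve the score. The toy first-order plant and the accept-if-better variant of random search are assumptions; the paper uses finite-difference gradient estimates.

```python
import numpy as np

def step_response_cost(gains, T=200, dt=0.05):
    """Simulate a PID loop on a toy first-order plant (y' = -y + u) and
    score the unit-step response (placeholder plant and reference)."""
    kp, ki, kd = gains
    y, integ, prev_e, cost = 0.0, 0.0, 1.0, 0.0
    for _ in range(T):
        e = 1.0 - y                          # error to the unit step reference
        integ += e * dt
        u = kp * e + ki * integ + kd * (e - prev_e) / dt
        prev_e = e
        y += dt * (-y + u)                   # plant step
        cost += e * e * dt                   # integrated squared error
    return cost

# Random-search tuning: keep a perturbation only if it improves the response.
rng = np.random.default_rng(0)
gains = np.array([1.0, 0.1, 0.01])
best = step_response_cost(gains)
for _ in range(300):
    cand = gains + 0.05 * rng.standard_normal(3)
    score = step_response_cost(cand)
    if score < best:
        gains, best = cand, score
print("tuned PID gains:", gains, "cost:", best)
```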
- Learning to Control PDEs with Differentiable Physics [102.36050646250871]
We present a novel hierarchical predictor-corrector scheme which enables neural networks to learn to understand and control complex nonlinear physical systems over long time frames.
We demonstrate that our method successfully develops an understanding of complex physical systems and learns to control them for tasks involving PDEs.
arXiv Detail & Related papers (2020-01-21T11:58:41Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.