End-to-End Stable Imitation Learning via Autonomous Neural Dynamic
Policies
- URL: http://arxiv.org/abs/2305.12886v1
- Date: Mon, 22 May 2023 10:10:23 GMT
- Title: End-to-End Stable Imitation Learning via Autonomous Neural Dynamic
Policies
- Authors: Dionis Totsila, Konstantinos Chatzilygeroudis, Denis Hadjivelichkov,
Valerio Modugno, Ioannis Hatzilygeroudis, Dimitrios Kanoulas
- Abstract summary: State-of-the-art sensorimotor learning algorithms offer policies that can often produce unstable behaviors.
Traditional robot learning relies on dynamical system-based policies that can be analyzed for stability/safety.
In this work, we bridge the gap between generic neural network policies and dynamical system-based policies.
- Score: 2.7941001040182765
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: State-of-the-art sensorimotor learning algorithms offer policies that can
often produce unstable behaviors, damaging the robot and/or the environment.
Traditional robot learning, on the contrary, relies on dynamical system-based
policies that can be analyzed for stability/safety. Such policies, however, are
neither flexible nor generic and usually work only with proprioceptive sensor
states. In this work, we bridge the gap between generic neural network policies
and dynamical system-based policies, and we introduce Autonomous Neural Dynamic
Policies (ANDPs) that: (a) are based on autonomous dynamical systems, (b)
always produce asymptotically stable behaviors, and (c) are more flexible than
traditional stable dynamical system-based policies. ANDPs are fully
differentiable, flexible, generic policies that can be used in imitation
learning setups while ensuring asymptotic stability. In this paper, we explore
the flexibility and capacity of ANDPs in several imitation learning tasks
including experiments with image observations. The results show that ANDPs
combine the benefits of both neural network-based and dynamical system-based
methods.
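To make the abstract's stability claim concrete, the sketch below shows one standard way to build a policy that is asymptotically stable by construction, in the spirit of ANDPs: the network only parameterizes a positive-definite matrix, so the resulting vector field always descends a quadratic Lyapunov function. This is a minimal illustration, not the authors' exact architecture; all names, dimensions, and constants are assumptions.

```python
# Minimal sketch (PyTorch) of a stable-by-construction autonomous policy.
# Assumed structure: dx/dt = -A(o) (x - x_goal), with A(o) = L(o) L(o)^T + eps*I
# positive definite for any network weights, so stability survives training.
import torch
import torch.nn as nn

class StablePolicy(nn.Module):
    def __init__(self, obs_dim, state_dim, eps=1e-3):
        super().__init__()
        self.state_dim, self.eps = state_dim, eps
        # The network predicts the entries of a lower-triangular matrix L
        # from the observation (which could be an image embedding).
        self.net = nn.Sequential(
            nn.Linear(obs_dim, 64), nn.ReLU(),
            nn.Linear(64, state_dim * state_dim),
        )

    def forward(self, obs, x, x_goal):
        L = self.net(obs).view(-1, self.state_dim, self.state_dim).tril()
        # For V(x) = ||x - x_goal||^2 this gives
        # dV/dt = -2 (x - x_goal)^T A (x - x_goal) <= -2*eps*V, hence
        # asymptotic convergence to x_goal regardless of the weights.
        A = L @ L.transpose(-1, -2) + self.eps * torch.eye(self.state_dim)
        return -(A @ (x - x_goal).unsqueeze(-1)).squeeze(-1)  # velocity dx/dt

# Training regresses predicted velocities onto demonstrated velocities;
# imitation can shape the behavior but can never break stability.
policy = StablePolicy(obs_dim=16, state_dim=3)
obs, x, goal = torch.randn(8, 16), torch.randn(8, 3), torch.zeros(8, 3)
xdot = policy(obs, x, goal)  # shape (8, 3)
```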
Related papers
- Learning Deep Dissipative Dynamics [5.862431328401459]
Dissipativity is a crucial indicator for dynamical systems that generalizes stability and input-output stability.
We propose a differentiable projection that transforms any dynamics represented by neural networks into dissipative ones.
Our method strictly guarantees stability, input-output stability, and energy conservation of trained dynamical systems.
arXiv Detail & Related papers (2024-08-21T09:44:43Z)
- Neural Contractive Dynamical Systems [13.046426079291376]
Stability guarantees are crucial when ensuring a fully autonomous robot does not take undesirable or potentially harmful actions.
We propose a novel methodology to learn neural contractive dynamical systems, where our neural architecture ensures contraction.
We show that our approach encodes the desired dynamics more accurately than the current state of the art, which provides weaker stability guarantees.
arXiv Detail & Related papers (2024-01-17T17:18:21Z)
- Quantification before Selection: Active Dynamics Preference for Robust Reinforcement Learning [5.720802072821204]
We introduce Active Dynamics Preference (ADP), which quantifies the informativeness and density of sampled system parameters.
We validate our approach in four robotic locomotion tasks with various discrepancies between the training and testing environments.
arXiv Detail & Related papers (2022-09-23T13:59:55Z)
- Learning Robust Policy against Disturbance in Transition Dynamics via State-Conservative Policy Optimization [63.75188254377202]
Deep reinforcement learning algorithms can perform poorly in real-world tasks due to the discrepancy between source and target environments.
We propose State-Conservative Policy Optimization (SCPO), a model-free actor-critic algorithm that learns robust policies without modeling the disturbance in advance.
Experiments in several robot control tasks demonstrate that SCPO learns robust policies against the disturbance in transition dynamics.
arXiv Detail & Related papers (2021-12-20T13:13:05Z)
- Hierarchical Neural Dynamic Policies [50.969565411919376]
We tackle the problem of generalization to unseen configurations for dynamic tasks in the real world while learning from high-dimensional image input.
We use a hierarchical deep policy learning framework called Hierarchical Neural Dynamic Policies (H-NDPs).
H-NDPs form a curriculum by learning local dynamical system-based policies on small regions in state space.
We show that H-NDPs are easily integrated with both imitation as well as reinforcement learning setups and achieve state-of-the-art results.
arXiv Detail & Related papers (2021-07-12T17:59:58Z)
- Value Iteration in Continuous Actions, States and Time [99.00362538261972]
We propose a continuous fitted value iteration (cFVI) algorithm for continuous states and actions.
The optimal policy can be derived for non-linear control-affine dynamics.
Videos of the physical system are available at https://sites.google.com/view/value-iteration.
arXiv Detail & Related papers (2021-05-10T21:40:56Z)
- Structured Policy Representation: Imposing Stability in Arbitrarily Conditioned Dynamic Systems [24.11609722217645]
We present a new family of deep neural network-based dynamic systems.
The presented dynamics are globally stable and can be conditioned with an arbitrary context state.
We show how these dynamics can be used as structured robot policies.
arXiv Detail & Related papers (2020-12-11T10:11:32Z)
- Neural Dynamic Policies for End-to-End Sensorimotor Learning [51.24542903398335]
The current dominant paradigm in sensorimotor control, whether imitation or reinforcement learning, is to train policies directly in raw action spaces.
We propose Neural Dynamic Policies (NDPs) that make predictions in trajectory distribution space.
NDPs outperform the prior state-of-the-art in terms of either efficiency or performance across several robotic control tasks.
arXiv Detail & Related papers (2020-12-04T18:59:32Z)
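As a concrete illustration of what "predictions in trajectory distribution space" means in the NDP entry above, the following simplified one-dimensional sketch integrates a dynamic movement primitive (DMP): the policy network would output the goal g and basis weights w, and the rollout turns them into a smooth trajectory. The formulation and constants are standard DMP assumptions, not the paper's exact implementation.

```python
# Simplified 1-D DMP rollout: a network would predict (g, w) from an
# image/state; the dynamical system below converts them into a trajectory.
import numpy as np

def dmp_rollout(y0, g, w, T=100, dt=0.01, alpha=25.0, beta=6.25, gamma=3.0):
    y, yd, x = float(y0), 0.0, 1.0
    centers = np.exp(-gamma * np.linspace(0, 1, len(w)))  # basis centers along the phase
    traj = []
    for _ in range(T):
        psi = np.exp(-50.0 * (x - centers) ** 2)            # RBF basis activations
        f = (psi @ w) / (psi.sum() + 1e-8) * x * (g - y0)   # learned forcing term
        ydd = alpha * (beta * (g - y) - yd) + f             # spring-damper + forcing
        yd += ydd * dt
        y += yd * dt
        x += -gamma * x * dt                                # canonical phase decays to 0
        traj.append(y)
    return np.array(traj)

# Here we fake the network's output with random basis weights.
g, w = 1.0, np.random.randn(10)
trajectory = dmp_rollout(y0=0.0, g=g, w=w)  # approaches g as the forcing vanishes
```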
- Active Learning for Nonlinear System Identification with Guarantees [102.43355665393067]
We study a class of nonlinear dynamical systems whose state transitions depend linearly on a known feature embedding of state-action pairs.
We propose an active learning approach that achieves this by repeating three steps: trajectory planning, trajectory tracking, and re-estimation of the system from all available data.
We show that our method estimates nonlinear dynamical systems at a parametric rate, similar to the statistical rate of standard linear regression.
arXiv Detail & Related papers (2020-06-18T04:54:11Z)
- Learning Stable Deep Dynamics Models [91.90131512825504]
We propose an approach for learning dynamical systems that are guaranteed to be stable over the entire state space.
We show that such learning systems are able to model simple dynamical systems and can be combined with additional deep generative models to learn complex dynamics.
arXiv Detail & Related papers (2020-01-17T00:04:45Z)
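The projection behind "Learning Stable Deep Dynamics Models" can be stated compactly: an unconstrained learned vector field f is corrected so that a Lyapunov function V strictly decreases along trajectories. The sketch below uses a fixed quadratic V for brevity; the paper additionally learns V (as an input-convex network), which is omitted here.

```python
# Sketch of a Lyapunov projection: correct f so that dV/dt <= -alpha*V holds
# everywhere by construction. V(x) = ||x||^2 is a simplifying assumption.
import torch
import torch.nn as nn

f = nn.Sequential(nn.Linear(2, 32), nn.Tanh(), nn.Linear(32, 2))  # unconstrained dynamics
alpha = 0.1

def stable_f(x):
    V = (x * x).sum(-1, keepdim=True)      # V(x) = ||x||^2
    V_grad = 2.0 * x                       # grad V = 2x
    fx = f(x)
    # Amount by which f violates dV/dt <= -alpha*V, if any.
    viol = torch.relu((V_grad * fx).sum(-1, keepdim=True) + alpha * V)
    # Subtract the violating component along grad V; when viol > 0 this
    # yields exactly dV/dt = -alpha*V, otherwise f is left unchanged.
    return fx - V_grad * viol / ((V_grad * V_grad).sum(-1, keepdim=True) + 1e-8)

x = torch.randn(16, 2)
print(stable_f(x).shape)  # (16, 2); V decreases along trajectories by construction
```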