Neural Dynamic Policies for End-to-End Sensorimotor Learning
- URL: http://arxiv.org/abs/2012.02788v1
- Date: Fri, 4 Dec 2020 18:59:32 GMT
- Title: Neural Dynamic Policies for End-to-End Sensorimotor Learning
- Authors: Shikhar Bahl, Mustafa Mukadam, Abhinav Gupta, Deepak Pathak
- Abstract summary: The current dominant paradigm in sensorimotor control, whether imitation or reinforcement learning, is to train policies directly in raw action spaces.
We propose Neural Dynamic Policies (NDPs) that make predictions in trajectory distribution space.
NDPs outperform the prior state-of-the-art in terms of either efficiency or performance across several robotic control tasks.
- Score: 51.24542903398335
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The current dominant paradigm in sensorimotor control, whether imitation or
reinforcement learning, is to train policies directly in raw action spaces such
as torque, joint angle, or end-effector position. This forces the agent to make
decisions individually at each timestep in training, and hence, limits the
scalability to continuous, high-dimensional, and long-horizon tasks. In
contrast, research in classical robotics has, for a long time, exploited
dynamical systems as a policy representation to learn robot behaviors via
demonstrations. These techniques, however, lack the flexibility and
generalizability provided by deep learning or reinforcement learning and have
remained under-explored in such settings. In this work, we begin to close this
gap and embed the structure of a dynamical system into deep neural
network-based policies by reparameterizing action spaces via second-order
differential equations. We propose Neural Dynamic Policies (NDPs) that make
predictions in trajectory distribution space as opposed to prior policy
learning methods where actions represent the raw control space. The embedded
structure allows end-to-end policy learning for both reinforcement and
imitation learning setups. We show that NDPs outperform the prior
state-of-the-art in terms of either efficiency or performance across several
robotic control tasks for both imitation and reinforcement learning setups.
Project video and code are available at
https://shikharbahl.github.io/neural-dynamic-policies/
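The reparameterization described in the abstract can be illustrated with a dynamic-movement-primitive-style rollout: instead of emitting raw torques or positions at each timestep, a policy network outputs the parameters of a second-order dynamical system (a goal and forcing-term weights), which is then integrated to produce a whole trajectory. The sketch below is an assumption-laden illustration of that idea, not the paper's implementation; the gains, basis-function choices, and Euler integration scheme are all stand-ins.

```python
import numpy as np

def rollout_dmp(goal, weights, y0, alpha=25.0, beta=6.25, tau=1.0, dt=0.01, n_steps=100):
    """Integrate a second-order system (DMP-style) into a 1-D trajectory.

    The forcing term f is a weighted sum of Gaussian basis functions of a
    phase variable x that decays from 1 to 0 (the canonical system).
    All gains and basis choices here are illustrative, not the paper's.
    """
    n_basis = len(weights)
    centers = np.exp(-3.0 * np.linspace(0, 1, n_basis))  # basis centers in phase space
    widths = n_basis ** 1.5 / centers                    # heuristic basis widths
    y, dy, x = y0, 0.0, 1.0                              # position, velocity, phase
    traj = []
    for _ in range(n_steps):
        psi = np.exp(-widths * (x - centers) ** 2)       # basis activations
        f = x * (goal - y0) * (psi @ weights) / (psi.sum() + 1e-10)  # forcing term
        ddy = alpha * (beta * (goal - y) - dy) + f       # second-order ODE
        dy += ddy * dt / tau                             # semi-implicit Euler step
        y += dy * dt / tau
        x += -2.0 * x * dt / tau                         # phase decay
        traj.append(y)
    return np.array(traj)

# In an NDP-style policy, a network would map an observation to (goal, weights);
# here fixed values stand in for that network's output.
traj = rollout_dmp(goal=1.0, weights=np.zeros(10), y0=0.0)
```

With zero forcing weights the system reduces to a critically damped spring, so the trajectory converges smoothly to the goal; nonzero weights shape the path taken. Because the whole trajectory is a differentiable function of (goal, weights), gradients can flow from a trajectory-level loss back into the network, which is what enables end-to-end imitation or reinforcement learning through this structure.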
Related papers
- Nonprehensile Planar Manipulation through Reinforcement Learning with Multimodal Categorical Exploration [8.343657309038285]
Reinforcement learning is a powerful framework for developing robot controllers.
We propose a multimodal exploration approach through categorical distributions, which enables us to train planar pushing RL policies.
We show that the learned policies are robust to external disturbances and observation noise, and scale to tasks with multiple pushers.
arXiv Detail & Related papers (2023-08-04T16:55:00Z)
- Lie Group Forced Variational Integrator Networks for Learning and Control of Robot Systems [14.748599534387688]
We introduce a new structure-preserving deep learning architecture capable of learning controlled Lagrangian or Hamiltonian dynamics on Lie groups.
LieFVINs preserve both the Lie group structure on which the dynamics evolve and the symplectic structure underlying the Hamiltonian or Lagrangian systems of interest.
arXiv Detail & Related papers (2022-11-29T08:14:05Z)
- Learning Deep Sensorimotor Policies for Vision-based Autonomous Drone Racing [52.50284630866713]
Existing systems often require hand-engineered components for state estimation, planning, and control.
This paper tackles the vision-based autonomous-drone-racing problem by learning deep sensorimotor policies.
arXiv Detail & Related papers (2022-10-26T19:03:17Z)
- Silver-Bullet-3D at ManiSkill 2021: Learning-from-Demonstrations and Heuristic Rule-based Methods for Object Manipulation [118.27432851053335]
This paper presents an overview and comparative analysis of our systems designed for the following two tracks in SAPIEN ManiSkill Challenge 2021: No Interaction Track.
The No Interaction track targets learning policies from pre-collected demonstration trajectories.
In this track, we design a Heuristic Rule-based Method (HRM) to achieve high-quality object manipulation by decomposing the task into a series of sub-tasks.
For each sub-task, simple rule-based control strategies are adopted to predict actions that can be applied to robotic arms.
arXiv Detail & Related papers (2022-06-13T16:20:42Z)
- Learning Robust Policy against Disturbance in Transition Dynamics via State-Conservative Policy Optimization [63.75188254377202]
Deep reinforcement learning algorithms can perform poorly in real-world tasks due to the discrepancy between source and target environments.
We propose a novel model-free actor-critic algorithm to learn robust policies without modeling the disturbance in advance.
Experiments in several robot control tasks demonstrate that SCPO learns policies that are robust to disturbances in the transition dynamics.
arXiv Detail & Related papers (2021-12-20T13:13:05Z)
- An Adaptable Approach to Learn Realistic Legged Locomotion without Examples [38.81854337592694]
This work proposes a generic approach for ensuring realism in locomotion by guiding the learning process with the spring-loaded inverted pendulum model as a reference.
We present experimental results showing that even in a model-free setup, the learned policies can generate realistic and energy-efficient locomotion gaits for a bipedal and a quadrupedal robot.
arXiv Detail & Related papers (2021-10-28T10:14:47Z)
- Hierarchical Neural Dynamic Policies [50.969565411919376]
We tackle the problem of generalization to unseen configurations for dynamic tasks in the real world while learning from high-dimensional image input.
We use a hierarchical deep policy learning framework called Hierarchical Neural Dynamic Policies (H-NDPs).
H-NDPs form a curriculum by learning local dynamical-system-based policies on small regions of the state space.
We show that H-NDPs are easily integrated with both imitation as well as reinforcement learning setups and achieve state-of-the-art results.
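The curriculum idea summarized above can be sketched in a few lines: fit a simple local policy per small region of the state space, then distill them into one global policy. This is a hypothetical illustration only; the region assignment (quantile binning on one state dimension) and the linear "policies" are stand-in assumptions, not the actual H-NDP method, which uses dynamical-system policies as in the main paper.

```python
import numpy as np

def assign_regions(states, n_regions):
    # Bin states by their first dimension as a crude stand-in for a
    # proper partition of the demonstration state space into regions.
    edges = np.quantile(states[:, 0], np.linspace(0, 1, n_regions + 1))
    return np.clip(np.searchsorted(edges, states[:, 0]) - 1, 0, n_regions - 1)

def fit_local_policies(states, targets, labels, n_regions):
    # One least-squares map per region: state -> trajectory parameters.
    return [np.linalg.lstsq(states[labels == r], targets[labels == r], rcond=None)[0]
            for r in range(n_regions)]

def distill_global(states, local_maps, labels):
    # The global policy regresses onto each local policy's output in its region.
    local_out = np.vstack([states[i] @ local_maps[labels[i]] for i in range(len(states))])
    return np.linalg.lstsq(states, local_out, rcond=None)[0]

rng = np.random.default_rng(0)
W_true = np.array([[1.0, -0.5], [0.3, 2.0], [-1.0, 0.1]])  # synthetic ground truth
states = rng.normal(size=(200, 3))
targets = states @ W_true                                   # noiseless demonstrations
labels = assign_regions(states, n_regions=4)
local_maps = fit_local_policies(states, targets, labels, n_regions=4)
W_global = distill_global(states, local_maps, labels)
```

The point of the curriculum is that each local fit is an easy, well-conditioned problem, and the global policy only has to match behaviors that the local policies have already gotten right.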
arXiv Detail & Related papers (2021-07-12T17:59:58Z)
- Deep Imitation Learning for Bimanual Robotic Manipulation [70.56142804957187]
We present a deep imitation learning framework for robotic bimanual manipulation.
A core challenge is to generalize the manipulation skills to objects in different locations.
We propose to (i) decompose the multi-modal dynamics into elemental movement primitives, (ii) parameterize each primitive using a recurrent graph neural network to capture interactions, and (iii) integrate a high-level planner that composes primitives sequentially and a low-level controller to combine primitive dynamics and inverse kinematics control.
arXiv Detail & Related papers (2020-10-11T01:40:03Z)
- On Simple Reactive Neural Networks for Behaviour-Based Reinforcement Learning [5.482532589225552]
We present a behaviour-based reinforcement learning approach, inspired by Brooks's subsumption architecture.
Our working assumption is that a pick and place robotic task can be simplified by leveraging domain knowledge of a robotics developer.
Our approach learns the pick-and-place task in 8,000 episodes, a drastic reduction compared to the number of training episodes required by an end-to-end approach.
arXiv Detail & Related papers (2020-01-22T11:49:52Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.