Imitation Learning via Differentiable Physics
- URL: http://arxiv.org/abs/2206.04873v1
- Date: Fri, 10 Jun 2022 04:54:32 GMT
- Title: Imitation Learning via Differentiable Physics
- Authors: Siwei Chen, Xiao Ma, Zhongwen Xu
- Abstract summary: Imitation learning (IL) methods such as inverse reinforcement learning (IRL) usually have a double-loop training process.
We propose a new IL method, i.e., Imitation Learning via Differentiable Physics (ILD), which gets rid of the double-loop design.
ILD achieves significant improvements in final performance, convergence speed, and stability.
- Score: 26.356669151969953
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Existing imitation learning (IL) methods such as inverse reinforcement
learning (IRL) usually have a double-loop training process, alternating between
learning a reward function and a policy, and tend to suffer from long training
times and high variance. In this work, we identify the benefits of differentiable
physics simulators and propose a new IL method, i.e., Imitation Learning via
Differentiable Physics (ILD), which gets rid of the double-loop design and
achieves significant improvements in final performance, convergence speed, and
stability. The proposed ILD incorporates the differentiable physics simulator
as a physics prior into its computational graph for policy learning. It unrolls
the dynamics by sampling actions from a parameterized policy, minimizes the
distance between the expert trajectory and the agent trajectory, and
back-propagates the gradient into the policy through the temporal physics
operators.
With the physics prior, ILD policies not only transfer to unseen
environment specifications but also yield higher final performance on a variety
of tasks. In addition, ILD naturally forms a single-loop structure, which
significantly improves the stability and training speed. To simplify the
complex optimization landscape induced by temporal physics operations, ILD
dynamically selects the learning objectives for each state during optimization.
In our experiments, we show that ILD outperforms state-of-the-art methods in a
variety of continuous control tasks with Brax, requiring only one expert
demonstration. In addition, ILD can be applied to challenging deformable object
manipulation tasks and can be generalized to unseen configurations.
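To make the single-loop structure concrete, here is a minimal JAX sketch (the paper's experiments use Brax, which is JAX-based). The linear policy, the toy `step` dynamics, and the fixed squared trajectory distance are illustrative assumptions rather than the authors' implementation; in particular, ILD's dynamic per-state objective selection is collapsed here into a plain distance loss.

```python
import jax
import jax.numpy as jnp

def policy(params, state):
    # Linear policy for illustration; ILD uses a neural network.
    return jnp.tanh(params["W"] @ state + params["b"])

def step(state, action):
    # Stand-in for one differentiable physics step (e.g. one Brax step);
    # assumes action and state share a dimension for simplicity.
    return state + 0.01 * (jnp.tanh(state) + action)

def rollout(params, init_state, horizon):
    def body(state, _):
        next_state = step(state, policy(params, state))
        return next_state, next_state
    _, traj = jax.lax.scan(body, init_state, None, length=horizon)
    return traj  # (horizon, state_dim)

def imitation_loss(params, init_state, expert_traj):
    # Distance between the agent rollout and the single expert
    # demonstration; gradients flow back through every simulation step.
    traj = rollout(params, init_state, expert_traj.shape[0])
    return jnp.mean(jnp.sum((traj - expert_traj) ** 2, axis=-1))

@jax.jit
def update(params, init_state, expert_traj, lr=1e-2):
    # Single loop: one gradient step on the policy, with no inner
    # stage that learns a reward function.
    grads = jax.grad(imitation_loss)(params, init_state, expert_traj)
    return jax.tree_util.tree_map(lambda p, g: p - lr * g, params, grads)
```

Training repeats `update` until the rollout tracks the demonstration; this single gradient loop is what replaces IRL's alternation between reward learning and policy learning.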
Related papers
- Physics Informed Deep Learning for Strain Gradient Continuum Plasticity [0.0]
We use a space-time discretization based on physics informed deep learning to approximate solutions of rate-dependent strain gradient plasticity models.
Taking inspiration from physics informed neural networks, we modify the loss function of a PIDL model in several novel ways.
We show how PIDL methods could address the computational challenges posed by strain gradient plasticity models.
arXiv Detail & Related papers (2024-08-13T06:02:05Z)
- DiffMimic: Efficient Motion Mimicking with Differentiable Physics [41.442225872857136]
We leverage differentiable physics simulators (DPS) and propose an efficient motion mimicking method dubbed DiffMimic.
Our key insight is that DPS casts a complex policy learning task to a much simpler state matching problem.
Extensive experiments on standard benchmarks show that DiffMimic has a better sample efficiency and time efficiency than existing methods.
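In notation chosen here for illustration (not taken from the paper), the state-matching view reduces policy learning to minimizing a differentiable trajectory error:

\[
\min_{\theta}\; \sum_{t=1}^{T} \left\| s_t^{\theta} - \hat{s}_t \right\|^2
\quad \text{s.t.} \quad
s_{t+1}^{\theta} = f\!\left(s_t^{\theta}, \pi_{\theta}(s_t^{\theta})\right),
\]

where \(f\) is the differentiable simulator, \(\pi_\theta\) the policy, and \(\hat{s}_t\) the reference motion states; the matching error is differentiated analytically through \(f\) rather than estimated from sampled returns.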
arXiv Detail & Related papers (2023-04-06T17:56:22Z)
- Complex Locomotion Skill Learning via Differentiable Physics [30.868690308658174]
Differentiable physics enables efficient gradient-based optimization of neural network (NN) controllers.
We present a practical learning framework that outputs unified NN controllers capable of performing tasks with significantly improved complexity and diversity.
arXiv Detail & Related papers (2022-06-06T04:01:12Z)
- Accelerated Policy Learning with Parallel Differentiable Simulation [59.665651562534755]
We present a differentiable simulator and a new policy learning algorithm (SHAC).
Our algorithm alleviates problems with local minima through a smooth critic function.
We show substantial improvements in sample efficiency and wall-clock time over state-of-the-art RL and differentiable simulation-based algorithms.
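A minimal JAX sketch of the short-horizon idea, under assumptions made here (the toy dynamics, linear policy and critic, and reward are placeholders, not the paper's code): the policy is rolled through the differentiable simulator for a short window, and a learned critic summarizes the remaining return.

```python
import jax
import jax.numpy as jnp

def actor_objective(policy_params, critic_params, state, horizon=16):
    # Roll the policy through a toy differentiable simulator for a
    # short window, accumulating reward along the way.
    def body(carry, _):
        s, ret = carry
        a = jnp.tanh(policy_params["W"] @ s + policy_params["b"])
        s_next = s + 0.01 * (jnp.tanh(s) + a)  # stand-in dynamics step
        return (s_next, ret + (-jnp.sum(s_next ** 2))), None  # toy reward
    (s_end, ret), _ = jax.lax.scan(body, (state, jnp.zeros(())), None,
                                   length=horizon)
    # A smooth critic caps the truncated rollout: it shortens the
    # backpropagation path and smooths the optimization landscape.
    value = critic_params["w"] @ s_end + critic_params["b"]
    return -(ret + value)

# Analytic policy gradient through the simulator steps and the critic:
actor_grad = jax.jit(jax.grad(actor_objective), static_argnames="horizon")
```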
arXiv Detail & Related papers (2022-04-14T17:46:26Z)
- Gradient-Based Trajectory Optimization With Learned Dynamics [80.41791191022139]
We use machine learning techniques to learn a differentiable dynamics model of the system from data.
We show that a neural network can model highly nonlinear behaviors accurately for large time horizons.
In our hardware experiments, we demonstrate that our learned model can represent complex dynamics for both the Spot robot and a radio-controlled (RC) car.
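A sketch of that recipe under assumptions made here (the `model` callable, cost terms, and step sizes are illustrative, not the paper's): once a differentiable dynamics model is fit to transition data, the action sequence itself can be optimized by descending the gradient of a task cost through the model.

```python
import jax
import jax.numpy as jnp

def traj_cost(actions, init_state, model, goal):
    # Roll the learned model forward under the candidate action sequence.
    def body(s, a):
        s_next = model(s, a)  # learned, differentiable dynamics
        return s_next, s_next
    final_state, _ = jax.lax.scan(body, init_state, actions)
    # Terminal goal error plus a small control-effort penalty.
    return jnp.sum((final_state - goal) ** 2) + 1e-3 * jnp.sum(actions ** 2)

def optimize_trajectory(actions, init_state, model, goal, lr=0.1, iters=100):
    cost_grad = jax.grad(traj_cost)  # gradient w.r.t. the actions
    for _ in range(iters):
        actions = actions - lr * cost_grad(actions, init_state, model, goal)
    return actions
```

In practice one would warm-start `actions`, add running state costs along the trajectory, and re-fit the model as new data arrives.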
arXiv Detail & Related papers (2022-04-09T22:07:34Z)
- DiffSkill: Skill Abstraction from Differentiable Physics for Deformable Object Manipulations with Tools [96.38972082580294]
DiffSkill is a novel framework that uses a differentiable physics simulator for skill abstraction to solve deformable object manipulation tasks.
In particular, we first obtain short-horizon skills using individual tools from a gradient-based simulator.
We then learn a neural skill abstractor from the demonstration trajectories which takes RGBD images as input.
arXiv Detail & Related papers (2022-03-31T17:59:38Z)
- Efficient Differentiable Simulation of Articulated Bodies [89.64118042429287]
We present a method for efficient differentiable simulation of articulated bodies.
This enables integration of articulated body dynamics into deep learning frameworks.
We show that reinforcement learning with articulated systems can be accelerated using gradients provided by our method.
arXiv Detail & Related papers (2021-09-16T04:48:13Z)
- PlasticineLab: A Soft-Body Manipulation Benchmark with Differentiable Physics [89.81550748680245]
We introduce a new differentiable physics benchmark called PlasticineLab.
In each task, the agent uses manipulators to deform the plasticine into the desired configuration.
We evaluate several existing reinforcement learning (RL) methods and gradient-based methods on this benchmark.
arXiv Detail & Related papers (2021-04-07T17:59:23Z)
- Neural Dynamic Policies for End-to-End Sensorimotor Learning [51.24542903398335]
The current dominant paradigm in sensorimotor control, whether imitation or reinforcement learning, is to train policies directly in raw action spaces.
We propose Neural Dynamic Policies (NDPs) that make predictions in trajectory distribution space.
NDPs outperform the prior state-of-the-art in terms of either efficiency or performance across several robotic control tasks.
arXiv Detail & Related papers (2020-12-04T18:59:32Z)
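As a simplified, hypothetical rendering of the trajectory-space idea behind NDPs (the critically damped attractor below omits the forcing term of a full dynamic movement primitive): the network predicts the parameters of a dynamical system, here just its goal, and the executed trajectory is that system's rollout rather than raw per-step actions.

```python
import jax
import jax.numpy as jnp

def ndp_rollout(goal, y0, steps=50, dt=0.01, alpha=25.0):
    # Critically damped attractor dynamics: the policy head outputs
    # `goal` (and, in a full NDP, forcing-term weights), not raw
    # per-step actions.
    beta = alpha / 4.0
    def body(carry, _):
        y, yd = carry
        ydd = alpha * (beta * (goal - y) - yd)
        return (y + dt * yd, yd + dt * ydd), y
    _, traj = jax.lax.scan(body, (y0, jnp.zeros_like(y0)), None,
                           length=steps)
    return traj  # smooth trajectory to execute, shape (steps, action_dim)
```

Because the rollout is differentiable in `goal`, a task loss on the trajectory can train the upstream network end to end.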
This list is automatically generated from the titles and abstracts of the papers in this site.