Leveraging Reward Gradients For Reinforcement Learning in Differentiable
Physics Simulations
- URL: http://arxiv.org/abs/2203.02857v1
- Date: Sun, 6 Mar 2022 02:28:46 GMT
- Title: Leveraging Reward Gradients For Reinforcement Learning in Differentiable
Physics Simulations
- Authors: Sean Gillen and Katie Byl
- Abstract summary: In the context of reinforcement learning for control, differentiable rigid body physics simulators theoretically allow algorithms to be applied directly to analytic gradients of the reward function.
We present a novel algorithm, cross entropy analytic policy gradients, that is able to leverage these gradients to outperform state-of-the-art deep reinforcement learning on a set of challenging nonlinear control problems.
- Score: 11.4219428942199
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In recent years, fully differentiable rigid body physics simulators have been
developed, which can be used to simulate a wide range of robotic systems. In
the context of reinforcement learning for control, these simulators
theoretically allow algorithms to be applied directly to analytic gradients of
the reward function. However, to date, these gradients have proved extremely
challenging to use, and are outclassed by algorithms using no gradient
information at all. In this work we present a novel algorithm, cross entropy
analytic policy gradients, that is able to leverage these gradients to
outperform state-of-the-art deep reinforcement learning on a set of challenging
nonlinear control problems.
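The abstract does not spell out how cross entropy analytic policy gradients works, so the following is only a toy sketch of the general idea it names: pairing a cross-entropy-style population search with analytic reward gradients taken through a differentiable simulation. It is not the authors' algorithm. The one-dimensional dynamics, the constants, and the names `rollout` and `ce_with_gradients` are all assumptions made for illustration, and the gradient is derived by hand with forward-mode sensitivities rather than taken from a real simulator.

```python
import math
import random

# Toy constants, chosen only for illustration (not from the paper).
DT, HORIZON, X0 = 0.05, 40, 1.0


def rollout(theta):
    """Simulate x' = x + DT*(sin(x) - theta*x) under the policy u = -theta*x.

    Returns (reward, d_reward/d_theta), with reward = -sum of x_t^2.
    """
    x, s = X0, 0.0  # s tracks dx/dtheta (forward-mode sensitivity)
    reward, grad = 0.0, 0.0
    for _ in range(HORIZON):
        # Differentiate the update rule using the old state, then step the state.
        s = s + DT * (math.cos(x) * s - x - theta * s)
        x = x + DT * (math.sin(x) - theta * x)
        reward -= x * x
        grad -= 2.0 * x * s
    return reward, grad


def ce_with_gradients(mean=0.0, std=2.0, pop=16, n_elite=4, iters=10,
                      grad_steps=3, lr=0.05, max_step=0.5, seed=0):
    """Hypothetical hybrid: sample policy parameters, refine each one with a
    few clipped gradient-ascent steps, then refit the sampling distribution
    to the best candidates (the cross-entropy update)."""
    rng = random.Random(seed)
    for _ in range(iters):
        candidates = []
        for _ in range(pop):
            theta = rng.gauss(mean, std)
            for _ in range(grad_steps):
                _, g = rollout(theta)
                # Clip the step so large gradients cannot destabilize the search.
                theta += max(-max_step, min(max_step, lr * g))
            candidates.append((rollout(theta)[0], theta))
        candidates.sort(reverse=True)  # highest reward first
        elites = [t for _, t in candidates[:n_elite]]
        mean = sum(elites) / n_elite
        var = sum((t - mean) ** 2 for t in elites) / n_elite
        std = max(1e-3, math.sqrt(var))  # keep a floor on exploration noise
    return mean


if __name__ == "__main__":
    best = ce_with_gradients()
    print("reward at initial mean:", rollout(0.0)[0])
    print("reward at final mean:  ", rollout(best)[0])
```

In this sketch the population sampling supplies exploration while the gradient steps refine each candidate locally. Step clipping is one simple guard against the poorly scaled gradients that the abstract reports as a practical obstacle.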
Related papers
- Gradient-free online learning of subgrid-scale dynamics with neural emulators [5.283819482083864]
We propose a generic algorithm to train machine learning-based subgrid parametrizations online.
We are able to train a parametrization that recovers most of the benefits of online strategies without having to compute the gradient of the original solver.
arXiv Detail & Related papers (2023-10-30T09:46:35Z)
- Improving Gradient Computation for Differentiable Physics Simulation with Contacts [10.450509067356148]
We study differentiable rigid-body simulation with contacts.
We propose to improve gradient computation by continuous collision detection and leverage the time-of-impact (TOI)
We show that with TOI-Ve, we are able to learn an optimal control sequence that matches the analytical solution.
arXiv Detail & Related papers (2023-04-28T21:10:16Z)
- Accelerated Policy Learning with Parallel Differentiable Simulation [59.665651562534755]
We present a differentiable simulator and a new policy learning algorithm (SHAC)
Our algorithm alleviates problems with local minima through a smooth critic function.
We show substantial improvements in sample efficiency and wall-clock time over state-of-the-art RL and differentiable simulation-based algorithms.
arXiv Detail & Related papers (2022-04-14T17:46:26Z)
- Gradient-Based Trajectory Optimization With Learned Dynamics [80.41791191022139]
We use machine learning techniques to learn a differentiable dynamics model of the system from data.
We show that a neural network can model highly nonlinear behaviors accurately for large time horizons.
In our hardware experiments, we demonstrate that our learned model can represent complex dynamics for both the Spot and Radio-controlled (RC) car.
arXiv Detail & Related papers (2022-04-09T22:07:34Z)
- Gradients are Not All You Need [28.29420710601308]
We discuss a common chaos-based failure mode which appears in a variety of differentiable circumstances.
We trace this failure to the spectrum of the Jacobian of the system under study, and provide criteria for when a practitioner might expect this failure to spoil their differentiation-based optimization algorithms.
arXiv Detail & Related papers (2021-11-10T16:51:04Z)
- Physical Gradients for Deep Learning [101.36788327318669]
We find that state-of-the-art training techniques are not well-suited to many problems that involve physical processes.
We propose a novel hybrid training approach that combines higher-order optimization methods with machine learning techniques.
arXiv Detail & Related papers (2021-09-30T12:14:31Z)
- Efficient Differentiable Simulation of Articulated Bodies [89.64118042429287]
We present a method for efficient differentiable simulation of articulated bodies.
This enables integration of articulated body dynamics into deep learning frameworks.
We show that reinforcement learning with articulated systems can be accelerated using gradients provided by our method.
arXiv Detail & Related papers (2021-09-16T04:48:13Z)
- SUPER-ADAM: Faster and Universal Framework of Adaptive Gradients [99.13839450032408]
It is desirable to design a universal framework for adaptive algorithms to solve general problems.
In particular, our novel framework provides convergence analysis support for adaptive methods under the nonconvex setting.
arXiv Detail & Related papers (2021-06-15T15:16:28Z)
- PlasticineLab: A Soft-Body Manipulation Benchmark with Differentiable Physics [89.81550748680245]
We introduce a new differentiable physics benchmark called PlasticineLab.
In each task, the agent uses manipulators to deform the plasticine into the desired configuration.
We evaluate several existing reinforcement learning (RL) methods and gradient-based methods on this benchmark.
arXiv Detail & Related papers (2021-04-07T17:59:23Z)
- Learning Unstable Dynamical Systems with Time-Weighted Logarithmic Loss [20.167719985846002]
We look into the dynamics of the gradient descent algorithm and pinpoint what causes the difficulty of learning unstable systems.
We introduce a time-weighted logarithmic loss function to fix this imbalance and demonstrate its effectiveness in learning unstable systems.
arXiv Detail & Related papers (2020-07-10T06:28:05Z)
This list is automatically generated from the titles and abstracts of the papers in this site.