Adaptive Tracking of a Single-Rigid-Body Character in Various
Environments
- URL: http://arxiv.org/abs/2308.07491v3
- Date: Sun, 28 Jan 2024 14:07:01 GMT
- Authors: Taesoo Kwon, Taehong Gu, Jaewon Ahn, Yoonsang Lee
- Abstract summary: We propose a deep reinforcement learning method based on the simulation of a single-rigid-body character.
Using the centroidal dynamics model (CDM) to express the full-body character as a single rigid body (SRB) and training a policy to track a reference motion, we can obtain a policy capable of adapting to various unobserved environmental changes.
We demonstrate that our policy, trained within 30 minutes on an ultraportable laptop, can cope with environments that were not experienced during learning.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Since the introduction of DeepMimic [Peng et al. 2018], subsequent research
has focused on expanding the repertoire of simulated motions across various
scenarios. In this study, we propose an alternative approach for this goal, a
deep reinforcement learning method based on the simulation of a
single-rigid-body character. Using the centroidal dynamics model (CDM) to
express the full-body character as a single rigid body (SRB) and training a
policy to track a reference motion, we can obtain a policy that is capable of
adapting to various unobserved environmental changes and controller transitions
without requiring any additional learning. Due to the reduced dimension of
state and action space, the learning process is sample-efficient. The final
full-body motion is kinematically generated in a physically plausible way,
based on the state of the simulated SRB character. The SRB simulation is
formulated as a quadratic programming (QP) problem, and the policy outputs an
action that allows the SRB character to follow the reference motion. We
demonstrate that our policy, efficiently trained within 30 minutes on an
ultraportable laptop, has the ability to cope with environments that have not
been experienced during learning, such as running on uneven terrain or pushing
a box, and transitions between learned policies, without any additional
learning.
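The abstract describes tracking a reference motion with a single-rigid-body (centroidal) model: the controller computes a desired center-of-mass acceleration from the reference and resolves it into a contact force via the SRB dynamics. As a minimal sketch of that idea, the snippet below implements a translation-only PD tracking step with a single aggregate contact force. All names, gains, and the closed-form (unconstrained) solve are illustrative assumptions; the paper's actual formulation is a QP over the full SRB state, with constraints such as friction cones, and the policy outputs the tracking action.

```python
import numpy as np

def srb_tracking_force(mass, com_pos, com_vel, ref_pos, ref_vel,
                       kp=100.0, kd=20.0,
                       gravity=np.array([0.0, -9.81, 0.0])):
    """Hypothetical, simplified SRB tracking step (translation only).

    1. PD feedback on the reference motion gives a desired CoM acceleration.
    2. Newton's law for the single rigid body, m*a = f + m*g, is solved
       in closed form for the aggregate contact force f.

    The real method solves a constrained QP instead of this closed form.
    """
    # Desired CoM acceleration from PD tracking of the reference
    acc_des = kp * (ref_pos - com_pos) + kd * (ref_vel - com_vel)
    # m * a_des = f + m * g  =>  f = m * (a_des - g)
    return mass * (acc_des - gravity)

# Example: character already on the reference, so the force only
# needs to cancel gravity.
f = srb_tracking_force(60.0,
                       com_pos=np.array([0.0, 1.0, 0.0]),
                       com_vel=np.zeros(3),
                       ref_pos=np.array([0.0, 1.0, 0.0]),
                       ref_vel=np.zeros(3))
```

With equality constraints and inequality constraints (e.g. friction cones) added, the same objective becomes the QP that the paper's SRB simulation solves at each step, while the learned policy supplies the tracking targets.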