Learning Task-Agnostic Action Spaces for Movement Optimization
- URL: http://arxiv.org/abs/2009.10337v2
- Date: Fri, 23 Jul 2021 13:48:13 GMT
- Title: Learning Task-Agnostic Action Spaces for Movement Optimization
- Authors: Amin Babadi, Michiel van de Panne, C. Karen Liu, Perttu Hämäläinen
- Abstract summary: We propose a novel method for exploring the dynamics of physically based animated characters.
We parameterize actions as target states, and learn a short-horizon goal-conditioned low-level control policy that drives the agent's state towards the targets.
- Score: 18.37812596641983
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We propose a novel method for exploring the dynamics of physically based
animated characters, and learning a task-agnostic action space that makes
movement optimization easier. Like several previous papers, we parameterize
actions as target states, and learn a short-horizon goal-conditioned low-level
control policy that drives the agent's state towards the targets. Our novel
contribution is that with our exploration data, we are able to learn the
low-level policy in a generic manner and without any reference movement data.
Trained once for each agent or simulation environment, the policy improves the
efficiency of optimizing both trajectories and high-level policies across
multiple tasks and optimization algorithms. We also contribute novel
visualizations that show how using target states as actions makes optimized
trajectories more robust to disturbances; this manifests as wider optima that
are easy to find. Due to its simplicity and generality, our proposed approach
should provide a building block that can improve a large variety of movement
optimization methods and applications.
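To make the core idea concrete, here is a minimal, hypothetical sketch (not the authors' code) of using target states as actions: a high-level action specifies a desired state, and a learned goal-conditioned low-level policy produces the raw controls over a short horizon. The environment interface (`get_state`, `step`) and the `low_level_policy` callable are assumptions for illustration.

```python
# Hypothetical sketch of a target-state action space (not the paper's code).
# A high-level "action" is a desired state; a learned goal-conditioned
# low-level policy pi_low(state, target) emits the raw controls that drive
# the simulated character toward that target over a short horizon.
class TargetStateActionSpace:
    def __init__(self, env, low_level_policy, horizon=10):
        self.env = env                  # physics simulation (assumed interface)
        self.pi_low = low_level_policy  # (state, target_state) -> control signal
        self.horizon = horizon          # low-level steps per high-level action

    def step(self, target_state):
        """Execute one high-level action by rolling out the low-level policy."""
        total_reward, done, info = 0.0, False, {}
        for _ in range(self.horizon):
            state = self.env.get_state()
            control = self.pi_low(state, target_state)   # steer toward the target
            _, reward, done, info = self.env.step(control)
            total_reward += reward
            if done:
                break
        return self.env.get_state(), total_reward, done, info
```

A trajectory or high-level policy optimizer would then search over target states instead of raw controls, which is what the abstract argues yields wider, easier-to-find optima.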
Related papers
- MotionRL: Align Text-to-Motion Generation to Human Preferences with Multi-Reward Reinforcement Learning [99.09906827676748]
We introduce MotionRL, the first approach to utilize Multi-Reward Reinforcement Learning (RL) for optimizing text-to-motion generation tasks.
Our novel approach uses reinforcement learning to fine-tune the motion generator based on human preferences, leveraging prior knowledge from a human perception model.
In addition, MotionRL introduces a novel multi-objective optimization strategy to approximate optimality between text adherence, motion quality, and human preferences.
arXiv Detail & Related papers (2024-10-09T03:27:14Z)
- Extremum-Seeking Action Selection for Accelerating Policy Optimization [18.162794442835413]
Reinforcement learning for control over continuous action spaces typically uses high-entropy stochastic policies, such as Gaussian distributions, for local exploration and for estimating policy gradients to optimize performance.
We propose to improve action selection in this model-free RL setting by introducing additional adaptive control steps based on Extremum-Seeking Control (ESC).
Our methods can be easily added in standard policy optimization to improve learning efficiency, which we demonstrate in various control learning environments.
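As a loose, generic illustration of the extremum-seeking idea (not that paper's algorithm), the sketch below perturbs a nominal scalar action with a sinusoidal dither and nudges it in the direction correlated with higher reward; the `evaluate_reward` objective and all constants are assumptions.

```python
# Rough extremum-seeking sketch (illustrative only): dither an action,
# correlate the reward with the dither to estimate a local gradient,
# and ascend that estimate.
import math

def extremum_seeking(evaluate_reward, a0=0.0, steps=200,
                     amp=0.1, omega=2.0, gain=0.5, dt=0.05):
    a_hat = a0                            # current nominal action
    reward_avg = evaluate_reward(a0)      # slow running average (high-pass reference)
    for k in range(steps):
        dither = amp * math.sin(omega * k * dt)
        reward = evaluate_reward(a_hat + dither)            # probe the perturbed action
        reward_hp = reward - reward_avg                     # remove the slow component
        a_hat += gain * dt * reward_hp * math.sin(omega * k * dt)  # demodulate and ascend
        reward_avg += 0.1 * (reward - reward_avg)           # update the running average
    return a_hat

# Example: maximize a simple concave reward with its optimum at a = 1.0
best_action = extremum_seeking(lambda a: -(a - 1.0) ** 2, a0=-0.5)
```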
arXiv Detail & Related papers (2024-04-02T02:39:17Z)
- Unleashing the Potential of Large Language Models as Prompt Optimizers: An Analogical Analysis with Gradient-based Model Optimizers [108.72225067368592]
We propose a novel perspective to investigate the design of large language models (LLMs)-based prompts.
We identify two pivotal factors in model parameter learning: update direction and update method.
In particular, we borrow the theoretical framework and learning methods from gradient-based optimization to design improved strategies.
arXiv Detail & Related papers (2024-02-27T15:05:32Z)
- Reparameterized Policy Learning for Multimodal Trajectory Optimization [61.13228961771765]
We investigate the challenge of parametrizing policies for reinforcement learning in high-dimensional continuous action spaces.
We propose a principled framework that models the continuous RL policy as a generative model of optimal trajectories.
We present a practical model-based RL method, which leverages the multimodal policy parameterization and learned world model.
arXiv Detail & Related papers (2023-07-20T09:05:46Z)
- Acceleration in Policy Optimization [50.323182853069184]
We work towards a unifying paradigm for accelerating policy optimization methods in reinforcement learning (RL) by integrating foresight in the policy improvement step via optimistic and adaptive updates.
We define optimism as predictive modelling of the future behavior of a policy, and adaptivity as taking immediate and anticipatory corrective actions to mitigate errors from overshooting predictions or delayed responses to change.
We design an optimistic policy gradient algorithm, adaptive via meta-gradient learning, and empirically highlight several design choices pertaining to acceleration, in an illustrative task.
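A generic extra-gradient-style sketch can illustrate what an optimistic update looks like; this is not that paper's algorithm, and all names and constants below are illustrative: the gradient is evaluated at a lookahead point predicted from the previous gradient before committing to a step.

```python
# Generic optimistic (extra-gradient style) update, illustrative only:
# anticipate where the parameters are heading using the last gradient,
# then apply the corrective gradient computed at that lookahead point.
import numpy as np

def optimistic_gradient_ascent(grad_fn, theta, lr=0.05, steps=100):
    last_grad = np.zeros_like(theta)
    for _ in range(steps):
        lookahead = theta + lr * last_grad   # optimistic prediction of the next iterate
        grad = grad_fn(lookahead)            # corrective gradient at the predicted point
        theta = theta + lr * grad
        last_grad = grad
    return theta

# Example: maximize -||theta - 1||^2, whose optimum is the all-ones vector
theta_star = optimistic_gradient_ascent(lambda t: -2.0 * (t - 1.0), np.zeros(3))
```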
arXiv Detail & Related papers (2023-06-18T15:50:57Z)
- Hierarchical Policy Blending as Inference for Reactive Robot Control [21.058662668187875]
Motion generation in cluttered, dense, and dynamic environments is a central topic in robotics.
We propose a hierarchical motion generation method that combines the benefits of reactive policies and planning.
Our experimental study in planar navigation and 6DoF manipulation shows that our proposed hierarchical motion generation method outperforms both myopic reactive controllers and online re-planning methods.
arXiv Detail & Related papers (2022-10-14T15:16:54Z)
- Exploration via Planning for Information about the Optimal Trajectory [67.33886176127578]
We develop a method that allows us to plan for exploration while taking the task and the current knowledge into account.
We demonstrate that our method learns strong policies with 2x fewer samples than strong exploration baselines.
arXiv Detail & Related papers (2022-10-06T20:28:55Z)
- Optimizing Indoor Navigation Policies For Spatial Distancing [8.635212273689273]
In this paper, we focus on the modification of policies that can lead to movement patterns and directional guidance of occupants.
We show that within our framework, the simulation-optimization process can help to improve spatial distancing between agents.
arXiv Detail & Related papers (2022-06-04T21:57:22Z)
- Learning to Explore by Reinforcement over High-Level Options [0.0]
We propose a new method which grants an agent two intertwined behavioral options: "look-around" and "frontier navigation".
In each timestep, an agent produces an option and a corresponding action according to the policy.
We demonstrate the effectiveness of the proposed method on two publicly available 3D environment datasets.
arXiv Detail & Related papers (2021-11-02T04:21:34Z)
- Localized active learning of Gaussian process state space models [63.97366815968177]
A globally accurate model is not required to achieve good performance in many common control applications.
We propose an active learning strategy for Gaussian process state space models that aims to obtain an accurate model on a bounded subset of the state-action space.
By employing model predictive control, the proposed technique integrates information collected during exploration and adaptively improves its exploration strategy.
arXiv Detail & Related papers (2020-05-04T05:35:02Z)
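As a loose, generic illustration of the localized active-learning idea in the last entry above (not that paper's algorithm), the sketch below fits a Gaussian process to observed transitions and picks the candidate action with the highest predictive uncertainty; the data layout, the one-dimensional target, and all hyperparameters are assumptions.

```python
# Generic sketch of uncertainty-driven exploration with a GP transition model
# (illustrative only): fit a GP on (state, action) -> next-state data and
# choose the candidate action with the largest predictive standard deviation.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor

def most_informative_action(X, y, state, candidate_actions):
    """X: past (state, action) feature rows; y: one observed next-state coordinate."""
    gp = GaussianProcessRegressor(normalize_y=True).fit(X, y)
    queries = np.array([np.concatenate([state, a]) for a in candidate_actions])
    _, std = gp.predict(queries, return_std=True)   # predictive uncertainty per candidate
    return candidate_actions[int(np.argmax(std))]   # explore where the model is least sure
```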