SAR: Generalization of Physiological Agility and Dexterity via
Synergistic Action Representation
- URL: http://arxiv.org/abs/2307.03716v2
- Date: Fri, 14 Jul 2023 05:35:14 GMT
- Title: SAR: Generalization of Physiological Agility and Dexterity via
Synergistic Action Representation
- Authors: Cameron Berg, Vittorio Caggiano, Vikash Kumar
- Abstract summary: We show that modular control via muscle synergies enables organisms to learn muscle control in a simplified and generalizable action space.
We use physiologically accurate human hand and leg models as a testbed for determining the extent to which a Synergistic Action Representation (SAR) acquired from simpler tasks facilitates learning more complex tasks.
We find in both cases that SAR-exploiting policies significantly outperform end-to-end reinforcement learning.
- Score: 10.349135207285464
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Learning effective continuous control policies in high-dimensional systems,
including musculoskeletal agents, remains a significant challenge. Over the
course of biological evolution, organisms have developed robust mechanisms for
overcoming this complexity to learn highly sophisticated strategies for motor
control. What accounts for this robust behavioral flexibility? Modular control
via muscle synergies, i.e. coordinated muscle co-contractions, is considered to
be one putative mechanism that enables organisms to learn muscle control in a
simplified and generalizable action space. Drawing inspiration from this
evolved motor control strategy, we use physiologically accurate human hand and
leg models as a testbed for determining the extent to which a Synergistic
Action Representation (SAR) acquired from simpler tasks facilitates learning
more complex tasks. We find in both cases that SAR-exploiting policies
significantly outperform end-to-end reinforcement learning. Policies trained
with SAR were able to achieve robust locomotion on a wide set of terrains with
high sample efficiency, while baseline approaches failed to learn meaningful
behaviors. Additionally, policies trained with SAR on a multiobject
manipulation task significantly outperformed (>70% success) baseline approaches
(<20% success). Both of these SAR-exploiting policies were also found to
generalize zero-shot to out-of-domain environmental conditions, while policies
that did not adopt SAR failed to generalize. Finally, we establish the
generality of SAR on broader high-dimensional control problems using a robotic
manipulation task set and a full-body humanoid locomotion task. To the best of
our knowledge, this investigation is the first of its kind to present an
end-to-end pipeline for discovering synergies and using this representation to
learn high-dimensional continuous control across a wide diversity of tasks.
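To make the synergy idea concrete, the sketch below shows one common way to obtain a low-dimensional synergistic action space: fit PCA to muscle activations recorded while solving a simpler task, then let the agent act in synergy space and map its actions back to all muscles. This is a hedged illustration; the function and class names are hypothetical, and the paper's actual pipeline (including its choice of dimensionality-reduction method) may differ.

    # Hypothetical sketch of a synergistic action representation (SAR).
    # All names here are illustrative, not taken from the paper's code.
    import numpy as np
    from sklearn.decomposition import PCA

    def extract_synergies(activations: np.ndarray, n_synergies: int) -> PCA:
        """Fit synergies to an (n_timesteps, n_muscles) activation matrix."""
        return PCA(n_components=n_synergies).fit(activations)

    class SynergyActionSpace:
        """Map a low-dimensional synergy action back to full muscle space."""
        def __init__(self, pca: PCA):
            self.pca = pca

        def to_muscles(self, synergy_action: np.ndarray) -> np.ndarray:
            # Reconstruct full muscle activations; clip to the valid [0, 1] range.
            muscles = self.pca.inverse_transform(synergy_action[None, :])[0]
            return np.clip(muscles, 0.0, 1.0)

    # Usage: activations would come from a policy trained on a simpler task;
    # random data stands in here. The RL agent then acts in 8 dimensions
    # instead of commanding every muscle directly.
    acts = np.random.rand(10_000, 39)  # placeholder (n_timesteps, n_muscles)
    sar = SynergyActionSpace(extract_synergies(acts, n_synergies=8))
    muscle_cmd = sar.to_muscles(np.zeros(8))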
Related papers
- LGR2: Language Guided Reward Relabeling for Accelerating Hierarchical Reinforcement Learning [22.99690700210957]
We propose a novel HRL framework that leverages language instructions to generate a stationary reward function for a higher-level policy.
Since the language-guided reward is unaffected by the lower primitive behaviour, LGR2 mitigates non-stationarity.
Our approach attains success rates exceeding 70% in challenging, sparse-reward robotic navigation and manipulation environments.
arXiv Detail & Related papers (2024-06-09T18:40:24Z)
- Twisting Lids Off with Two Hands [82.21668778600414]
We show how policies trained in simulation can be effectively and efficiently transferred to the real world.
Specifically, we consider the problem of twisting lids of various bottle-like objects with two hands.
This is the first sim-to-real RL system that enables such capabilities on bimanual multi-fingered hands.
arXiv Detail & Related papers (2024-03-04T18:59:30Z)
- RObotic MAnipulation Network (ROMAN) – Hybrid Hierarchical Learning for Solving Complex Sequential Tasks [70.69063219750952]
We present a Hybrid Hierarchical Learning framework, the Robotic Manipulation Network (ROMAN)
ROMAN achieves task versatility and robust failure recovery by integrating behavioural cloning, imitation learning, and reinforcement learning.
Experimental results show that by orchestrating and activating these specialised manipulation experts, ROMAN generates correct sequential activations for accomplishing long sequences of sophisticated manipulation tasks.
arXiv Detail & Related papers (2023-06-30T20:35:22Z)
- Latent Exploration for Reinforcement Learning [87.42776741119653]
In Reinforcement Learning, agents learn policies by exploring and interacting with the environment.
We propose LATent TIme-Correlated Exploration (Lattice), a method to inject temporally-correlated noise into the latent state of the policy network (a minimal sketch of this style of exploration appears after this list).
arXiv Detail & Related papers (2023-05-31T17:40:43Z)
- Efficient Learning of High Level Plans from Play [57.29562823883257]
We present Efficient Learning of High-Level Plans from Play (ELF-P), a framework for robotic learning that bridges motion planning and deep RL.
We demonstrate that ELF-P has significantly better sample efficiency than relevant baselines over multiple realistic manipulation tasks.
arXiv Detail & Related papers (2023-03-16T20:09:47Z)
- Low-Rank Modular Reinforcement Learning via Muscle Synergy [25.120547719120765]
Modular Reinforcement Learning (RL) decentralizes the control of multi-joint robots by learning policies for each actuator.
We propose a Synergy-Oriented LeARning (SOLAR) framework that exploits the redundant nature of DoF in robot control.
arXiv Detail & Related papers (2022-10-26T16:01:31Z)
- DMAP: a Distributed Morphological Attention Policy for Learning to Locomote with a Changing Body [126.52031472297413]
We introduce DMAP, a biologically-inspired, attention-based policy network architecture.
We show that a control policy based on the proprioceptive state performs poorly with highly variable body configurations.
DMAP can be trained end-to-end in all the considered environments, overall matching or surpassing the performance of an oracle agent.
arXiv Detail & Related papers (2022-09-28T16:45:35Z)
- DEP-RL: Embodied Exploration for Reinforcement Learning in Overactuated and Musculoskeletal Systems [14.295720603503806]
Reinforcement learning on large musculoskeletal models, however, has not matched the movement performance that muscle-actuated organisms achieve.
We conjecture that ineffective exploration in large overactuated action spaces is a key problem.
By integrating DEP into RL, we achieve fast learning of reaching and locomotion in musculoskeletal systems.
arXiv Detail & Related papers (2022-05-30T15:52:54Z)
- Weakly Supervised Disentangled Representation for Goal-conditioned Reinforcement Learning [15.698612710580447]
We propose a skill learning framework DR-GRL that aims to improve the sample efficiency and policy generalization.
In a weakly supervised manner, we propose a Spatial Transform AutoEncoder (STAE) to learn an interpretable and controllable representation.
We empirically demonstrate that DR-GRL significantly outperforms the previous methods in sample efficiency and policy generalization.
arXiv Detail & Related papers (2022-02-28T09:05:14Z)
- Persistent Reinforcement Learning via Subgoal Curricula [114.83989499740193]
Value-accelerated Persistent Reinforcement Learning (VaPRL) generates a curriculum of initial states.
VaPRL reduces the interventions required by three orders of magnitude compared to episodic reinforcement learning.
arXiv Detail & Related papers (2021-07-27T16:39:45Z)
- On the Emergence of Whole-body Strategies from Humanoid Robot Push-recovery Learning [32.070068456106895]
We apply model-free Deep Reinforcement Learning for training a general and robust humanoid push-recovery policy in a simulation environment.
Our method targets high-dimensional whole-body humanoid control and is validated on the iCub humanoid.
arXiv Detail & Related papers (2021-04-29T17:49:20Z)
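As referenced in the Lattice entry above, the sketch below illustrates temporally correlated exploration noise injected into a policy's latent state, here via a simple Ornstein-Uhlenbeck process. This is an assumption-laden simplification: Lattice itself derives the noise correlation from the policy network's weights, and all names below are hypothetical.

    # Minimal sketch of temporally correlated latent-space exploration.
    # Lattice's actual mechanism is more involved; this OU process only
    # captures the idea that latent noise is smooth across timesteps and
    # moves all actuators coherently through the policy head.
    import numpy as np

    class LatentOUNoise:
        """Ornstein-Uhlenbeck noise: successive samples are correlated in time."""
        def __init__(self, dim: int, theta: float = 0.15, sigma: float = 0.2):
            self.theta, self.sigma = theta, sigma
            self.state = np.zeros(dim)

        def sample(self) -> np.ndarray:
            # Mean-reverting update keeps the noise trajectory smooth.
            self.state += -self.theta * self.state + self.sigma * np.random.randn(len(self.state))
            return self.state

    def act(latent: np.ndarray, noise: LatentOUNoise, head: np.ndarray) -> np.ndarray:
        # Perturb the latent state, then map through a (linear) policy head,
        # so one latent perturbation produces coordinated action noise.
        return np.tanh((latent + noise.sample()) @ head)

    noise = LatentOUNoise(dim=64)
    head = np.random.randn(64, 39) * 0.1  # latent -> actuator mapping (placeholder)
    action = act(np.zeros(64), noise, head)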