Learning Modular Robot Locomotion from Demonstrations
- URL: http://arxiv.org/abs/2210.17491v1
- Date: Mon, 31 Oct 2022 17:15:32 GMT
- Title: Learning Modular Robot Locomotion from Demonstrations
- Authors: Julian Whitman and Howie Choset
- Abstract summary: This work presents a method that uses demonstrations from one set of designs to accelerate policy learning for additional designs.
In this paper we develop a combined reinforcement and imitation learning algorithm.
We show that when the modular policy is optimized with this combined objective, demonstrations from one set of designs influence how the policy behaves on a different design.
- Score: 20.03751606751798
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Modular robots can be reconfigured to create a variety of designs from a
small set of components. But constructing a robot's hardware on its own is not
enough -- each robot needs a controller. One could create controllers for some
designs individually, but developing policies for additional designs can be
time consuming. This work presents a method that uses demonstrations from one
set of designs to accelerate policy learning for additional designs. We
leverage a learning framework in which a graph neural network is composed of
modular components; each component corresponds to a type of module (e.g., a
leg, wheel, or body), and these components can be recombined to learn from
multiple designs at once. In this paper we develop a combined reinforcement and
imitation learning algorithm. Our method is novel because the policy is
optimized to both maximize a reward for one design, and simultaneously imitate
demonstrations from different designs, within one objective function. We show
that when the modular policy is optimized with this combined objective,
demonstrations from one set of designs influence how the policy behaves on a
different design, decreasing the number of training iterations needed.
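The key idea is that a single objective mixes a reward-maximization term for the design being trained with an imitation term on demonstrations from other designs. As a rough illustration only (the paper's exact surrogate terms and weighting are not specified here), one way such a combined objective could be sketched is a policy-gradient term plus a weighted behavior-cloning term, where `beta` is an assumed trade-off coefficient:

```python
import numpy as np

def combined_loss(logp_actions, advantages, policy_out_demo, demo_actions, beta=0.5):
    """Hypothetical combined RL + imitation objective (illustrative sketch).

    logp_actions   : log-probabilities of actions taken on the target design
    advantages     : advantage estimates for those actions (RL term)
    policy_out_demo: policy outputs on demonstration states from other designs
    demo_actions   : demonstrated actions to imitate (imitation term)
    beta           : assumed weight balancing imitation against reward
    """
    # RL term: negative policy-gradient surrogate (maximize expected reward)
    rl_term = -np.mean(logp_actions * advantages)
    # Imitation term: behavior-cloning mean-squared error on demonstrations
    il_term = np.mean((policy_out_demo - demo_actions) ** 2)
    # Single combined objective, optimized jointly
    return rl_term + beta * il_term

# Example: both terms contribute to one scalar loss
loss = combined_loss(
    logp_actions=np.array([0.0, -1.0]),
    advantages=np.array([1.0, 2.0]),
    policy_out_demo=np.array([1.0]),
    demo_actions=np.array([0.0]),
)
```

Because the modular policy shares components across designs, gradients from the imitation term shape how those shared components behave when reused on the new design.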
Related papers
- One Policy to Run Them All: an End-to-end Learning Approach to Multi-Embodiment Locomotion [18.556470359899855]
We introduce URMA, the Unified Robot Morphology Architecture.
Our framework brings the end-to-end Multi-Task Reinforcement Learning approach to the realm of legged robots.
We show that URMA can learn a locomotion policy on multiple embodiments that can be easily transferred to unseen robot platforms.
arXiv Detail & Related papers (2024-09-10T09:44:15Z) - MeMo: Meaningful, Modular Controllers via Noise Injection [25.541496793132183]
We show that when a new robot is built from the same parts, its control can be quickly learned by reusing the modular controllers.
We achieve this with a framework called MeMo which learns (Me)aningful, (Mo)dular controllers.
We benchmark our framework in locomotion and grasping environments on simple to complex robot morphology transfer.
arXiv Detail & Related papers (2024-05-24T18:39:20Z) - Compositional Generative Inverse Design [69.22782875567547]
Inverse design, where we seek to design input variables in order to optimize an underlying objective function, is an important problem.
We show that by instead optimizing over the learned energy function captured by the diffusion model, we can avoid such adversarial examples.
In an N-body interaction task and a challenging 2D multi-airfoil design task, we demonstrate that by composing the learned diffusion model at test time, our method allows us to design initial states and boundary shapes.
arXiv Detail & Related papers (2024-01-24T01:33:39Z) - Learning to Design and Use Tools for Robotic Manipulation [21.18538869008642]
Recent techniques for jointly optimizing morphology and control via deep learning are effective at designing locomotion agents.
We propose learning a designer policy, rather than a single design.
We show that this framework is more sample efficient than prior methods in multi-goal or multi-variant settings.
arXiv Detail & Related papers (2023-11-01T18:00:10Z) - Polybot: Training One Policy Across Robots While Embracing Variability [70.74462430582163]
We propose a set of key design decisions to train a single policy for deployment on multiple robotic platforms.
Our framework first aligns the observation and action spaces of our policy across embodiments by using wrist cameras.
We evaluate our method on a dataset collected over 60 hours spanning 6 tasks and 3 robots with varying joint configurations and sizes.
arXiv Detail & Related papers (2023-07-07T17:21:16Z) - Universal Morphology Control via Contextual Modulation [52.742056836818136]
Learning a universal policy across different robot morphologies can significantly improve learning efficiency and generalization in continuous control.
Existing methods utilize graph neural networks or transformers to handle heterogeneous state and action spaces across different morphologies.
We propose a hierarchical architecture to better model this dependency via contextual modulation.
arXiv Detail & Related papers (2023-02-22T00:04:12Z) - Meta Reinforcement Learning for Optimal Design of Legged Robots [9.054187238463212]
We present a design optimization framework using model-free meta reinforcement learning.
We show that our approach allows higher performance while not being constrained by predefined motions or gait patterns.
arXiv Detail & Related papers (2022-10-06T08:37:52Z) - Policy Architectures for Compositional Generalization in Control [71.61675703776628]
We introduce a framework for modeling entity-based compositional structure in tasks.
Our policies are flexible and can be trained end-to-end without requiring any action primitives.
arXiv Detail & Related papers (2022-03-10T06:44:24Z) - V-MAO: Generative Modeling for Multi-Arm Manipulation of Articulated Objects [51.79035249464852]
We present a framework for learning multi-arm manipulation of articulated objects.
Our framework includes a variational generative model that learns contact point distribution over object rigid parts for each robot arm.
arXiv Detail & Related papers (2021-11-07T02:31:09Z) - Learning Multi-Objective Curricula for Deep Reinforcement Learning [55.27879754113767]
Various automatic curriculum learning (ACL) methods have been proposed to improve the sample efficiency and final performance of deep reinforcement learning (DRL).
In this paper, we propose a unified automatic curriculum learning framework to create multi-objective but coherent curricula.
In addition to existing hand-designed curricula paradigms, we further design a flexible memory mechanism to learn an abstract curriculum.
arXiv Detail & Related papers (2021-10-06T19:30:25Z) - Learning Modular Robot Control Policies [10.503109190599828]
We construct a modular control policy that handles a broad class of designs.
As the modules are physically re-configured, the policy automatically re-configures to match the kinematic structure.
We show that the policy can then generalize to a larger set of designs not seen during training.
arXiv Detail & Related papers (2021-05-20T21:54:37Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.