Learning Modular Robot Control Policies
- URL: http://arxiv.org/abs/2105.10049v1
- Date: Thu, 20 May 2021 21:54:37 GMT
- Title: Learning Modular Robot Control Policies
- Authors: Julian Whitman, Matthew Travers, and Howie Choset
- Abstract summary: We construct a modular control policy that handles a broad class of designs.
As the modules are physically re-configured, the policy automatically re-configures to match the kinematic structure.
We show that the policy can then generalize to a larger set of designs not seen during training.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: To make a modular robotic system both capable and scalable, the controller
must be as modular as the mechanism. Given the large number of designs
that can be generated from even a small set of modules, it becomes impractical
to create a new system-wide controller for each design. Instead, we construct a
modular control policy that handles a broad class of designs. We take the view
that a module is both form and function, i.e. both mechanism and controller. As
the modules are physically re-configured, the policy automatically
re-configures to match the kinematic structure. This novel policy is trained
with a new model-based reinforcement learning algorithm, which interleaves
model learning and trajectory optimization to guide policy learning for
multiple designs simultaneously. Training the policy on a varied set of designs
teaches it how to adapt its behavior to the design. We show that the policy can
then generalize to a larger set of designs not seen during training. We
demonstrate one policy controlling many designs with different combinations of
legs and wheels to locomote both in simulation and on real robots.
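The core idea of the abstract can be illustrated with a minimal sketch. This is not the paper's actual architecture; the module set, dimensions, and linear controllers below are illustrative assumptions. The point it demonstrates is that parameters are tied to module *types*, so one shared parameter set controls any design assembled from those modules.

```python
import numpy as np

# Hypothetical sketch (not the paper's exact method): one small controller
# per module TYPE, shared across all designs and all instances of that type.
# A design is a list of module types; the policy for that design is
# assembled by routing each module's local observation through its
# type's controller.

rng = np.random.default_rng(0)

MODULE_TYPES = ["body", "leg", "wheel"]   # assumed module set
OBS_DIM, ACT_DIM = 4, 2                   # assumed dimensions

# One weight matrix per module type, shared by every instance of that type.
params = {t: rng.standard_normal((ACT_DIM, OBS_DIM)) * 0.1 for t in MODULE_TYPES}

def modular_policy(design, observations):
    """Map per-module observations to per-module actions.

    design       -- list of module-type names, e.g. ["body", "leg", "leg"]
    observations -- per-module observation vectors (len == len(design))
    """
    return [params[t] @ obs for t, obs in zip(design, observations)]

# The same parameter set controls two structurally different designs:
quad = ["body", "leg", "leg", "leg", "leg"]
car  = ["body", "wheel", "wheel"]
obs_quad = [rng.standard_normal(OBS_DIM) for _ in quad]
obs_car  = [rng.standard_normal(OBS_DIM) for _ in car]

acts_quad = modular_policy(quad, obs_quad)
acts_car  = modular_policy(car, obs_car)
```

Because the policy is indexed by module type rather than by design, physically re-configuring the modules changes only the routing, not the parameters, which is what allows generalization to unseen designs.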
Related papers
- Modular Deep Learning [120.36599591042908]
Transfer learning has recently become the dominant paradigm of machine learning.
It remains unclear how to develop models that specialise towards multiple tasks without incurring negative interference.
Modular deep learning has emerged as a promising solution to these challenges.
arXiv Detail & Related papers (2023-02-22T18:11:25Z)
- Learning Modular Robot Locomotion from Demonstrations [20.03751606751798]
This work presents a method that uses demonstrations from one set of designs to accelerate policy learning for additional designs.
In this paper we develop a combined reinforcement and imitation learning algorithm.
We show that when the modular policy is optimized with this combined objective, demonstrations from one set of designs influence how the policy behaves on a different design.
arXiv Detail & Related papers (2022-10-31T17:15:32Z)
- Learning Modular Simulations for Homogeneous Systems [23.355189771765644]
We present a modular simulation framework for modeling homogeneous multibody dynamical systems.
An arbitrary number of modules can be combined to simulate systems of a variety of coupling topologies.
We show that our models can be transferred to new system configurations with lower data requirements and training effort than models trained from scratch.
arXiv Detail & Related papers (2022-10-28T17:48:01Z)
- Meta-Reinforcement Learning for Adaptive Control of Second Order Systems [3.131740922192114]
In process control, many systems have similar and well-understood dynamics, which suggests it is feasible to create a generalizable controller through meta-learning.
We formulate a meta reinforcement learning (meta-RL) control strategy that takes advantage of known, offline information for training, such as a model structure.
A key design element is the ability to leverage model-based information offline during training, while maintaining a model-free policy structure for interacting with new environments.
arXiv Detail & Related papers (2022-09-19T18:51:33Z)
- Meta Reinforcement Learning for Adaptive Control: An Offline Approach [3.131740922192114]
We formulate a meta reinforcement learning (meta-RL) control strategy that takes advantage of known, offline information for training.
Our meta-RL agent has a recurrent structure that accumulates "context" for its current dynamics through a hidden state variable.
In tests reported here, the meta-RL agent was trained entirely offline, yet produced excellent results in novel settings.
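The recurrent-context idea can be sketched as follows. The update rule, dimensions, and random weights here are illustrative assumptions, not the paper's trained agent; the sketch only shows the mechanism: a hidden state folds in each new observation, so the action is conditioned on an accumulated summary of the system's dynamics.

```python
import numpy as np

# Minimal sketch of a recurrent context variable (assumed dimensions and
# update rule; the actual agent uses a learned recurrent network).
# The hidden state h accumulates information about the dynamics seen so
# far, and the action depends on that context rather than on the current
# observation alone.

rng = np.random.default_rng(1)
OBS_DIM, HID_DIM = 3, 8

W_h = rng.standard_normal((HID_DIM, HID_DIM)) * 0.3
W_x = rng.standard_normal((HID_DIM, OBS_DIM)) * 0.3
W_a = rng.standard_normal((1, HID_DIM)) * 0.3

def step(h, obs):
    """One recurrent step: fold the new observation into the context."""
    h_new = np.tanh(W_h @ h + W_x @ obs)
    action = W_a @ h_new          # action conditioned on accumulated context
    return h_new, action

h = np.zeros(HID_DIM)
for _ in range(10):               # roll out on one environment
    h, a = step(h, rng.standard_normal(OBS_DIM))
```

Because the context is built purely from observed transitions, the same policy can be deployed on a new system and adapt online without further training, which is what makes fully offline training viable.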
arXiv Detail & Related papers (2022-03-17T23:58:52Z)
- Learning Multi-Objective Curricula for Deep Reinforcement Learning [55.27879754113767]
Various automatic curriculum learning (ACL) methods have been proposed to improve the sample efficiency and final performance of deep reinforcement learning (DRL).
In this paper, we propose a unified automatic curriculum learning framework to create multi-objective but coherent curricula.
In addition to existing hand-designed curricula paradigms, we further design a flexible memory mechanism to learn an abstract curriculum.
arXiv Detail & Related papers (2021-10-06T19:30:25Z)
- Neural Dynamic Policies for End-to-End Sensorimotor Learning [51.24542903398335]
The current dominant paradigm in sensorimotor control, whether imitation or reinforcement learning, is to train policies directly in raw action spaces.
We propose Neural Dynamic Policies (NDPs) that make predictions in trajectory distribution space.
NDPs outperform the prior state-of-the-art in terms of either efficiency or performance across several robotic control tasks.
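The "predict in trajectory distribution space" idea can be illustrated with a toy second-order system. The forcing-free dynamics, gains, and goal below are assumptions for illustration, not the NDP architecture itself: the point is that the policy's output is a small set of parameters (here just a goal g) whose rollout yields a smooth trajectory, rather than a sequence of raw actions.

```python
# Hedged sketch: instead of emitting raw per-step actions, a policy could
# output the parameters of a simple second-order attractor system, whose
# integration produces a smooth trajectory toward a goal.  Gains and the
# forcing-free form here are illustrative assumptions.

def rollout(y0, g, alpha=4.0, beta=1.0, dt=0.01, steps=200):
    """Integrate y'' = alpha * (beta * (g - y) - y') toward the goal g."""
    y, yd = y0, 0.0
    traj = []
    for _ in range(steps):
        ydd = alpha * (beta * (g - y) - yd)
        yd += ydd * dt
        y += yd * dt
        traj.append(y)
    return traj

# A policy that predicted g = 1.0 implicitly commits to this whole
# smooth trajectory, not just one action.
traj = rollout(y0=0.0, g=1.0)
```

Predicting in this parameter space embeds smoothness into the policy output by construction, which is one intuition for why such policies can be more efficient than raw-action policies.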
arXiv Detail & Related papers (2020-12-04T18:59:32Z)
- S2RMs: Spatially Structured Recurrent Modules [105.0377129434636]
We take a step towards dynamic architectures that can simultaneously exploit both modular and temporal structure.
We find our models to be robust to the number of available views and better capable of generalization to novel tasks without additional training.
arXiv Detail & Related papers (2020-07-13T17:44:30Z)
- One Policy to Control Them All: Shared Modular Policies for Agent-Agnostic Control [47.78262874364569]
We investigate whether there exists a single global policy that can generalize to control a wide variety of agent morphologies.
We propose to express this global policy as a collection of identical modular neural networks.
We show that a single modular policy can successfully generate locomotion behaviors for several planar agents.
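The "collection of identical modular neural networks" idea can be sketched with a shared per-joint rule plus message passing along the morphology. The shapes, random weights, and single-direction (parent-to-child) messages below are illustrative assumptions; the demonstration is that one shared set of weights, applied locally at every joint, produces an action for any morphology described by a parent list.

```python
import numpy as np

# Illustrative sketch (assumed shapes and message scheme): every joint
# runs the SAME network; each joint combines its local observation with a
# message from its parent in the kinematic tree, so global behavior
# emerges from one shared local rule.

rng = np.random.default_rng(2)
OBS, MSG = 3, 4

W   = rng.standard_normal((MSG, OBS + MSG)) * 0.2   # shared by all joints
W_a = rng.standard_normal((1, MSG)) * 0.2           # shared action head

def joint_step(obs, parent_msg):
    """One joint's update: fuse local obs with the parent's message."""
    msg = np.tanh(W @ np.concatenate([obs, parent_msg]))
    return (W_a @ msg).item(), msg   # local action, message to children

# A parent list defines the morphology: joint i's parent is parents[i]
# (-1 marks the root).  The same weights handle any such list.
parents = [-1, 0, 0, 1]
obs = [rng.standard_normal(OBS) for _ in parents]
msgs, actions = {}, []
for i, p in enumerate(parents):
    parent_msg = msgs.get(p, np.zeros(MSG))
    a, m = joint_step(obs[i], parent_msg)
    msgs[i] = m
    actions.append(a)
```

Since the weights are morphology-agnostic, the same policy applies unchanged to agents with different numbers and arrangements of joints.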
arXiv Detail & Related papers (2020-07-09T17:59:35Z)
- Information Theoretic Model Predictive Q-Learning [64.74041985237105]
We present a novel theoretical connection between information theoretic MPC and entropy regularized RL.
We develop a Q-learning algorithm that can leverage biased models.
arXiv Detail & Related papers (2019-12-31T00:29:22Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.