Hierarchically Decoupled Imitation for Morphological Transfer
- URL: http://arxiv.org/abs/2003.01709v2
- Date: Mon, 31 Aug 2020 07:26:59 GMT
- Title: Hierarchically Decoupled Imitation for Morphological Transfer
- Authors: Donald J. Hejna III, Pieter Abbeel, Lerrel Pinto
- Abstract summary: We show that transferring learned information from a morphologically simpler agent can massively improve the sample efficiency of a more complex one.
First, we show that incentivizing a complex agent's low-level to imitate a simpler agent's low-level significantly improves zero-shot high-level transfer.
Second, we show that KL-regularized training of the high level stabilizes learning and prevents mode-collapse.
- Score: 95.19299356298876
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Learning long-range behaviors on complex high-dimensional agents is a
fundamental problem in robot learning. For such tasks, we argue that
transferring learned information from a morphologically simpler agent can
massively improve the sample efficiency of a more complex one. To this end, we
propose a hierarchical decoupling of policies into two parts: an independently
learned low-level policy and a transferable high-level policy. To remedy poor
transfer performance due to mismatch in morphologies, we contribute two key
ideas. First, we show that incentivizing a complex agent's low-level to imitate
a simpler agent's low-level significantly improves zero-shot high-level
transfer. Second, we show that KL-regularized training of the high level
stabilizes learning and prevents mode-collapse. Finally, on a suite of publicly
released navigation and manipulation environments, we demonstrate the
applicability of hierarchical transfer on long-range tasks across morphologies.
Our code and videos can be found at
https://sites.google.com/berkeley.edu/morphology-transfer.
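To make the abstract's two key ideas more concrete, here is a minimal, hedged sketch in PyTorch. This is not the authors' released code: the feature extractor, the `alpha`/`beta` weights, and the function names are illustrative assumptions; only the overall structure (an imitation bonus for the complex agent's low-level policy and a KL penalty pulling the high level toward the transferred policy) follows the abstract.

```python
# Illustrative sketch only, not the paper's implementation.
import torch
import torch.distributions as D


def low_level_imitation_bonus(complex_feats: torch.Tensor,
                              simple_feats: torch.Tensor,
                              alpha: float = 1.0) -> torch.Tensor:
    """Reward bonus encouraging the complex agent's low-level policy to
    reproduce morphology-agnostic features (assumed here, e.g. torso position)
    that the simpler agent's low-level achieves for the same subgoal."""
    return -alpha * torch.norm(complex_feats - simple_feats, dim=-1)


def kl_regularized_high_level_loss(pi_high: D.Distribution,
                                   pi_transferred: D.Distribution,
                                   rl_loss: torch.Tensor,
                                   beta: float = 0.1) -> torch.Tensor:
    """Standard high-level RL loss plus a KL penalty toward the high-level
    policy transferred from the simpler agent; the abstract credits this
    regularization with stabilizing learning and preventing mode collapse."""
    kl = D.kl_divergence(pi_high, pi_transferred).mean()
    return rl_loss + beta * kl
```

In such a setup the imitation bonus would be added to the low-level reward while training the complex agent, and the KL weight would be tuned (or annealed) against the task return; the exact formulation in the paper may differ.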
Related papers
- Multi-Agent Transfer Learning via Temporal Contrastive Learning [8.487274986507922]
This paper introduces a novel transfer learning framework for deep multi-agent reinforcement learning.
The approach automatically combines goal-conditioned policies with temporal contrastive learning to discover meaningful sub-goals.
arXiv Detail & Related papers (2024-06-03T14:42:14Z)
- Reinforcement Learning with Options and State Representation [105.82346211739433]
This thesis aims to explore the reinforcement learning field and build on existing methods to produce improved ones.
It addresses such goals by decomposing learning tasks in a hierarchical fashion known as Hierarchical Reinforcement Learning.
arXiv Detail & Related papers (2024-03-16T08:30:55Z)
- Investigating the role of model-based learning in exploration and transfer [11.652741003589027]
In this paper, we investigate transfer learning in the context of model-based agents.
We find that a model-based approach outperforms controlled model-free baselines for transfer learning.
Our results show that intrinsic exploration combined with environment models presents a viable direction towards agents that are self-supervised and able to generalize to novel reward functions.
arXiv Detail & Related papers (2023-02-08T11:49:58Z)
- TransfQMix: Transformers for Leveraging the Graph Structure of Multi-Agent Reinforcement Learning Problems [0.0]
We present TransfQMix, a new approach that uses transformers to leverage a latent graph structure and learn better coordination policies.
Our transformer Q-mixer learns a monotonic mixing-function from a larger graph that includes the internal and external states of the agents.
We report TransfQMix's performance in the Spread and StarCraft II environments.
arXiv Detail & Related papers (2023-01-13T00:07:08Z)
- PEBBLE: Feedback-Efficient Interactive Reinforcement Learning via Relabeling Experience and Unsupervised Pre-training [94.87393610927812]
We present an off-policy, interactive reinforcement learning algorithm that capitalizes on the strengths of both feedback and off-policy learning.
We demonstrate that our approach is capable of learning tasks of higher complexity than previously considered by human-in-the-loop methods.
arXiv Detail & Related papers (2021-06-09T14:10:50Z)
- Adaptive Policy Transfer in Reinforcement Learning [9.594432031144715]
We introduce a principled mechanism that can "Adapt-to-Learn", that is, adapt the source policy so that it learns to solve a target task.
We show that the presented method learns to seamlessly combine learning from adaptation and exploration and leads to a robust policy transfer algorithm.
arXiv Detail & Related papers (2021-05-10T22:42:03Z)
- Task-Agnostic Morphology Evolution [94.97384298872286]
Current approaches that co-adapt morphology and behavior use a specific task's reward as a signal for morphology optimization.
This often requires expensive policy optimization and results in task-dependent morphologies that are not built to generalize.
We propose a new approach, Task-Agnostic Morphology Evolution (TAME), to alleviate both of these issues.
arXiv Detail & Related papers (2021-02-25T18:59:21Z)
- Continuous Transition: Improving Sample Efficiency for Continuous Control Problems via MixUp [119.69304125647785]
This paper introduces a concise yet powerful method to construct Continuous Transition.
Specifically, we propose to synthesize new transitions for training by linearly interpolating the consecutive transitions.
To keep the constructed transitions authentic, we also develop a discriminator to guide the construction process automatically.
arXiv Detail & Related papers (2020-11-30T01:20:23Z)
- RODE: Learning Roles to Decompose Multi-Agent Tasks [69.56458960841165]
Role-based learning holds the promise of achieving scalable multi-agent learning by decomposing complex tasks using roles.
We propose to first decompose joint action spaces into restricted role action spaces by clustering actions according to their effects on the environment and other agents.
By virtue of these advances, our method outperforms the current state-of-the-art MARL algorithms on 10 of the 14 scenarios that comprise the challenging StarCraft II micromanagement benchmark.
arXiv Detail & Related papers (2020-10-04T09:20:59Z)
- How Transferable are the Representations Learned by Deep Q Agents? [13.740174266824532]
We consider the source of Deep Reinforcement Learning's sample complexity.
We compare the benefits of transfer learning to learning a policy from scratch.
We find that benefits due to transfer are highly variable in general and non-symmetric across pairs of tasks.
arXiv Detail & Related papers (2020-02-24T00:23:47Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.