Distilling Morphology-Conditioned Hypernetworks for Efficient Universal Morphology Control
- URL: http://arxiv.org/abs/2402.06570v2
- Date: Mon, 3 Jun 2024 20:02:33 GMT
- Title: Distilling Morphology-Conditioned Hypernetworks for Efficient Universal Morphology Control
- Authors: Zheng Xiong, Risto Vuorio, Jacob Beck, Matthieu Zimmer, Kun Shao, Shimon Whiteson
- Abstract summary: Learning a universal policy across different robot morphologies can significantly improve learning efficiency and enable zero-shot generalization to unseen morphologies.
To achieve both good performance like TF and high efficiency like MLP at inference time, we propose HyperDistill.
We show that on UNIMAL, a benchmark with hundreds of diverse morphologies, HyperDistill performs as well as a universal TF teacher policy on both training and unseen test robots, but reduces model size by 6-14 times, and computational cost by 67-160 times in different environments.
- Score: 34.40439673925125
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Learning a universal policy across different robot morphologies can significantly improve learning efficiency and enable zero-shot generalization to unseen morphologies. However, learning a highly performant universal policy requires sophisticated architectures like transformers (TF) that have larger memory and computational cost than simpler multi-layer perceptrons (MLP). To achieve both good performance like TF and high efficiency like MLP at inference time, we propose HyperDistill, which consists of: (1) A morphology-conditioned hypernetwork (HN) that generates robot-wise MLP policies, and (2) A policy distillation approach that is essential for successful training. We show that on UNIMAL, a benchmark with hundreds of diverse morphologies, HyperDistill performs as well as a universal TF teacher policy on both training and unseen test robots, but reduces model size by 6-14 times, and computational cost by 67-160 times in different environments. Our analysis attributes the efficiency advantage of HyperDistill at inference time to knowledge decoupling, i.e., the ability to decouple inter-task and intra-task knowledge, a general principle that could also be applied to improve inference efficiency in other domains.
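To make the two components concrete, below is a minimal PyTorch sketch of the idea: a hypernetwork maps a morphology embedding to the weights of a small per-robot MLP, and a distillation loss regresses the generated MLP's actions onto a frozen teacher's. All module names, dimensions, and the squared-error loss are illustrative assumptions, not the paper's implementation.
```python
# Minimal sketch (not the paper's code): a morphology-conditioned hypernetwork
# generates the weights of a small per-robot MLP policy, which is trained by
# distillation from a teacher policy. Sizes and names are illustrative.
import torch
import torch.nn as nn

class HyperMLPPolicy(nn.Module):
    def __init__(self, morph_dim=64, obs_dim=32, act_dim=8, hidden=128, hyper_hidden=256):
        super().__init__()
        self.obs_dim, self.act_dim, self.hidden = obs_dim, act_dim, hidden
        # Hypernetwork: morphology embedding -> flat parameter vector of a 2-layer MLP.
        n_params = (obs_dim * hidden + hidden) + (hidden * act_dim + act_dim)
        self.hyper = nn.Sequential(
            nn.Linear(morph_dim, hyper_hidden), nn.ReLU(),
            nn.Linear(hyper_hidden, n_params),
        )

    def generate(self, morph_emb):
        # Run the hypernetwork ONCE per robot; the result defines that robot's MLP.
        p = self.hyper(morph_emb)
        i = self.obs_dim * self.hidden
        w1 = p[:i].view(self.hidden, self.obs_dim)
        b1 = p[i:i + self.hidden]
        j = i + self.hidden
        w2 = p[j:j + self.hidden * self.act_dim].view(self.act_dim, self.hidden)
        b2 = p[j + self.hidden * self.act_dim:]
        return w1, b1, w2, b2

    @staticmethod
    def act(params, obs):
        # Per-timestep inference uses only the cheap generated MLP, never the hypernetwork.
        w1, b1, w2, b2 = params
        h = torch.relu(obs @ w1.T + b1)
        return h @ w2.T + b2

def distill_loss(params, obs, teacher_actions):
    # Distillation as action regression onto a frozen teacher (e.g., the TF policy).
    return ((HyperMLPPolicy.act(params, obs) - teacher_actions) ** 2).mean()

# Usage: generate once per robot, then act cheaply every timestep.
policy = HyperMLPPolicy()
params = policy.generate(torch.randn(64))            # once per morphology
action = HyperMLPPolicy.act(params, torch.randn(5, 32))
```
The knowledge-decoupling argument is visible in the structure: the hypernetwork (inter-task knowledge) runs once per morphology, while per-timestep control (intra-task knowledge) touches only the small generated MLP.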
Related papers
- Sparse Diffusion Policy: A Sparse, Reusable, and Flexible Policy for Robot Learning [61.294110816231886]
We introduce a sparse, reusable, and flexible policy: Sparse Diffusion Policy (SDP).
SDP selectively activates experts and skills, enabling efficient and task-specific learning without retraining the entire model.
Demos and code can be found at https://forrest-110.io/sparse_diffusion_policy/.
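The "selectively activates experts" idea above can be illustrated with a generic top-k mixture-of-experts routing sketch; this is an assumption about the mechanism, not SDP's actual architecture.
```python
# Hypothetical sketch of selective expert activation via generic top-k
# mixture-of-experts routing; SDP's actual architecture may differ.
import torch
import torch.nn as nn

class TopKExperts(nn.Module):
    def __init__(self, dim=64, n_experts=8, k=2):
        super().__init__()
        self.k = k
        self.router = nn.Linear(dim, n_experts)  # scores every expert per input
        self.experts = nn.ModuleList(nn.Linear(dim, dim) for _ in range(n_experts))

    def forward(self, x):
        scores = self.router(x)                       # (batch, n_experts)
        topv, topi = scores.topk(self.k, dim=-1)      # keep only the k best experts
        weights = torch.softmax(topv, dim=-1)
        out = torch.zeros_like(x)
        for j in range(self.k):
            idx = topi[:, j]
            for e in idx.unique().tolist():           # run just the chosen experts
                mask = idx == e
                out[mask] = out[mask] + weights[mask, j:j + 1] * self.experts[e](x[mask])
        return out

moe = TopKExperts()
y = moe(torch.randn(4, 64))  # only 2 of 8 experts execute per input
```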
arXiv Detail & Related papers (2024-07-01T17:59:56Z)
- Distilling Reinforcement Learning Policies for Interpretable Robot Locomotion: Gradient Boosting Machines and Symbolic Regression [53.33734159983431]
This paper introduces a novel approach to distill neural RL policies into more interpretable forms.
We train expert neural network policies using RL and distill them into (i) gradient boosting machines (GBMs), (ii) explainable boosting machines (EBMs), and (iii) symbolic policies.
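A hedged sketch of the distillation step for case (i): behavioral cloning of an expert's actions into a GBM with scikit-learn. The random data stands in for expert rollouts; the paper's exact procedure may differ.
```python
# Hedged sketch of distilling a policy into a GBM: behavioral cloning of expert
# actions with scikit-learn. Random data stands in for expert rollouts.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.multioutput import MultiOutputRegressor

rng = np.random.default_rng(0)
states = rng.normal(size=(5000, 10))                                  # expert rollout states
actions = np.tanh(states[:, :4] + 0.1 * rng.normal(size=(5000, 4)))   # stand-in expert actions

# GBMs are single-output, so wrap one per action dimension.
gbm_policy = MultiOutputRegressor(GradientBoostingRegressor(n_estimators=100, max_depth=3))
gbm_policy.fit(states, actions)
print("distillation MSE:", ((gbm_policy.predict(states) - actions) ** 2).mean())
```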
arXiv Detail & Related papers (2024-03-21T11:54:45Z)
- ManyQuadrupeds: Learning a Single Locomotion Policy for Diverse Quadruped Robots [4.557963624437784]
We show that drawing inspiration from animal motor control allows us to effectively train a single locomotion policy for quadruped robots.
Our policy modulates a representation of the Central Pattern Generator (CPG) in the spinal cord.
We observe robust performance, even when adding a 15 kg load, equivalent to 125% of the A1 robot's nominal mass.
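A hypothetical sketch of what "modulating a CPG representation" can look like: the policy outputs per-leg amplitude and frequency offsets for simple oscillators. The actual formulation in the paper may differ.
```python
# Hypothetical sketch of a policy-modulated CPG: the policy outputs per-leg
# amplitude and frequency offsets for simple oscillators.
import numpy as np

def cpg_step(phase, policy_out, dt=0.01, base_freq=2.0):
    """Advance one oscillator per leg and return joint-space targets."""
    d_amp, d_freq = policy_out[:4], policy_out[4:]           # policy modulates the CPG
    phase = (phase + 2 * np.pi * (base_freq + d_freq) * dt) % (2 * np.pi)
    targets = (1.0 + d_amp) * np.sin(phase)                  # amplitude-modulated rhythm
    return phase, targets

phase = np.zeros(4)                                          # one phase per leg
phase, q = cpg_step(phase, policy_out=np.zeros(8))           # zero modulation = nominal gait
```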
arXiv Detail & Related papers (2023-10-16T15:06:16Z)
- Universal Morphology Control via Contextual Modulation [52.742056836818136]
Learning a universal policy across different robot morphologies can significantly improve learning efficiency and generalization in continuous control.
Existing methods utilize graph neural networks or transformers to handle heterogeneous state and action spaces across different morphologies.
We propose a hierarchical architecture that better models the dependency of a robot's control policy on its morphology context via contextual modulation.
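One common way to realize contextual modulation is FiLM-style per-layer scale and shift generated from the morphology context; the sketch below is illustrative, not the paper's architecture.
```python
# Illustrative FiLM-style contextual modulation: the morphology context produces
# a per-layer scale and shift applied to the state pathway. Not the paper's code.
import torch
import torch.nn as nn

class ModulatedLayer(nn.Module):
    def __init__(self, dim=64, ctx_dim=32):
        super().__init__()
        self.fc = nn.Linear(dim, dim)
        self.film = nn.Linear(ctx_dim, 2 * dim)   # context -> (scale, shift)

    def forward(self, h, ctx):
        scale, shift = self.film(ctx).chunk(2, dim=-1)
        return torch.relu(self.fc(h) * (1 + scale) + shift)

layer = ModulatedLayer()
out = layer(torch.randn(4, 64), torch.randn(4, 32))  # same weights, context-dependent behavior
```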
arXiv Detail & Related papers (2023-02-22T00:04:12Z)
- Online Weighted Q-Ensembles for Reduced Hyperparameter Tuning in Reinforcement Learning [0.38073142980732994]
Reinforcement learning is a promising paradigm for learning robot control, allowing complex control policies to be learned without requiring a dynamics model.
We propose employing an ensemble of multiple reinforcement learning agents, each with a different set of hyperparameters, along with a mechanism for choosing the best-performing set.
The online weighted Q-ensemble exhibited lower overall variance and superior results compared with Q-average ensembles.
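A minimal sketch of the weighted-ensemble idea: members trained with different hyperparameters are combined by performance-based weights updated online. The weighting rule shown (softmax over recent returns) is an assumption, not necessarily the paper's update.
```python
# Minimal sketch of a weighted Q-ensemble; the softmax-over-returns weighting
# rule is an assumption.
import numpy as np

class WeightedQEnsemble:
    def __init__(self, q_functions, lr=0.1):
        self.qs = q_functions                       # one Q-function per hyperparameter set
        self.w = np.ones(len(q_functions)) / len(q_functions)
        self.lr = lr

    def act(self, state):
        # Combine members' action values with the current weights.
        q = sum(w * qf(state) for w, qf in zip(self.w, self.qs))
        return int(np.argmax(q))

    def update_weights(self, recent_returns):
        # Upweight members whose recent episodes returned more.
        self.w = np.exp(self.lr * np.asarray(recent_returns))
        self.w /= self.w.sum()

ens = WeightedQEnsemble([lambda s: np.ones(3), lambda s: np.arange(3.0)])
a = ens.act(state=None)
ens.update_weights([1.0, 2.5])
```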
arXiv Detail & Related papers (2022-09-29T19:57:43Z)
- Efficient Model-Based Multi-Agent Mean-Field Reinforcement Learning [89.31889875864599]
We propose an efficient model-based reinforcement learning algorithm for learning in multi-agent systems.
Our main theoretical contributions are the first general regret bounds for model-based reinforcement learning for mean-field control (MFC).
We provide a practical parametrization of the core optimization problem.
arXiv Detail & Related papers (2021-07-08T18:01:02Z)
- Efficient Feature Transformations for Discriminative and Generative Continual Learning [98.10425163678082]
We propose a simple task-specific feature map transformation strategy for continual learning.
These transformations provide powerful flexibility for learning new tasks, achieved with minimal parameters added to the base architecture.
We demonstrate the efficacy and efficiency of our method with an extensive set of experiments in discriminative (CIFAR-100 and ImageNet-1K) and generative sequences of tasks.
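A short sketch of the task-specific feature transformation idea: a frozen shared layer plus a per-task scale and shift, so each new task adds only a few parameters. Names and the exact transform are assumptions, not the paper's design.
```python
# Sketch of a task-specific feature transformation for continual learning:
# a frozen shared layer plus a per-task scale and shift (2*dim parameters
# per task). The exact transform used by the paper may differ.
import torch
import torch.nn as nn

class TaskTransformLayer(nn.Module):
    def __init__(self, dim=128, n_tasks=5):
        super().__init__()
        self.base = nn.Linear(dim, dim)
        for p in self.base.parameters():
            p.requires_grad_(False)                  # shared base stays frozen
        self.scale = nn.Parameter(torch.ones(n_tasks, dim))
        self.shift = nn.Parameter(torch.zeros(n_tasks, dim))

    def forward(self, x, task_id):
        # Only the tiny per-task parameters adapt to each new task.
        return torch.relu(self.base(x) * self.scale[task_id] + self.shift[task_id])

layer = TaskTransformLayer()
out = layer(torch.randn(4, 128), task_id=2)
```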
arXiv Detail & Related papers (2021-03-25T01:48:14Z)
- MAMBPO: Sample-efficient multi-robot reinforcement learning using learned world models [4.84279798426797]
Multi-robot systems can benefit from reinforcement learning (RL) algorithms that learn behaviours in a small number of trials.
We present a novel multi-agent model-based RL algorithm: Multi-Agent Model-Based Policy Optimization (MAMBPO).
arXiv Detail & Related papers (2021-03-05T13:37:23Z)
- Learning Whole-body Motor Skills for Humanoids [25.443880385966114]
This paper presents a hierarchical framework for Deep Reinforcement Learning that acquires motor skills for a variety of push recovery and balancing behaviors.
The policy is trained in a physics simulator with a realistic robot model and low-level impedance control, making the learned skills easy to transfer to real robots.
arXiv Detail & Related papers (2020-02-07T19:40:59Z)