Related papers: Sparse Diffusion Policy: A Sparse, Reusable, and Flexible Policy for Robot Learning

Sparse Diffusion Policy: A Sparse, Reusable, and Flexible Policy for Robot Learning

URL: http://arxiv.org/abs/2407.01531v2
Date: Thu, 24 Oct 2024 22:01:44 GMT
Title: Sparse Diffusion Policy: A Sparse, Reusable, and Flexible Policy for Robot Learning
Authors: Yixiao Wang, Yifei Zhang, Mingxiao Huo, Ran Tian, Xiang Zhang, Yichen Xie, Chenfeng Xu, Pengliang Ji, Wei Zhan, Mingyu Ding, Masayoshi Tomizuka,
Abstract summary: We introduce a sparse, reusable, and flexible policy, Sparse Diffusion Policy (SDP) SDP selectively activates experts and skills, enabling efficient and task-specific learning without retraining the entire model. Demos and codes can be found in https://forrest-110.io/sparse_diffusion_policy/.
Score: 61.294110816231886
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: The increasing complexity of tasks in robotics demands efficient strategies for multitask and continual learning. Traditional models typically rely on a universal policy for all tasks, facing challenges such as high computational costs and catastrophic forgetting when learning new tasks. To address these issues, we introduce a sparse, reusable, and flexible policy, Sparse Diffusion Policy (SDP). By adopting Mixture of Experts (MoE) within a transformer-based diffusion policy, SDP selectively activates experts and skills, enabling efficient and task-specific learning without retraining the entire model. SDP not only reduces the burden of active parameters but also facilitates the seamless integration and reuse of experts across various tasks. Extensive experiments on diverse tasks in both simulations and real world show that SDP 1) excels in multitask scenarios with negligible increases in active parameters, 2) prevents forgetting in continual learning of new tasks, and 3) enables efficient task transfer, offering a promising solution for advanced robotic applications. Demos and codes can be found in https://forrest-110.github.io/sparse_diffusion_policy/.

Related papers

Scaling Tasks, Not Samples: Mastering Humanoid Control through Multi-Task Model-Based Reinforcement Learning [49.82882141491629]
We argue that effective online learning should scale the emphnumber of tasks, rather than the number of samples per task.<n>This regime reveals a structural advantage of model-based reinforcement learning.<n>We instantiate this idea with textbfEfficientZero-Multitask (EZ-M), a sample-efficient multi-task algorithm for online learning.
arXiv Detail & Related papers (2026-03-02T05:07:43Z)
VER: Vision Expert Transformer for Robot Learning via Foundation Distillation and Dynamic Routing [89.48383845451717]
We propose VER, a Vision Expert transformer for Robot learning.<n>During pretraining, VER distills multiple VFMs into a vision expert library.<n>It then fine-tunes only a lightweight routing network to dynamically select task-relevant experts.
arXiv Detail & Related papers (2025-10-06T18:00:43Z)
Data-Efficient Multitask DAgger [15.497645748861913]
Generalist robot policies that can perform many tasks typically require extensive expert data or simulations for training.<n>We propose a novel Data-Efficient multitask DAgger framework that distills a single multitask policy from multiple task-specific expert policies.
arXiv Detail & Related papers (2025-09-29T20:17:35Z)
Bigger, Regularized, Categorical: High-Capacity Value Functions are Efficient Multi-Task Learners [60.75160178669076]
We show that the use of high-capacity value models trained via cross-entropy and conditioned on learnable task embeddings addresses the problem of task interference in online reinforcement learning.<n>We test our approach on 7 multi-task benchmarks with over 280 unique tasks, spanning high degree-of-freedom humanoid control and discrete vision-based RL.
arXiv Detail & Related papers (2025-05-29T06:41:45Z)
SwitchMT: An Adaptive Context Switching Methodology for Scalable Multi-Task Learning in Intelligent Autonomous Agents [5.343921650701002]
We propose a novel adaptive task-switching methodology for RL-based multi-task learning in autonomous agents. SwitchMT employs a Deep Spiking Q-Network with active dendrites and dueling structure to create specialized sub-networks. It achieves superior performance in multi-task learning compared to state-of-the-art methods.
arXiv Detail & Related papers (2025-04-18T08:12:59Z)
Learning Policy Committees for Effective Personalization in MDPs with Diverse Tasks [40.2989900672992]
We propose a novel approach for learning a policy committee that includes at least one near-optimal policy with high probability for tasks encountered during execution.<n>Our experiments on MuJoCo and Meta-World show that the proposed approach outperforms state-of-the-art multi-task, meta-, and task clustering baselines in training, generalization, and few-shot learning.
arXiv Detail & Related papers (2025-02-26T22:45:25Z)
Scaling Autonomous Agents via Automatic Reward Modeling And Planning [52.39395405893965]
Large language models (LLMs) have demonstrated remarkable capabilities across a range of tasks. However, they still struggle with problems requiring multi-step decision-making and environmental feedback. We propose a framework that can automatically learn a reward model from the environment without human annotations.
arXiv Detail & Related papers (2025-02-17T18:49:25Z)
STRAP: Robot Sub-Trajectory Retrieval for Augmented Policy Learning [8.860366821983211]
STRAP is a technique for leveraging pre-trained vision foundation models and dynamic time warping to retrieve sub-sequences of trajectories from large training corpora in a robust fashion. This work proposes STRAP, a technique for leveraging pre-trained vision foundation models and dynamic time warping to retrieve sub-sequences of trajectories from large training corpora in a robust fashion.
arXiv Detail & Related papers (2024-12-19T18:54:06Z)
Active Fine-Tuning of Multi-Task Policies [54.65568433408307]
We propose AMF (Active Multi-task Fine-tuning) to maximize multi-task policy performance under a limited demonstration budget.<n>We derive performance guarantees for AMF under regularity assumptions and demonstrate its empirical effectiveness in complex and high-dimensional environments.
arXiv Detail & Related papers (2024-10-07T13:26:36Z)
EquiBot: SIM(3)-Equivariant Diffusion Policy for Generalizable and Data Efficient Learning [36.0274770291531]
We propose Equibot, a robust, data-efficient, and generalizable approach for robot manipulation task learning. Our approach combines SIM(3)-equivariant neural network architectures with diffusion models. We show that our method can easily generalize to novel objects and scenes after learning from just 5 minutes of human demonstrations in each task.
arXiv Detail & Related papers (2024-07-01T17:09:43Z)
GenSim: Generating Robotic Simulation Tasks via Large Language Models [34.79613485106202]
GenSim aims to automatically generate rich simulation environments and expert demonstrations. We use GPT4 to expand the existing benchmark by ten times to over 100 tasks. With minimal sim-to-real adaptation, multitask policies pretrained on GPT4-generated simulation tasks exhibit stronger transfer to unseen long-horizon tasks in the real world.
arXiv Detail & Related papers (2023-10-02T17:23:48Z)
Meta-Reinforcement Learning Based on Self-Supervised Task Representation Learning [23.45043290237396]
MoSS is a context-based Meta-reinforcement learning algorithm based on Self-Supervised task representation learning. On MuJoCo and Meta-World benchmarks, MoSS outperforms prior in terms of performance, sample efficiency (3-50x faster), adaptation efficiency, and generalization.
arXiv Detail & Related papers (2023-04-29T15:46:19Z)
Controllable Dynamic Multi-Task Architectures [92.74372912009127]
We propose a controllable multi-task network that dynamically adjusts its architecture and weights to match the desired task preference as well as the resource constraints. We propose a disentangled training of two hypernetworks, by exploiting task affinity and a novel branching regularized loss, to take input preferences and accordingly predict tree-structured models with adapted weights.
arXiv Detail & Related papers (2022-03-28T17:56:40Z)
Discovery of Options via Meta-Learned Subgoals [59.2160583043938]
Temporal abstractions in the form of options have been shown to help reinforcement learning (RL) agents learn faster. We introduce a novel meta-gradient approach for discovering useful options in multi-task RL environments.
arXiv Detail & Related papers (2021-02-12T19:50:40Z)
Learn Dynamic-Aware State Embedding for Transfer Learning [0.8756822885568589]
We consider the setting where all tasks (MDPs) share the same environment dynamic except reward function. In this setting, the MDP dynamic is a good knowledge to transfer, which can be inferred by uniformly random policy. We observe that the binary MDP dynamic can be inferred from trajectories of any policy which avoids the need of uniform random policy.
arXiv Detail & Related papers (2021-01-06T19:07:31Z)
Reparameterizing Convolutions for Incremental Multi-Task Learning without Task Interference [75.95287293847697]
Two common challenges in developing multi-task models are often overlooked in literature. First, enabling the model to be inherently incremental, continuously incorporating information from new tasks without forgetting the previously learned ones (incremental learning) Second, eliminating adverse interactions amongst tasks, which has been shown to significantly degrade the single-task performance in a multi-task setup (task interference)
arXiv Detail & Related papers (2020-07-24T14:44:46Z)
HyperGrid: Efficient Multi-Task Transformers with Grid-wise Decomposable Hyper Projections [96.64246471034195]
We propose textscHyperGrid, a new approach for highly effective multi-task learning. Our method helps bridge the gap between fine-tuning and multi-task learning approaches.
arXiv Detail & Related papers (2020-07-12T02:49:16Z)
Multi-Task Reinforcement Learning with Soft Modularization [25.724764855681137]
Multi-task learning is a very challenging problem in reinforcement learning. We introduce an explicit modularization technique on policy representation to alleviate this optimization issue. We show our method improves both sample efficiency and performance over strong baselines by a large margin.
arXiv Detail & Related papers (2020-03-30T17:47:04Z)
Meta Reinforcement Learning with Autonomous Inference of Subtask Dependencies [57.27944046925876]
We propose and address a novel few-shot RL problem, where a task is characterized by a subtask graph. Instead of directly learning a meta-policy, we develop a Meta-learner with Subtask Graph Inference. Our experiment results on two grid-world domains and StarCraft II environments show that the proposed method is able to accurately infer the latent task parameter.
arXiv Detail & Related papers (2020-01-01T17:34:00Z)

This list is automatically generated from the titles and abstracts of the papers in this site.