DETACH: Cross-domain Learning for Long-Horizon Tasks via Mixture of Disentangled Experts
- URL: http://arxiv.org/abs/2508.07842v2
- Date: Mon, 22 Sep 2025 12:52:57 GMT
- Title: DETACH: Cross-domain Learning for Long-Horizon Tasks via Mixture of Disentangled Experts
- Authors: Yutong Shen, Hangxu Liu, Lei Zhang, Penghui Liu, Ruizhe Xia, Tianyi Yao, Tongtong Feng,
- Abstract summary: DETACH is a cross-domain learning framework for LH tasks via biologically inspired dual-stream disentanglement. It can achieve an average subtasks success rate improvement of 23% and average execution efficiency improvement of 29%.
- Score: 6.15749307717446
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Long-Horizon (LH) tasks in Human-Scene Interaction (HSI) are complex multi-step tasks that require continuous planning, sequential decision-making, and extended execution across domains to achieve the final goal. However, existing methods heavily rely on skill chaining by concatenating pre-trained subtasks, with environment observations and self-state tightly coupled, lacking the ability to generalize to new combinations of environments and skills, failing to complete various LH tasks across domains. To solve this problem, this paper presents DETACH, a cross-domain learning framework for LH tasks via biologically inspired dual-stream disentanglement. Inspired by the brain's "where-what" dual pathway mechanism, DETACH comprises two core modules: i) an environment learning module for spatial understanding, which captures object functions, spatial relationships, and scene semantics, achieving cross-domain transfer through complete environment-self disentanglement; ii) a skill learning module for task execution, which processes self-state information including joint degrees of freedom and motor patterns, enabling cross-skill transfer through independent motor pattern encoding. We conducted extensive experiments on various LH tasks in HSI scenes. Compared with existing methods, DETACH can achieve an average subtasks success rate improvement of 23% and average execution efficiency improvement of 29%.
Related papers
- Separation and Collaboration: Two-Level Routing Grouped Mixture-of-Experts for Multi-Domain Continual Learning [7.361665112773847]
We propose a Two-Level Routing Grouped Mixture-of-Experts (TRGE) method to mitigate catastrophic forgetting. TRGE dynamically expands the pre-trained CLIP model, assigning a specific expert group to each task. We leverage Multimodal Large Language Models (MLLMs), which have powerful multimodal comprehension capabilities, to generate task descriptions and recognize the correct task identifier.
arXiv Detail & Related papers (2025-08-11T08:18:22Z)
- CKAA: Cross-subspace Knowledge Alignment and Aggregation for Robust Continual Learning [80.18781219542016]
Continual Learning (CL) empowers AI models to continuously learn from sequential task streams. Recent parameter-efficient fine-tuning (PEFT)-based CL methods have garnered increasing attention due to their superior performance. We propose Cross-subspace Knowledge Alignment and Aggregation (CKAA) to enhance robustness against misleading task-ids.
arXiv Detail & Related papers (2025-07-13T03:11:35Z)
- Mixture-of-Experts Meets In-Context Reinforcement Learning [29.866936147753368]
In this paper, we introduce T2MIR (Token- and Task-wise MoE for In-context RL), an innovative framework that introduces architectural advances of mixture-of-experts (MoE) into transformer-based decision models. Comprehensive experiments show that T2MIR significantly facilitates in-context learning capacity and outperforms various types of baselines.
arXiv Detail & Related papers (2025-06-05T06:29:14Z)
- Solving Continual Offline RL through Selective Weights Activation on Aligned Spaces [52.649077293256795]
Continual offline reinforcement learning (CORL) has shown impressive ability in diffusion-based lifelong learning systems.
We propose Vector-Quantized Continual diffuser, named VQ-CD, to break the barrier of different spaces between various tasks.
arXiv Detail & Related papers (2024-10-21T07:13:45Z)
- Not All Tasks Are Equally Difficult: Multi-Task Deep Reinforcement Learning with Dynamic Depth Routing [26.44273671379482]
Multi-task reinforcement learning endeavors to accomplish a set of different tasks with a single policy.
This work presents a Dynamic Depth Routing (D2R) framework, which learns strategic skipping of certain intermediate modules, thereby flexibly choosing different numbers of modules for each task.
In addition, we design an automatic route-balancing mechanism to encourage continued routing exploration for unmastered tasks without disturbing the routing of mastered ones.
arXiv Detail & Related papers (2023-12-22T06:51:30Z)
- DenseMTL: Cross-task Attention Mechanism for Dense Multi-task Learning [18.745373058797714]
We propose a novel multi-task learning architecture that leverages pairwise cross-task exchange through correlation-guided attention and self-attention.
We conduct extensive experiments across three multi-task setups, showing the advantages of our approach compared to competitive baselines in both synthetic and real-world benchmarks.
arXiv Detail & Related papers (2022-06-17T17:59:45Z)
- Self-Taught Cross-Domain Few-Shot Learning with Weakly Supervised Object Localization and Task-Decomposition [84.24343796075316]
We propose a task-expansion-decomposition framework for Cross-Domain Few-Shot Learning.
The proposed Self-Taught (ST) approach alleviates the problem of non-target guidance by constructing task-oriented metric spaces.
We conduct experiments under the cross-domain setting including 8 target domains: CUB, Cars, Places, Plantae, CropDiseases, EuroSAT, ISIC, and ChestX.
arXiv Detail & Related papers (2021-09-03T04:23:07Z)
- Learning to Relate Depth and Semantics for Unsupervised Domain Adaptation [87.1188556802942]
We present an approach for encoding visual task relationships to improve model performance in an Unsupervised Domain Adaptation (UDA) setting.
We propose a novel Cross-Task Relation Layer (CTRL), which encodes task dependencies between the semantic and depth predictions.
Furthermore, we propose an Iterative Self-Learning (ISL) training scheme, which exploits semantic pseudo-labels to provide extra supervision on the target domain.
arXiv Detail & Related papers (2021-05-17T13:42:09Z)
- Gradient Surgery for Multi-Task Learning [119.675492088251]
Multi-task learning has emerged as a promising approach for sharing structure across multiple tasks.
The reasons why multi-task learning is so challenging compared to single-task learning are not fully understood.
We propose a form of gradient surgery that projects a task's gradient onto the normal plane of the gradient of any other task that has a conflicting gradient.
arXiv Detail & Related papers (2020-01-19T06:33:47Z)
- Meta Reinforcement Learning with Autonomous Inference of Subtask Dependencies [57.27944046925876]
We propose and address a novel few-shot RL problem, where a task is characterized by a subtask graph.
Instead of directly learning a meta-policy, we develop a Meta-learner with Subtask Graph Inference.
Our experimental results on two grid-world domains and StarCraft II environments show that the proposed method is able to accurately infer the latent task parameter.
arXiv Detail & Related papers (2020-01-01T17:34:00Z)
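The "Gradient Surgery for Multi-Task Learning" entry above describes projecting a task's gradient onto the normal plane of any other task gradient it conflicts with. A minimal sketch of that projection, assuming gradients are flat Python lists and that "conflicting" means a negative dot product, might look like the following. The function names are hypothetical, and the fixed iteration order is a simplification chosen here; the published method may differ in such details.

```python
def dot(a, b):
    """Dot product of two equal-length vectors given as lists."""
    return sum(x * y for x, y in zip(a, b))


def project_if_conflicting(g_i, g_j):
    """Return g_i with its component along g_j removed when the two conflict.

    Two gradients are treated as conflicting when their dot product is
    negative (an assumption of this sketch). Otherwise g_i is returned as-is.
    """
    d = dot(g_i, g_j)
    if d >= 0:
        return list(g_i)
    # d < 0 implies g_j is non-zero, so the division is safe.
    scale = d / dot(g_j, g_j)
    return [x - scale * y for x, y in zip(g_i, g_j)]


def gradient_surgery(grads):
    """Project each task gradient against all others, then sum the results.

    `grads` is a list of per-task gradient vectors. Other tasks are visited
    in a fixed order here for reproducibility.
    """
    projected = []
    for i, g in enumerate(grads):
        g_new = list(g)
        for j, other in enumerate(grads):
            if i != j:
                g_new = project_if_conflicting(g_new, other)
        projected.append(g_new)
    # Combine the projected per-task gradients into one update direction.
    return [sum(components) for components in zip(*projected)]
```

After projection, each adjusted gradient has a non-negative dot product with the gradient it was projected against, so the combined update no longer pushes directly against that task.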
This list is automatically generated from the titles and abstracts of the papers in this site.