Multi-Task Reinforcement Learning with Soft Modularization
- URL: http://arxiv.org/abs/2003.13661v2
- Date: Mon, 7 Dec 2020 07:14:11 GMT
- Title: Multi-Task Reinforcement Learning with Soft Modularization
- Authors: Ruihan Yang, Huazhe Xu, Yi Wu, Xiaolong Wang
- Abstract summary: Multi-task learning is a very challenging problem in reinforcement learning.
We introduce an explicit modularization technique on policy representation to alleviate this optimization issue.
We show our method improves both sample efficiency and performance over strong baselines by a large margin.
- Score: 25.724764855681137
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Multi-task learning is a very challenging problem in reinforcement learning.
While training multiple tasks jointly allows the policies to share parameters
across different tasks, the optimization problem becomes non-trivial: It
remains unclear what parameters in the network should be reused across tasks,
and how the gradients from different tasks may interfere with each other. Thus,
instead of naively sharing parameters across tasks, we introduce an explicit
modularization technique on policy representation to alleviate this
optimization issue. Given a base policy network, we design a routing network
which estimates different routing strategies to reconfigure the base network
for each task. Instead of directly selecting routes for each task, our
task-specific policy uses a method called soft modularization to softly combine
all the possible routes, which makes it suitable for sequential tasks. We
experiment with various robotics manipulation tasks in simulation and show our
method improves both sample efficiency and performance over strong baselines by
a large margin.
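To make the architecture described in the abstract concrete, below is a minimal PyTorch sketch of the soft-routing idea: a grid of small modules forms the base policy, and a routing network produces softmax weights that softly combine every possible connection between consecutive module layers. The module counts, layer sizes, the way task and state information is mixed for routing, and the final aggregation are illustrative assumptions, not the authors' exact implementation.
```python
# Minimal sketch of soft modularization (illustrative assumptions throughout,
# not the paper's released code).
import torch
import torch.nn as nn
import torch.nn.functional as F

class SoftModularPolicy(nn.Module):
    def __init__(self, state_dim, task_dim, action_dim,
                 num_layers=2, num_modules=4, hidden=128):
        super().__init__()
        self.num_layers, self.num_modules = num_layers, num_modules
        self.state_enc = nn.Linear(state_dim, hidden)
        self.task_enc = nn.Linear(task_dim, hidden)
        # Base policy network: num_modules small modules per layer.
        self.modules_grid = nn.ModuleList([
            nn.ModuleList([nn.Linear(hidden, hidden) for _ in range(num_modules)])
            for _ in range(num_layers)
        ])
        # Routing network: one head per layer transition, emitting soft weights
        # over all module-to-module connections (num_modules x num_modules).
        self.routers = nn.ModuleList([
            nn.Linear(hidden, num_modules * num_modules)
            for _ in range(num_layers - 1)
        ])
        self.head = nn.Linear(hidden, action_dim)

    def forward(self, state, task_onehot):
        s = F.relu(self.state_enc(state))
        # Routing input mixes the task embedding with the state representation
        # (a simplification of the paper's routing-input construction).
        z = F.relu(self.task_enc(task_onehot)) * s
        # First layer: every module reads the encoded state.
        feats = torch.stack([F.relu(m(s)) for m in self.modules_grid[0]], dim=1)
        for l in range(1, self.num_layers):
            logits = self.routers[l - 1](z).view(-1, self.num_modules, self.num_modules)
            probs = F.softmax(logits, dim=-1)  # soft routes, not hard selection
            # Each upper module receives a weighted sum of all lower-module outputs.
            mixed = torch.einsum('bij,bjh->bih', probs, feats)
            feats = torch.stack(
                [F.relu(m(mixed[:, i])) for i, m in enumerate(self.modules_grid[l])],
                dim=1)
        # Aggregate module outputs into action logits.
        return self.head(feats.mean(dim=1))
```
Because the routes are convex combinations rather than hard selections, the whole network stays differentiable and gradients from every task reach all modules, which is what makes the soft variant suitable for end-to-end training on sequential tasks.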
Related papers
- Task-Aware Harmony Multi-Task Decision Transformer for Offline Reinforcement Learning [70.96345405979179]
The purpose of offline multi-task reinforcement learning (MTRL) is to develop a unified policy applicable to diverse tasks without the need for online environmental interaction.
However, variations in task content and complexity pose significant challenges in policy formulation.
We introduce the Harmony Multi-Task Decision Transformer (HarmoDT), a novel solution designed to identify an optimal harmony subspace of parameters for each task.
arXiv Detail & Related papers (2024-11-02T05:49:14Z) - HarmoDT: Harmony Multi-Task Decision Transformer for Offline Reinforcement Learning [72.25707314772254]
We introduce the Harmony Multi-Task Decision Transformer (HarmoDT), a novel solution designed to identify an optimal harmony subspace of parameters for each task.
The upper level of this framework is dedicated to learning a task-specific mask that delineates the harmony subspace, while the inner level focuses on updating parameters to enhance the overall performance of the unified policy.
arXiv Detail & Related papers (2024-05-28T11:41:41Z) - Not All Tasks Are Equally Difficult: Multi-Task Deep Reinforcement
Learning with Dynamic Depth Routing [26.44273671379482]
Multi-task reinforcement learning endeavors to accomplish a set of different tasks with a single policy.
This work presents a Dynamic Depth Routing (D2R) framework, which learns strategic skipping of certain intermediate modules, thereby flexibly choosing different numbers of modules for each task.
In addition, we design an automatic route-balancing mechanism to encourage continued routing exploration for unmastered tasks without disturbing the routing of mastered ones.
arXiv Detail & Related papers (2023-12-22T06:51:30Z) - MetaModulation: Learning Variational Feature Hierarchies for Few-Shot
Learning with Fewer Tasks [63.016244188951696]
We propose MetaModulation, a method for few-shot learning with fewer tasks.
We modify parameters at various batch levels to increase the number of meta-training tasks.
We also introduce learning variational feature hierarchies by incorporating variational modulation.
arXiv Detail & Related papers (2023-05-17T15:47:47Z) - Task Adaptive Parameter Sharing for Multi-Task Learning [114.80350786535952]
Task Adaptive Parameter Sharing (TAPS) is a method for tuning a base model to a new task by adaptively modifying a small, task-specific subset of layers.
Compared to other methods, TAPS retains high accuracy on downstream tasks while introducing few task-specific parameters.
We evaluate our method on a suite of fine-tuning tasks and architectures (ResNet, DenseNet, ViT) and show that it achieves state-of-the-art performance while being simple to implement.
arXiv Detail & Related papers (2022-03-30T23:16:07Z) - Multi-Task Learning with Sequence-Conditioned Transporter Networks [67.57293592529517]
We aim to solve multi-task learning through the lens of sequence-conditioning and weighted sampling.
First, we propose a new benchmark suite aimed at compositional tasks, MultiRavens, which allows defining custom task combinations.
Second, we propose a vision-based end-to-end system architecture, Sequence-Conditioned Transporter Networks, which augments Goal-Conditioned Transporter Networks with sequence-conditioning and weighted sampling.
arXiv Detail & Related papers (2021-09-15T21:19:11Z) - Small Towers Make Big Differences [59.243296878666285]
Multi-task learning aims at solving multiple machine learning tasks at the same time.
A good solution to a multi-task learning problem should be generalizable in addition to being Pareto optimal.
We propose a method of under-parameterized self-auxiliaries for multi-task models to achieve the best of both worlds.
arXiv Detail & Related papers (2020-08-13T10:45:31Z) - Dynamic Task Weighting Methods for Multi-task Networks in Autonomous
Driving Systems [10.625400639764734]
Deep multi-task networks are of particular interest for autonomous driving systems.
We propose a novel method combining evolutionary meta-learning and task-based selective backpropagation.
Our method outperforms state-of-the-art methods by a significant margin on a two-task application.
arXiv Detail & Related papers (2020-01-07T18:54:21Z)