Modular Networks Prevent Catastrophic Interference in Model-Based
Multi-Task Reinforcement Learning
- URL: http://arxiv.org/abs/2111.08010v1
- Date: Mon, 15 Nov 2021 12:31:31 GMT
- Authors: Robin Schiewer and Laurenz Wiskott
- Abstract summary: We study whether model-based multi-task reinforcement learning benefits from shared dynamics models in a similar way as model-free methods do from shared policy networks.
Using a single dynamics model, we see clear evidence of task confusion and reduced performance.
As a remedy, enforcing an internal structure for the learned dynamics model by training isolated sub-networks for each task notably improves performance.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In a multi-task reinforcement learning setting, the learner commonly benefits
from training on multiple related tasks by exploiting similarities among them.
At the same time, the trained agent is able to solve a wider range of different
problems. While this effect is well documented for model-free multi-task
methods, we demonstrate a detrimental effect when using a single learned
dynamics model for multiple tasks. Thus, we address the fundamental question of
whether model-based multi-task reinforcement learning benefits from shared
dynamics models in a similar way as model-free methods do from shared policy
networks. Using a single dynamics model, we see clear evidence of task
confusion and reduced performance. As a remedy, enforcing an internal structure
for the learned dynamics model by training isolated sub-networks for each task
notably improves performance while using the same number of parameters. We
illustrate our findings by comparing both methods on a simple gridworld and a
more complex VizDoom multi-task experiment.
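The remedy described in the abstract can be pictured as one dynamics model whose parameter budget is partitioned into isolated per-task sub-networks, so that training on one task never touches another task's weights. The following is a minimal sketch of that idea; the two-layer MLP, the residual next-state prediction, and all layer sizes are illustrative assumptions, not the paper's exact architecture.

```python
import numpy as np

class ModularDynamicsModel:
    """One dynamics model split into isolated per-task sub-networks.

    The task id routes each input to its own small MLP, so gradient
    updates for one task cannot interfere with another task's weights.
    """

    def __init__(self, n_tasks, state_dim, action_dim, hidden, seed=0):
        rng = np.random.default_rng(seed)
        in_dim = state_dim + action_dim
        # One isolated (W1, b1, W2, b2) parameter set per task.
        self.subnets = [
            (rng.standard_normal((in_dim, hidden)) * 0.1,
             np.zeros(hidden),
             rng.standard_normal((hidden, state_dim)) * 0.1,
             np.zeros(state_dim))
            for _ in range(n_tasks)
        ]

    def predict(self, task_id, state, action):
        """Predict the next state using only task_id's sub-network."""
        W1, b1, W2, b2 = self.subnets[task_id]
        x = np.concatenate([state, action])
        h = np.tanh(x @ W1 + b1)
        return state + h @ W2 + b2  # residual next-state prediction
```

Because the sub-networks share no weights, zeroing or updating task 0's parameters leaves task 1's predictions bit-for-bit unchanged, which is exactly the isolation property the paper credits with preventing task confusion.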
Related papers
- LIMT: Language-Informed Multi-Task Visual World Models [6.128332310539627]
Multi-task reinforcement learning can be very challenging due to the increased sample complexity and the potentially conflicting task objectives.
We propose a method for learning multi-task visual world models, leveraging pre-trained language models to extract semantically meaningful task representations.
Our results highlight the benefits of using language-driven task representations for world models and a clear advantage of model-based multi-task learning over the more common model-free paradigm.
arXiv Detail & Related papers (2024-07-18T12:40:58Z) - Merging Multi-Task Models via Weight-Ensembling Mixture of Experts [64.94129594112557]
Merging Transformer-based models trained on different tasks into a single unified model can execute all the tasks concurrently.
Previous methods, exemplified by task arithmetic, have been proven to be both effective and scalable.
We propose to merge most of the parameters while upscaling the Transformer layers to a weight-ensembling mixture of experts (MoE) module.
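The core idea of a weight-ensembling MoE layer can be sketched in a few lines: instead of collapsing several task-specific weight matrices into one fixed merged matrix, keep them all and form a convex combination at forward time. The fixed `mix` coefficients here stand in for the learned router of the actual method; that substitution, and all shapes, are assumptions of this sketch.

```python
import numpy as np

def weight_ensembling_layer(x, task_weights, mix):
    """Toy weight-ensembling layer.

    task_weights: list of per-task weight matrices (same shape).
    mix: combination coefficients, standing in for a learned router.
    The effective weight is built per call rather than merged once.
    """
    W = sum(m * Wt for m, Wt in zip(mix, task_weights))
    return x @ W
```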
arXiv Detail & Related papers (2024-02-01T08:58:57Z) - Concrete Subspace Learning based Interference Elimination for Multi-task
Model Fusion [86.6191592951269]
Merging models fine-tuned from common extensively pretrained large model but specialized for different tasks has been demonstrated as a cheap and scalable strategy to construct a multitask model that performs well across diverse tasks.
We propose the Concrete (CONtinuous relaxation of disCRETE) subspace learning method to identify a common low-dimensional subspace and utilize its shared information to tackle the interference problem without sacrificing performance.
arXiv Detail & Related papers (2023-12-11T07:24:54Z) - DEPHN: Different Expression Parallel Heterogeneous Network using virtual
gradient optimization for Multi-task Learning [1.0705399532413615]
Recommendation system algorithm based on multi-task learning (MTL) is the major method for Internet operators to understand users and predict their behaviors.
Traditional models use shared-bottom models and gating experts to realize shared representation learning and information differentiation.
We propose a Different Expression Parallel Heterogeneous Network (DEPHN) to model multiple tasks simultaneously.
arXiv Detail & Related papers (2023-07-24T04:29:00Z) - Sparsely Activated Mixture-of-Experts are Robust Multi-Task Learners [67.5865966762559]
We study whether sparsely activated Mixture-of-Experts (MoE) improve multi-task learning.
We devise task-aware gating functions to route examples from different tasks to specialized experts.
This results in a sparsely activated multi-task model with a large number of parameters, but with the same computational cost as that of a dense model.
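The task-aware routing described above can be illustrated with a toy top-1 scheme: the task id selects which expert runs, so total parameters grow with the number of experts while per-example compute stays at one expert. The dictionary-style gate and linear experts are simplifying assumptions of this sketch, not the paper's architecture.

```python
import numpy as np

def task_aware_top1_moe(x, task_ids, experts, gate_logits):
    """Toy task-aware top-1 MoE routing.

    experts: list of expert weight matrices (same shape).
    gate_logits: per-task logits over experts; each example runs only
    the single expert selected by its task id, so compute matches a
    dense model of one expert's size.
    """
    out = np.empty((x.shape[0], experts[0].shape[1]))
    for i, (xi, t) in enumerate(zip(x, task_ids)):
        e = int(np.argmax(gate_logits[t]))  # task-conditioned expert choice
        out[i] = xi @ experts[e]            # only that expert is evaluated
    return out
```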
arXiv Detail & Related papers (2022-04-16T00:56:12Z) - Task Adaptive Parameter Sharing for Multi-Task Learning [114.80350786535952]
Task Adaptive Parameter Sharing (TAPS) is a method for tuning a base model to a new task by adaptively modifying a small, task-specific subset of layers.
Compared to other methods, TAPS retains high accuracy on downstream tasks while introducing few task-specific parameters.
We evaluate our method on a suite of fine-tuning tasks and architectures (ResNet, DenseNet, ViT) and show that it achieves state-of-the-art performance while being simple to implement.
arXiv Detail & Related papers (2022-03-30T23:16:07Z) - The Effect of Diversity in Meta-Learning [79.56118674435844]
Few-shot learning aims to learn representations that can tackle novel tasks given a small number of examples.
Recent studies show that task distribution plays a vital role in the model's performance.
We study different task distributions on a myriad of models and datasets to evaluate the effect of task diversity on meta-learning algorithms.
arXiv Detail & Related papers (2022-01-27T19:39:07Z) - Rethinking Hard-Parameter Sharing in Multi-Task Learning [20.792654758645302]
Hard parameter sharing in multi-task learning (MTL) allows tasks to share some of the model parameters, reducing storage cost and improving prediction accuracy.
The common sharing practice is to share bottom layers of a deep neural network among tasks while using separate top layers for each task.
Using separate bottom-layer parameters could achieve significantly better performance than the common practice.
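The two sharing schemes contrasted above can be sketched side by side: the common practice (shared bottom, per-task heads) versus the alternative the paper studies (per-task bottoms, shared top). The single tanh layer and all shapes are illustrative assumptions.

```python
import numpy as np

def shared_bottom(x, bottom, heads, task_id):
    """Common practice: one shared bottom layer, a separate head per task."""
    return np.tanh(x @ bottom) @ heads[task_id]

def separate_bottom(x, bottoms, top, task_id):
    """Alternative studied above: task-specific bottoms, one shared top."""
    return np.tanh(x @ bottoms[task_id]) @ top
```

Both variants use one shared and one task-specific linear map; they differ only in which level of the network the task-specific parameters occupy.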
arXiv Detail & Related papers (2021-07-23T17:26:40Z) - Reparameterizing Convolutions for Incremental Multi-Task Learning
without Task Interference [75.95287293847697]
Two common challenges in developing multi-task models are often overlooked in the literature.
First, enabling the model to be inherently incremental, continuously incorporating information from new tasks without forgetting the previously learned ones (incremental learning).
Second, eliminating adverse interactions amongst tasks, which have been shown to significantly degrade single-task performance in a multi-task setup (task interference).
arXiv Detail & Related papers (2020-07-24T14:44:46Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this list (including all information) and is not responsible for any consequences.