muNet: Evolving Pretrained Deep Neural Networks into Scalable Auto-tuning Multitask Systems
- URL: http://arxiv.org/abs/2205.10937v2
- Date: Wed, 25 May 2022 12:49:04 GMT
- Title: muNet: Evolving Pretrained Deep Neural Networks into Scalable Auto-tuning Multitask Systems
- Authors: Andrea Gesmundo and Jeff Dean
- Abstract summary: Most uses of machine learning today involve training a model from scratch for a particular task, or starting with a model pretrained on a related task and then fine-tuning on a downstream task.
We propose a method that uses the layers of a pretrained deep neural network as building blocks to construct an ML system that can jointly solve an arbitrary number of tasks.
The resulting system can leverage cross-task knowledge transfer while being immune to common drawbacks of multitask approaches such as catastrophic forgetting, gradient interference, and negative transfer.
- Score: 4.675744559395732
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Most uses of machine learning today involve training a model from scratch for
a particular task, or sometimes starting with a model pretrained on a related
task and then fine-tuning on a downstream task. Both approaches offer limited
knowledge transfer between different tasks, require time-consuming human-driven
customization of individual tasks, and incur high computational costs, especially
when starting from randomly initialized models. We propose a method that uses the
layers of a pretrained deep neural network as building blocks to construct an
ML system that can jointly solve an arbitrary number of tasks. The resulting
system can leverage cross-task knowledge transfer while being immune to
common drawbacks of multitask approaches such as catastrophic forgetting,
gradient interference, and negative transfer. We define an evolutionary
approach designed to jointly select the prior knowledge relevant for each task,
choose the subset of the model parameters to train and dynamically auto-tune
its hyperparameters. Furthermore, a novel scale control method is employed to
achieve quality/size trade-offs that outperform common fine-tuning techniques.
Compared with standard fine-tuning on a benchmark of 10 diverse image
classification tasks, the proposed model improves the average accuracy by 2.39%
while using 47% fewer parameters per task.
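
To make the description above concrete, here is a minimal, hypothetical sketch of the kind of evolutionary loop the abstract outlines: candidates reuse pretrained layers as building blocks, and mutations decide which layers are cloned or trained and how hyperparameters are adjusted. All names (`Candidate`, `mutate`, `score`) and the scoring placeholder are illustrative assumptions, not the paper's actual algorithm or API.

```python
import copy
import random

# Stand-in for the layer stack of a real pretrained backbone.
PRETRAINED_LAYERS = [f"layer_{i}" for i in range(12)]

class Candidate:
    def __init__(self, task):
        self.task = task
        self.layers = list(PRETRAINED_LAYERS)                    # prior knowledge reused as-is
        self.trainable = {name: False for name in self.layers}   # everything frozen by default
        self.hparams = {"lr": 1e-3, "dropout": 0.1}

def mutate(parent):
    child = copy.deepcopy(parent)
    op = random.choice(["clone_layer", "toggle_training", "tune_hparam"])
    if op == "clone_layer":                          # add a task-specific copy of one layer
        i = random.randrange(len(child.layers))
        child.layers[i] = child.layers[i] + f"_clone_{child.task}"
        child.trainable[child.layers[i]] = True
    elif op == "toggle_training":                    # choose which parameter subset to train
        name = random.choice(child.layers)
        child.trainable[name] = not child.trainable.get(name, False)
    else:                                            # auto-tune a hyperparameter
        child.hparams["lr"] *= random.choice([0.5, 2.0])
    return child

def score(candidate):
    # Placeholder: in practice, train the trainable subset briefly and return
    # validation accuracy minus a penalty for added parameters (scale control).
    return random.random()

def evolve(task, generations=10, population=4):
    pool = [Candidate(task) for _ in range(population)]
    for _ in range(generations):
        pool = sorted(pool, key=score, reverse=True)[:population]  # keep the best candidates
        pool.append(mutate(pool[0]))                               # mutate the current best
    return max(pool, key=score)

best = evolve(task="cifar10")
print(best.hparams, sum(best.trainable.values()), "trainable layers")
```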
Related papers
- AdaMerging: Adaptive Model Merging for Multi-Task Learning [68.75885518081357]
This paper introduces an innovative technique called Adaptive Model Merging (AdaMerging).
It aims to autonomously learn the coefficients for model merging, either in a task-wise or layer-wise manner, without relying on the original training data.
Compared to the current state-of-the-art task arithmetic merging scheme, AdaMerging showcases a remarkable 11% improvement in performance.
arXiv Detail & Related papers (2023-10-04T04:26:33Z)
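
As a rough illustration of the AdaMerging entry above, the sketch below merges several fine-tuned models into one multitask model via per-task, per-layer coefficients applied to task vectors (fine-tuned minus pretrained weights). The coefficient values are hard-coded here; learning them without the original training data is the part AdaMerging automates. Function and variable names are assumptions, not the paper's API.

```python
import numpy as np

def merge(pretrained, finetuned, lam):
    """Layer-wise merging: w0 + sum_k lam[k][layer] * (w_k[layer] - w0)."""
    merged = {}
    for name, w0 in pretrained.items():
        # Task vector for task k = fine-tuned weights minus pretrained weights.
        update = sum(lam[k][name] * (fk[name] - w0) for k, fk in finetuned.items())
        merged[name] = w0 + update
    return merged

# Toy example with two "tasks" and two "layers".
pretrained = {"conv1": np.zeros(4), "fc": np.zeros(3)}
finetuned = {
    "task_a": {"conv1": np.ones(4), "fc": 2 * np.ones(3)},
    "task_b": {"conv1": -np.ones(4), "fc": np.ones(3)},
}
lam = {  # layer-wise merging coefficients, one per task and layer
    "task_a": {"conv1": 0.6, "fc": 0.3},
    "task_b": {"conv1": 0.2, "fc": 0.5},
}
print(merge(pretrained, finetuned, lam))
```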
- Multi-Objective Optimization for Sparse Deep Multi-Task Learning [0.0]
We present a Multi-Objective Optimization algorithm using a modified Weighted Chebyshev scalarization for training Deep Neural Networks (DNNs).
Our work aims to address the (economic and ecological) sustainability issue of DNN models, with a particular focus on Deep Multi-Task models.
arXiv Detail & Related papers (2023-08-23T16:42:27Z)
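
For reference, a weighted Chebyshev scalarization collapses a vector of objectives (for example, per-task losses plus a sparsity term) into a single scalar via the maximum weighted deviation from a reference point. The snippet below shows only the generic form; the paper's specific modification and training setup are not reproduced.

```python
import numpy as np

def weighted_chebyshev(losses, weights, reference=None):
    """Scalarize a vector of objectives: max_i w_i * |f_i - z_i|.

    `reference` (the ideal/utopia point z) defaults to zero; different weight
    vectors trace out different trade-off points on the Pareto front.
    """
    losses = np.asarray(losses, dtype=float)
    weights = np.asarray(weights, dtype=float)
    z = np.zeros_like(losses) if reference is None else np.asarray(reference, dtype=float)
    return np.max(weights * np.abs(losses - z))

# Example: two task losses plus an L1 sparsity penalty as a third objective.
objectives = [0.42, 0.31, 0.10]
print(weighted_chebyshev(objectives, weights=[0.4, 0.4, 0.2]))
```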
- An Evolutionary Approach to Dynamic Introduction of Tasks in Large-scale Multitask Learning Systems [4.675744559395732]
Multitask learning assumes that models capable of learning from multiple tasks can achieve better quality and efficiency via knowledge transfer.
State-of-the-art ML models rely on high customization for each task and leverage size and data scale rather than scaling the number of tasks.
We propose an evolutionary method that can generate a large scale multitask model and can support the dynamic and continuous addition of new tasks.
arXiv Detail & Related papers (2022-05-25T13:10:47Z)
- A Dirichlet Process Mixture of Robust Task Models for Scalable Lifelong Reinforcement Learning [11.076005074172516]
Reinforcement learning algorithms can easily encounter catastrophic forgetting or interference when faced with lifelong streaming information.
We propose a scalable lifelong RL method that dynamically expands the network capacity to accommodate new knowledge.
We show that our method successfully facilitates scalable lifelong RL and outperforms relevant existing methods.
arXiv Detail & Related papers (2022-05-22T09:48:41Z)
- Task Adaptive Parameter Sharing for Multi-Task Learning [114.80350786535952]
Task Adaptive Parameter Sharing (TAPS) is a method for tuning a base model to a new task by adaptively modifying a small, task-specific subset of layers.
Compared to other methods, TAPS retains high accuracy on downstream tasks while introducing few task-specific parameters.
We evaluate our method on a suite of fine-tuning tasks and architectures (ResNet, DenseNet, ViT) and show that it achieves state-of-the-art performance while being simple to implement.
arXiv Detail & Related papers (2022-03-30T23:16:07Z)
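
One hedged way to picture the TAPS entry above is a per-layer indicator that decides whether a layer keeps the shared base weights or adds a task-specific delta, so the per-task parameter count grows only with the adapted layers. The indicators below are hard-coded for illustration; in the paper they are learned.

```python
import numpy as np

# Frozen base weights shared across tasks, plus a candidate task-specific delta per layer.
base = {"layer1": np.ones((2, 2)), "layer2": np.ones((2, 2)), "head": np.ones((2, 2))}
delta = {name: 0.1 * np.random.randn(*w.shape) for name, w in base.items()}

# Binary per-layer indicators: 1 = use a task-specific copy, 0 = keep the shared layer.
adapt = {"layer1": 0, "layer2": 1, "head": 1}

def task_weights(base, delta, adapt):
    # Only layers with adapt[name] == 1 add task-specific parameters; the rest
    # stay shared, so extra storage per task scales with sum(adapt.values()).
    return {name: w + adapt[name] * delta[name] for name, w in base.items()}

weights = task_weights(base, delta, adapt)
print("task-specific layers:", [n for n, a in adapt.items() if a])
```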
- Controllable Dynamic Multi-Task Architectures [92.74372912009127]
We propose a controllable multi-task network that dynamically adjusts its architecture and weights to match the desired task preference as well as the resource constraints.
We propose a disentangled training of two hypernetworks, by exploiting task affinity and a novel branching regularized loss, to take input preferences and accordingly predict tree-structured models with adapted weights.
arXiv Detail & Related papers (2022-03-28T17:56:40Z)
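
A bare-bones way to illustrate the hypernetwork idea in the entry above: a single linear map takes a task-preference vector plus a resource budget and emits the weights of a small head. This ignores the tree-structured branching and the disentangled two-hypernetwork training described in the summary; all shapes and names are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy hypernetwork: linear map from (preferences, budget) to flattened head weights.
PREF_DIM, HEAD_IN, HEAD_OUT = 3, 8, 4
H = rng.normal(size=(HEAD_IN * HEAD_OUT, PREF_DIM + 1))  # hypernetwork parameters

def predict_head(preference, budget):
    ctrl = np.concatenate([preference, [budget]])    # task preferences + resource budget
    return (H @ ctrl).reshape(HEAD_OUT, HEAD_IN)     # head weights adapted to the controls

# Favor task 0 under a tight budget vs. task 2 under a loose budget.
w_a = predict_head(np.array([0.8, 0.1, 0.1]), budget=0.3)
w_b = predict_head(np.array([0.1, 0.1, 0.8]), budget=1.0)
print(w_a.shape, np.abs(w_a - w_b).mean())
```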
- Combining Modular Skills in Multitask Learning [149.8001096811708]
A modular design encourages neural models to disentangle and recombine different facets of knowledge to generalise more systematically to new tasks.
In this work, we assume each task is associated with a subset of latent discrete skills from a (potentially small) inventory.
We find that the modular design of a network significantly increases sample efficiency in reinforcement learning and few-shot generalisation in supervised learning.
arXiv Detail & Related papers (2022-02-28T16:07:19Z)
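
The modular-skills entry above associates each task with a subset of skills from a shared inventory; a simple rendering is a binary task-skill allocation matrix whose selected modules are combined into task-specific parameters. The fixed allocation and the averaging rule below are illustrative choices, whereas the paper learns the allocation jointly with the skill modules.

```python
import numpy as np

rng = np.random.default_rng(1)

# Shared inventory of skill modules (here: one weight matrix per skill).
N_SKILLS, DIM = 4, 6
skills = rng.normal(size=(N_SKILLS, DIM, DIM))

# Binary task-skill allocation matrix: row = task, column = skill.
allocation = np.array([
    [1, 1, 0, 0],   # task 0 uses skills 0 and 1
    [0, 1, 1, 0],   # task 1 shares skill 1 with task 0
    [0, 0, 1, 1],   # task 2 shares skill 2 with task 1
])

def task_parameters(task_id):
    mask = allocation[task_id].astype(bool)
    # Combine the selected skill modules (averaging is one simple choice).
    return skills[mask].mean(axis=0)

for t in range(allocation.shape[0]):
    print("task", t, "uses skills", np.flatnonzero(allocation[t]), task_parameters(t).shape)
```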
- Piggyback GAN: Efficient Lifelong Learning for Image Conditioned Generation [48.31909975623379]
We propose a parameter-efficient framework, Piggyback GAN, which learns the current task by building a set of convolutional and deconvolutional filters.
For the current task, our model achieves generation quality on par with a standalone model while using fewer parameters.
We validate Piggyback GAN on various image-conditioned generation tasks across different domains.
arXiv Detail & Related papers (2021-04-24T12:09:52Z)
- Reparameterizing Convolutions for Incremental Multi-Task Learning without Task Interference [75.95287293847697]
Two common challenges in developing multi-task models are often overlooked in the literature.
First, enabling the model to be inherently incremental, continuously incorporating information from new tasks without forgetting the previously learned ones (incremental learning).
Second, eliminating adverse interactions amongst tasks, which has been shown to significantly degrade single-task performance in a multi-task setup (task interference).
arXiv Detail & Related papers (2020-07-24T14:44:46Z)
- Dynamic Task Weighting Methods for Multi-task Networks in Autonomous Driving Systems [10.625400639764734]
Deep multi-task networks are of particular interest for autonomous driving systems.
We propose a novel method combining evolutionary meta-learning and task-based selective backpropagation.
Our method outperforms state-of-the-art methods by a significant margin on a two-task application.
arXiv Detail & Related papers (2020-01-07T18:54:21Z)
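
As a hedged sketch of the evolutionary half of the method in the last entry, the loop below searches over per-task loss weights and keeps the weight vectors that score best on a validation proxy. The task names and the scoring placeholder are assumptions, and the task-based selective-backpropagation component is not modeled.

```python
import random

TASKS = ["detection", "segmentation"]

def validation_score(weights):
    # Placeholder for "train briefly with these per-task loss weights, then
    # evaluate on a validation set"; here it simply rewards balanced weights.
    return 1.0 - abs(weights["detection"] - weights["segmentation"])

def evolve_task_weights(generations=20, population=6):
    pool = [{t: random.random() for t in TASKS} for _ in range(population)]
    for _ in range(generations):
        parent = max(pool, key=validation_score)                      # select a promising parent
        child = {t: max(0.0, w + random.gauss(0, 0.1)) for t, w in parent.items()}
        pool.append(child)                                            # add the mutated child
        pool = sorted(pool, key=validation_score, reverse=True)[:population]
    return max(pool, key=validation_score)

print(evolve_task_weights())
```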
This list is automatically generated from the titles and abstracts of the papers on this site.