Compositional Models: Multi-Task Learning and Knowledge Transfer with
Modular Networks
- URL: http://arxiv.org/abs/2107.10963v1
- Date: Fri, 23 Jul 2021 00:05:55 GMT
- Title: Compositional Models: Multi-Task Learning and Knowledge Transfer with
Modular Networks
- Authors: Andrey Zhmoginov, Dina Bashkirova and Mark Sandler
- Abstract summary: We propose a new approach for learning modular networks based on the isometric version of ResNet.
In our method, the modules can be invoked repeatedly and allow knowledge transfer to novel tasks.
We show that our method leads to interpretable self-organization of modules in the case of multi-task learning, transfer learning, and domain adaptation.
- Score: 13.308477955656592
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Conditional computation and modular networks have been recently proposed for
multitask learning and other problems as a way to decompose problem solving
into multiple reusable computational blocks. We propose a new approach for
learning modular networks based on the isometric version of ResNet with all
residual blocks having the same configuration and the same number of
parameters. This architectural choice allows adding, removing and changing the
order of residual blocks. In our method, the modules can be invoked repeatedly
and allow knowledge transfer to novel tasks by adjusting the order of
computation. This allows soft weight sharing between tasks with only a small
increase in the number of parameters. We show that our method leads to
interpretable self-organization of modules in the case of multi-task learning,
transfer learning, and domain adaptation, while achieving competitive results on
those tasks. From a practical perspective, our approach allows us to: (a) reuse
existing modules for learning a new task by adjusting the computation order, (b)
use it for unsupervised multi-source domain adaptation, illustrating that
adaptation to unseen data can be achieved by only manipulating the order of
pretrained modules, and (c) increase the accuracy of existing architectures for
image classification tasks such as ImageNet, without any parameter increase, by
reusing the same block multiple times.
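To make the idea above concrete, here is a minimal, hypothetical PyTorch sketch (not the authors' code): a pool of identically shaped residual blocks is shared across tasks, and tasks differ only in the order, and possible repetition, of the blocks they apply. In the paper the per-task computation order is learned; here it is passed in explicitly, and all class, function, and variable names are illustrative assumptions.

```python
# Minimal sketch (not the authors' implementation): a shared pool of
# identically shaped residual blocks applied in a task-specific order.
import torch
import torch.nn as nn


class IsoResBlock(nn.Module):
    """Residual block with identical input/output shape, so blocks are interchangeable."""

    def __init__(self, channels: int):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.BatchNorm2d(channels),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.BatchNorm2d(channels),
        )

    def forward(self, x):
        return torch.relu(x + self.body(x))


class ModularNet(nn.Module):
    """Shared block pool; each task supplies an ordering (repeats allowed)."""

    def __init__(self, channels: int = 64, num_blocks: int = 8, num_classes: int = 10):
        super().__init__()
        self.stem = nn.Conv2d(3, channels, 3, padding=1)
        self.blocks = nn.ModuleList([IsoResBlock(channels) for _ in range(num_blocks)])
        self.head = nn.Linear(channels, num_classes)

    def forward(self, x, order):
        # `order` is a per-task sequence of block indices; repeating an index
        # reuses the same block (and its parameters) multiple times.
        h = self.stem(x)
        for idx in order:
            h = self.blocks[idx](h)
        h = h.mean(dim=(2, 3))  # global average pooling
        return self.head(h)


# Example: two tasks share all block weights but use different orders;
# the second order also invokes block 2 twice (no extra parameters).
net = ModularNet()
x = torch.randn(4, 3, 32, 32)
logits_task_a = net(x, order=[0, 1, 2, 3, 4, 5, 6, 7])
logits_task_b = net(x, order=[3, 0, 2, 2, 5, 7, 1, 6])
```

Because every block has the same input and output shape, any permutation or repetition of block indices yields a valid network, which is what makes reordering a vehicle for transferring knowledge to new tasks and domains with only a small parameter overhead.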
Related papers
- Multi-Domain Learning with Modulation Adapters [33.54630534228469]
Multi-domain learning aims to handle related tasks, such as image classification across multiple domains, simultaneously.
Modulation Adapters update the convolutional weights of the model in a multiplicative manner for each task.
Our approach yields excellent results, with accuracies that are comparable to or better than those of existing state-of-the-art approaches. (A hedged sketch of this multiplicative modulation idea appears after the related-papers list below.)
arXiv Detail & Related papers (2023-07-17T14:40:16Z)
- FedYolo: Augmenting Federated Learning with Pretrained Transformers [61.56476056444933]
In this work, we investigate pretrained transformers (PTF) to achieve on-device learning goals.
We show that larger scale shrinks the accuracy gaps between alternative approaches and improves robustness.
Finally, it enables clients to solve multiple unrelated tasks simultaneously using a single PTF.
arXiv Detail & Related papers (2023-07-10T21:08:52Z)
- MetaModulation: Learning Variational Feature Hierarchies for Few-Shot Learning with Fewer Tasks [63.016244188951696]
We propose MetaModulation, a method for few-shot learning with fewer tasks.
We modulate parameters at various batch levels to increase the number of meta-training tasks.
We also introduce learning variational feature hierarchies by incorporating variational modulation.
arXiv Detail & Related papers (2023-05-17T15:47:47Z)
- Modular Deep Learning [120.36599591042908]
Transfer learning has recently become the dominant paradigm of machine learning.
It remains unclear how to develop models that specialise towards multiple tasks without incurring negative interference.
Modular deep learning has emerged as a promising solution to these challenges.
arXiv Detail & Related papers (2023-02-22T18:11:25Z)
- AMS-Net: Adaptive Multiscale Sparse Neural Network with Interpretable Basis Expansion for Multiphase Flow Problems [8.991619150027267]
We propose an adaptive sparse learning algorithm that can be applied to learn the physical processes and obtain a sparse representation of the solution given a large snapshot space.
The information of the basis functions is incorporated into the loss function, which minimizes the differences between the downscaled reduced-order solutions and reference solutions at multiple time steps.
Numerical tests on two-phase multiscale flow problems show the capability and interpretability of the proposed method on complicated applications.
arXiv Detail & Related papers (2022-07-24T13:12:43Z)
- Task Adaptive Parameter Sharing for Multi-Task Learning [114.80350786535952]
Task Adaptive Parameter Sharing (TAPS) is a method for tuning a base model to a new task by adaptively modifying a small, task-specific subset of layers.
Compared to other methods, TAPS retains high accuracy on downstream tasks while introducing few task-specific parameters.
We evaluate our method on a suite of fine-tuning tasks and architectures (ResNet, DenseNet, ViT) and show that it achieves state-of-the-art performance while being simple to implement.
arXiv Detail & Related papers (2022-03-30T23:16:07Z)
- Memory Efficient Adaptive Attention For Multiple Domain Learning [3.8907870897999355]
Training CNNs from scratch on new domains typically demands large numbers of labeled images and substantial computation.
One way to reduce these requirements is to modularize the CNN architecture and freeze the weights of the heavier modules.
Recent studies have proposed alternative modular architectures and schemes that lead to a reduction in the number of trainable parameters needed.
arXiv Detail & Related papers (2021-10-21T08:33:29Z)
- Multi-Task Learning with Sequence-Conditioned Transporter Networks [67.57293592529517]
We aim to solve multi-task learning through the lens of sequence-conditioning and weighted sampling.
We propose MultiRavens, a new benchmark suite aimed at compositional tasks, which allows defining custom task combinations.
We also propose a vision-based end-to-end system architecture, Sequence-Conditioned Transporter Networks, which augments Goal-Conditioned Transporter Networks with sequence-conditioning and weighted sampling.
arXiv Detail & Related papers (2021-09-15T21:19:11Z)
- Neural Function Modules with Sparse Arguments: A Dynamic Approach to Integrating Information across Layers [84.57980167400513]
Most prior work on feed-forward networks that combine top-down and bottom-up feedback is limited to classification problems.
Neural Function Modules (NFM) aims to bring this structural capability of combining top-down and bottom-up information to deep learning more broadly.
The key contribution of our work is to combine attention, sparsity, and top-down and bottom-up feedback in a flexible algorithm.
arXiv Detail & Related papers (2020-10-15T20:43:17Z)
- Multi-Task Reinforcement Learning with Soft Modularization [25.724764855681137]
Multi-task learning is a very challenging problem in reinforcement learning.
We introduce an explicit modularization technique on policy representation to alleviate this optimization issue.
We show our method improves both sample efficiency and performance over strong baselines by a large margin.
arXiv Detail & Related papers (2020-03-30T17:47:04Z)
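As referenced in the Modulation Adapters entry above, the following is a minimal, hypothetical sketch of per-task multiplicative modulation of shared convolutional weights. It is illustrative only and does not reproduce that paper's exact adapter design; the class and parameter names are assumptions.

```python
# Hypothetical sketch of per-task multiplicative weight modulation
# (illustrative only; not the Modulation Adapters paper's exact design).
import torch
import torch.nn as nn
import torch.nn.functional as F


class ModulatedConv2d(nn.Module):
    """Conv layer whose shared weights are scaled per task by learned multipliers."""

    def __init__(self, in_ch: int, out_ch: int, num_tasks: int, kernel_size: int = 3):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(out_ch, in_ch, kernel_size, kernel_size) * 0.02)
        # One multiplicative modulation tensor per task, applied channel-wise.
        self.task_scale = nn.Parameter(torch.ones(num_tasks, out_ch, in_ch, 1, 1))

    def forward(self, x, task_id: int):
        # Shared weights times task-specific multipliers: only the small
        # task_scale tensor is task-specific; the base weights are shared.
        w = self.weight * self.task_scale[task_id]
        return F.conv2d(x, w, padding=1)


conv = ModulatedConv2d(in_ch=3, out_ch=16, num_tasks=4)
x = torch.randn(2, 3, 32, 32)
y_task0 = conv(x, task_id=0)  # same input, different effective weights per task
y_task3 = conv(x, task_id=3)
```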
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the information above and is not responsible for any consequences of its use.