One Network Fits All? Modular versus Monolithic Task Formulations in
Neural Networks
- URL: http://arxiv.org/abs/2103.15261v1
- Date: Mon, 29 Mar 2021 01:16:42 GMT
- Title: One Network Fits All? Modular versus Monolithic Task Formulations in
Neural Networks
- Authors: Atish Agarwala, Abhimanyu Das, Brendan Juba, Rina Panigrahy, Vatsal
Sharan, Xin Wang, Qiuyi Zhang
- Abstract summary: We show that a single neural network is capable of simultaneously learning multiple tasks from a combined data set.
We study how the complexity of learning such combined tasks grows with the complexity of the task codes.
- Score: 36.07011014271394
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Can deep learning solve multiple tasks simultaneously, even when they are
unrelated and very different? We investigate how the representations of the
underlying tasks affect the ability of a single neural network to learn them
jointly. We present theoretical and empirical findings that a single neural
network is capable of simultaneously learning multiple tasks from a combined
data set, for a variety of methods for representing tasks -- for example, when
the distinct tasks are encoded by well-separated clusters or decision trees
over certain task-code attributes. More concretely, we present a novel analysis
that shows that families of simple programming-like constructs for the codes
encoding the tasks are learnable by two-layer neural networks with standard
training. We study more generally how the complexity of learning such combined
tasks grows with the complexity of the task codes; we find that combining many
tasks may incur a sample complexity penalty, even though the individual tasks
are easy to learn. We provide empirical support for the usefulness of the
learning bounds by training networks on clusters, decision trees, and SQL-style
aggregation.
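As a rough illustration of this setting (not the paper's exact construction), the sketch below pools data from several unrelated tasks, appends a well-separated task code to each input, and trains a single two-layer ReLU network with standard SGD; the toy linear-threshold tasks, dimensions, and hyperparameters are all illustrative assumptions.
```python
# Minimal sketch: one two-layer network learns several unrelated tasks from a
# combined data set, where each example carries a well-separated task code.
# The toy tasks, dimensions, and hyperparameters are illustrative assumptions,
# not the paper's exact construction.
import torch
import torch.nn as nn

torch.manual_seed(0)
d_in, d_code, n_tasks, n_per_task = 10, 8, 4, 2000

# Well-separated task codes: random +/-1 vectors act as cluster centers.
codes = torch.sign(torch.randn(n_tasks, d_code))
# Each task is a different random linear-threshold function of the input.
task_weights = torch.randn(n_tasks, d_in)

xs, ys = [], []
for t in range(n_tasks):
    x = torch.randn(n_per_task, d_in)
    y = (x @ task_weights[t] > 0).float()        # task-specific labels
    code = codes[t].expand(n_per_task, d_code)   # append this task's code
    xs.append(torch.cat([x, code], dim=1))
    ys.append(y)
X, Y = torch.cat(xs), torch.cat(ys)

# A single two-layer network trained on the pooled data with standard SGD.
net = nn.Sequential(nn.Linear(d_in + d_code, 256), nn.ReLU(), nn.Linear(256, 1))
opt = torch.optim.SGD(net.parameters(), lr=0.1)
loss_fn = nn.BCEWithLogitsLoss()

for step in range(2000):
    idx = torch.randint(0, X.shape[0], (128,))
    loss = loss_fn(net(X[idx]).squeeze(-1), Y[idx])
    opt.zero_grad(); loss.backward(); opt.step()

with torch.no_grad():
    acc = ((net(X).squeeze(-1) > 0).float() == Y).float().mean()
print(f"combined-task training accuracy: {acc:.3f}")
```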
Related papers
- OmniVec: Learning robust representations with cross modal sharing [28.023214572340336]
We present an approach to learn multiple tasks, in multiple modalities, with a unified architecture.
The proposed network is composed of task-specific encoders, a common trunk in the middle, and task-specific prediction heads.
We train the network on all major modalities, e.g. visual, audio, text and 3D, and report results on 22 diverse and challenging public benchmarks.
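A rough sketch of the encoder, shared trunk, and per-task head pattern described above; the module widths, modality names, and linear layers are placeholders rather than OmniVec's actual configuration.
```python
# Rough sketch of the pattern above: per-modality encoders feed a common trunk,
# which feeds per-task prediction heads. Widths, modality names, and the linear
# modules are placeholders, not OmniVec's actual configuration.
import torch
import torch.nn as nn

class ModularMultiTaskNet(nn.Module):
    def __init__(self, input_dims, num_classes, trunk_dim=256):
        super().__init__()
        # One encoder per modality, mapping raw input to a shared width.
        self.encoders = nn.ModuleDict({
            name: nn.Sequential(nn.Linear(d, trunk_dim), nn.ReLU())
            for name, d in input_dims.items()
        })
        # Common trunk shared by all tasks.
        self.trunk = nn.Sequential(nn.Linear(trunk_dim, trunk_dim), nn.ReLU())
        # One prediction head per task.
        self.heads = nn.ModuleDict({
            name: nn.Linear(trunk_dim, c) for name, c in num_classes.items()
        })

    def forward(self, x, task):
        return self.heads[task](self.trunk(self.encoders[task](x)))

# Usage with made-up dimensions for two modalities/tasks.
net = ModularMultiTaskNet({"image": 784, "audio": 128},
                          {"image": 10, "audio": 5})
logits = net(torch.randn(4, 784), task="image")   # shape (4, 10)
```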
arXiv Detail & Related papers (2023-11-07T14:00:09Z)
- Fast Inference and Transfer of Compositional Task Structures for Few-shot Task Generalization [101.72755769194677]
We formulate few-shot task generalization as a reinforcement learning problem in which each task is characterized by a subtask graph.
Our multi-task subtask graph inferencer (MTSGI) first infers the common high-level task structure in terms of the subtask graph from the training tasks.
Our experimental results on 2D grid-world and complex web navigation domains show that the proposed method can learn and leverage the common underlying structure of the tasks for faster adaptation to unseen tasks.
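As a generic illustration only (not MTSGI's actual inference procedure), a subtask graph can be represented as subtask nodes with precondition edges and executed in a topological order, as in the hypothetical example below.
```python
# Generic illustration of a subtask graph: nodes are subtasks, directed edges
# are preconditions, and a topological order respects the structure.
# This is a toy representation only, not MTSGI's graph inference or policy.
from collections import deque

def topological_order(preconditions):
    """preconditions: dict mapping each subtask to the subtasks it depends on."""
    indeg = {s: len(deps) for s, deps in preconditions.items()}
    children = {s: [] for s in preconditions}
    for s, deps in preconditions.items():
        for d in deps:
            children[d].append(s)
    ready = deque(s for s, k in indeg.items() if k == 0)
    order = []
    while ready:
        s = ready.popleft()
        order.append(s)
        for c in children[s]:
            indeg[c] -= 1
            if indeg[c] == 0:
                ready.append(c)
    return order

# Hypothetical grid-world style task structure.
graph = {"get_axe": [], "get_wood": ["get_axe"], "build_bridge": ["get_wood"]}
print(topological_order(graph))   # ['get_axe', 'get_wood', 'build_bridge']
```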
arXiv Detail & Related papers (2022-05-25T10:44:25Z)
- Sparsely Activated Mixture-of-Experts are Robust Multi-Task Learners [67.5865966762559]
We study whether sparsely activated Mixture-of-Experts (MoE) improve multi-task learning.
We devise task-aware gating functions to route examples from different tasks to specialized experts.
This results in a sparsely activated multi-task model with a large number of parameters, but with the same computational cost as that of a dense model.
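A simplified sketch of task-aware, sparsely activated routing, assuming a top-1 gate conditioned on a learned task embedding; the expert count, gate form, and routing rule are illustrative rather than the paper's exact design.
```python
# Simplified sketch of task-aware, sparsely activated expert routing: a gate
# conditioned on a learned task embedding sends each example to its top-1
# expert, so compute stays close to that of a single dense expert.
# Expert count, gate form, and top-1 routing are illustrative assumptions.
import torch
import torch.nn as nn

class TaskAwareMoE(nn.Module):
    def __init__(self, d_model=64, n_experts=4, n_tasks=3):
        super().__init__()
        self.experts = nn.ModuleList([
            nn.Sequential(nn.Linear(d_model, d_model), nn.ReLU(),
                          nn.Linear(d_model, d_model))
            for _ in range(n_experts)
        ])
        self.task_emb = nn.Embedding(n_tasks, d_model)
        self.gate = nn.Linear(2 * d_model, n_experts)

    def forward(self, x, task_id):
        task = self.task_emb(task_id)                     # (batch, d_model)
        scores = self.gate(torch.cat([x, task], dim=-1))  # (batch, n_experts)
        top1 = scores.argmax(dim=-1)                      # one expert per example
        out = torch.zeros_like(x)
        for e, expert in enumerate(self.experts):
            mask = top1 == e
            if mask.any():
                out[mask] = expert(x[mask])
        return out

moe = TaskAwareMoE()
y = moe(torch.randn(8, 64), torch.randint(0, 3, (8,)))    # shape (8, 64)
```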
arXiv Detail & Related papers (2022-04-16T00:56:12Z)
- PaRT: Parallel Learning Towards Robust and Transparent AI [4.160969852186451]
This paper takes a parallel learning approach for robust and transparent AI.
A deep neural network is trained in parallel on multiple tasks, where each task is trained only on a subset of the network resources.
We show that, through shared representations, the network does indeed reuse knowledge learned in some tasks when solving others.
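A toy sketch of the idea that each task trains on only a subset of the network's resources, implemented here as fixed binary masks over a shared hidden layer whose overlaps give shared representations; the masking scheme and sizes are assumptions, not PaRT's actual mechanism.
```python
# Toy sketch: one shared network, but each task only uses a fixed random subset
# of the hidden units (a binary mask); overlapping subsets yield shared
# representations. The masking scheme is an assumption, not PaRT's mechanism.
import torch
import torch.nn as nn

torch.manual_seed(0)
d_in, hidden, n_tasks = 20, 128, 3

# Each task keeps a random ~50% subset of hidden units; subsets overlap.
masks = (torch.rand(n_tasks, hidden) < 0.5).float()

shared = nn.Sequential(nn.Linear(d_in, hidden), nn.ReLU())
heads = nn.ModuleList([nn.Linear(hidden, 1) for _ in range(n_tasks)])

def forward(x, task_id):
    h = shared(x) * masks[task_id]     # only this task's subset stays active
    return heads[task_id](h)

out = forward(torch.randn(4, d_in), task_id=1)   # shape (4, 1)
```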
arXiv Detail & Related papers (2022-01-24T09:03:28Z)
- Learning Multi-Tasks with Inconsistent Labels by using Auxiliary Big Task [24.618094251341958]
Multi-task learning aims to improve model performance by transferring and exploiting common knowledge among tasks.
We propose a framework that learns these tasks by leveraging abundant information from a learnt auxiliary big task with sufficiently many classes to cover those of all the individual tasks.
Our experimental results demonstrate its effectiveness in comparison with the state-of-the-art approaches.
arXiv Detail & Related papers (2022-01-07T02:46:47Z)
- Multi-Task Neural Processes [105.22406384964144]
We develop multi-task neural processes, a new variant of neural processes for multi-task learning.
In particular, we propose to explore transferable knowledge from related tasks in the function space to provide inductive bias for improving each individual task.
Results demonstrate the effectiveness of multi-task neural processes in transferring useful knowledge among tasks for multi-task learning.
arXiv Detail & Related papers (2021-11-10T17:27:46Z)
- On the relationship between disentanglement and multi-task learning [62.997667081978825]
We take a closer look at the relationship between disentanglement and multi-task learning based on hard parameter sharing.
We show that disentanglement appears naturally during the process of multi-task neural network training.
arXiv Detail & Related papers (2021-10-07T14:35:34Z)
- Multi-Task Learning with Sequence-Conditioned Transporter Networks [67.57293592529517]
We aim to solve multi-task learning through the lens of sequence-conditioning and weighted sampling.
We first propose a new benchmark suite aimed at compositional tasks, MultiRavens, which allows defining custom task combinations.
Second, we propose a vision-based end-to-end system architecture, Sequence-Conditioned Transporter Networks, which augments Goal-Conditioned Transporter Networks with sequence-conditioning and weighted sampling.
arXiv Detail & Related papers (2021-09-15T21:19:11Z)
- Multi-Task Learning with Deep Neural Networks: A Survey [0.0]
Multi-task learning (MTL) is a subfield of machine learning in which multiple tasks are simultaneously learned by a shared model.
We give an overview of multi-task learning methods for deep neural networks, with the aim of summarizing both the well-established and most recent directions within the field.
arXiv Detail & Related papers (2020-09-10T19:31:04Z)
- Learning to Branch for Multi-Task Learning [12.49373126819798]
We present an automated multi-task learning algorithm that learns where to share or branch within a network.
We propose a novel tree-structured design space that casts a tree branching operation as a gumbel-softmax sampling procedure.
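A minimal sketch of the Gumbel-Softmax branching idea, assuming one layer with two candidate child branches and learnable per-task branch logits; the full tree-structured design space is omitted and the parameterization is illustrative.
```python
# Minimal sketch of learned branching via Gumbel-Softmax: each task samples an
# approximately one-hot choice over candidate child branches, so "where to
# branch" is trainable along with the weights. The single layer, two branches,
# and logit parameterization are illustrative, not the paper's full design space.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SoftBranchLayer(nn.Module):
    def __init__(self, d=64, n_branches=2, n_tasks=3):
        super().__init__()
        self.branches = nn.ModuleList([
            nn.Sequential(nn.Linear(d, d), nn.ReLU()) for _ in range(n_branches)
        ])
        # Learnable logits: for each task, which child branch to route through.
        self.branch_logits = nn.Parameter(torch.zeros(n_tasks, n_branches))

    def forward(self, x, task_id, tau=1.0):
        # Differentiable, approximately one-hot branch choice for this task.
        choice = F.gumbel_softmax(self.branch_logits[task_id], tau=tau, hard=True)
        outs = torch.stack([b(x) for b in self.branches])   # (n_branches, N, d)
        return torch.einsum("b,bnd->nd", choice, outs)

layer = SoftBranchLayer()
y = layer(torch.randn(8, 64), task_id=1)   # shape (8, 64)
```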
arXiv Detail & Related papers (2020-06-02T19:23:21Z)
- Deep Multimodal Neural Architecture Search [178.35131768344246]
We devise a generalized deep multimodal neural architecture search (MMnas) framework for various multimodal learning tasks.
Given multimodal input, we first define a set of primitive operations, and then construct a deep encoder-decoder based unified backbone.
On top of the unified backbone, we attach task-specific heads to tackle different multimodal learning tasks.
arXiv Detail & Related papers (2020-04-25T07:00:32Z)