Multi-task neural networks by learned contextual inputs
- URL: http://arxiv.org/abs/2303.00788v1
- Date: Wed, 1 Mar 2023 19:25:52 GMT
- Title: Multi-task neural networks by learned contextual inputs
- Authors: Anders T. Sandnes, Bjarne Grimstad, Odd Kolbjørnsen
- Abstract summary: The paper presents learned-context neural networks, a multi-task learning architecture based on a fully shared neural network and an augmented input vector containing trainable task parameters.
The architecture is interesting due to its powerful task adaptation mechanism, which facilitates a low-dimensional task parameter space.
The architecture's performance is compared to similar neural network architectures on ten datasets.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: This paper explores learned-context neural networks, a multi-task
learning architecture based on a fully shared neural network and an augmented
input vector containing trainable task parameters. The architecture is
interesting due to its powerful task adaptation mechanism, which facilitates a
low-dimensional task parameter space. Theoretically, we show that a scalar task
parameter is sufficient for universal approximation of all tasks, which is not
necessarily the case for more common architectures. Empirical evidence supports
the practicality of such a small task parameter space. The task parameter space
is found to be well-behaved, which simplifies workflows for updating models as
new data arrives and for training new tasks while the shared parameters are
frozen. Additionally, the architecture is robust in cases with few data points.
The architecture's performance is compared to that of similar neural network
architectures on ten datasets.
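For intuition, a minimal sketch of the learned-context idea follows: a single fully shared network receives the feature vector concatenated with a small trainable context vector owned by the task at hand, and adding a new task with frozen shared parameters reduces to optimising that task's context vector. This is an illustrative sketch, not the authors' implementation; names such as LearnedContextNet and context_dim are assumptions.

```python
# Minimal sketch (not the authors' code) of a learned-context network:
# a fully shared MLP receives [x, z_t], where z_t is a trainable,
# low-dimensional context vector belonging to task t.
import torch
import torch.nn as nn


class LearnedContextNet(nn.Module):
    def __init__(self, in_dim, hidden_dim, out_dim, num_tasks, context_dim=1):
        super().__init__()
        # One trainable context vector per task; context_dim=1 mirrors the
        # paper's claim that a scalar task parameter can suffice.
        self.context = nn.Embedding(num_tasks, context_dim)
        self.shared = nn.Sequential(
            nn.Linear(in_dim + context_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, out_dim),
        )

    def forward(self, x, task_ids):
        z = self.context(task_ids)              # (batch, context_dim)
        return self.shared(torch.cat([x, z], dim=-1))


model = LearnedContextNet(in_dim=8, hidden_dim=64, out_dim=1, num_tasks=10)
x = torch.randn(32, 8)
task_ids = torch.randint(0, 10, (32,))
y_hat = model(x, task_ids)                      # (32, 1)

# Workflow hinted at in the abstract: with the shared parameters frozen,
# fitting a new or updated task reduces to optimising its context vector.
for p in model.shared.parameters():
    p.requires_grad = False
optimizer = torch.optim.Adam(model.context.parameters(), lr=1e-2)
```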
Related papers
- Learning Compact Neural Networks with Deep Overparameterised Multitask Learning [0.0]
We present a simple, efficient and effective multitask learning overparameterisation neural network design.
Experiments on two challenging multitask datasets (NYUv2 and COCO) demonstrate the effectiveness of the proposed method.
arXiv Detail & Related papers (2023-08-25T10:51:02Z) - Combining Modular Skills in Multitask Learning [149.8001096811708]
A modular design encourages neural models to disentangle and recombine different facets of knowledge to generalise more systematically to new tasks.
In this work, we assume each task is associated with a subset of latent discrete skills from a (potentially small) inventory.
We find that the modular design of a network significantly increases sample efficiency in reinforcement learning and few-shot generalisation in supervised learning.
arXiv Detail & Related papers (2022-02-28T16:07:19Z) - DyTox: Transformers for Continual Learning with DYnamic TOken eXpansion [89.92242000948026]
We propose a transformer architecture based on a dedicated encoder/decoder framework.
Through a dynamic expansion of special tokens, we specialize each forward of our decoder network on a task distribution.
Our strategy scales to a large number of tasks while having negligible memory and time overheads.
arXiv Detail & Related papers (2021-11-22T16:29:06Z) - Conceptual Expansion Neural Architecture Search (CENAS) [1.3464152928754485]
We present an approach called Conceptual Expansion Neural Architecture Search (CENAS).
It combines a sample-efficient, computational-creativity-inspired transfer learning approach with neural architecture search.
It finds models faster than naive architecture search by transferring existing weights to approximate the parameters of the new model.
arXiv Detail & Related papers (2021-10-07T02:29:26Z) - Elastic Architecture Search for Diverse Tasks with Different Resources [87.23061200971912]
We study a new challenging problem of efficient deployment for diverse tasks with different resources, where the resource constraint and task of interest corresponding to a group of classes are dynamically specified at testing time.
Previous NAS approaches seek to design architectures for all classes simultaneously, which may not be optimal for some individual tasks.
We present a novel and general framework, called Elastic Architecture Search (EAS), permitting instant specializations at runtime for diverse tasks with various resource constraints.
arXiv Detail & Related papers (2021-08-03T00:54:27Z) - Rethinking Hard-Parameter Sharing in Multi-Task Learning [20.792654758645302]
Hard parameter sharing in multi-task learning (MTL) allows tasks to share some of the model parameters, reducing storage cost and improving prediction accuracy.
The common sharing practice is to share the bottom layers of a deep neural network among tasks while using separate top layers for each task (a minimal sketch of this pattern follows the list below).
Using separate bottom-layer parameters could achieve significantly better performance than the common practice.
arXiv Detail & Related papers (2021-07-23T17:26:40Z) - Neural Architecture Search From Fréchet Task Distance [50.9995960884133]
We show how the distance between a target task and each task in a given set of baseline tasks can be used to reduce the neural architecture search space for the target task.
The complexity reduction in search space for task-specific architectures is achieved by building on the optimized architectures for similar tasks instead of doing a full search without using this side information.
arXiv Detail & Related papers (2021-03-23T20:43:31Z) - Exploring Flip Flop memories and beyond: training recurrent neural networks with key insights [0.0]
We study the implementation of a temporal processing task, specifically a 3-bit Flip Flop memory.
The obtained networks are meticulously analyzed to elucidate dynamics, aided by an array of visualization and analysis tools.
arXiv Detail & Related papers (2020-10-15T16:25:29Z) - Emerging Relation Network and Task Embedding for Multi-Task Regression Problems [5.953831950062808]
Multi-task learning (MTL) provides state-of-the-art results in many applications of computer vision and natural language processing.
This article provides a comparative study of several recent and important MTL architectures.
We introduce a new MTL architecture named the Emerging Relation Network (ERN), which can be considered an extension of the sluice network.
arXiv Detail & Related papers (2020-04-29T09:02:24Z) - MTL-NAS: Task-Agnostic Neural Architecture Search towards General-Purpose Multi-Task Learning [71.90902837008278]
We propose to incorporate neural architecture search (NAS) into general-purpose multi-task learning (GP-MTL).
In order to adapt to different task combinations, we disentangle the GP-MTL networks into single-task backbones.
We also propose a novel single-shot gradient-based search algorithm that closes the performance gap between the searched architectures and the final evaluation architecture.
arXiv Detail & Related papers (2020-03-31T09:49:14Z) - Adversarial Continual Learning [99.56738010842301]
We propose a hybrid continual learning framework that learns a disjoint representation for task-invariant and task-specific features.
Our model combines architecture growth to prevent forgetting of task-specific skills and an experience replay approach to preserve shared skills.
arXiv Detail & Related papers (2020-03-21T02:08:17Z)
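As referenced in the hard-parameter-sharing entry above, here is a minimal sketch of that common baseline for contrast with the learned-context design: tasks share the bottom layers and each keeps its own top layer. Names such as HardSharingNet are illustrative assumptions; the code is not taken from any of the listed papers.

```python
# Illustrative sketch (not from the cited paper) of hard parameter sharing:
# tasks share the bottom layers and keep separate top layers ("heads").
import torch
import torch.nn as nn


class HardSharingNet(nn.Module):
    def __init__(self, in_dim, hidden_dim, out_dim, num_tasks):
        super().__init__()
        self.bottom = nn.Sequential(            # shared among all tasks
            nn.Linear(in_dim, hidden_dim),
            nn.ReLU(),
        )
        self.heads = nn.ModuleList(             # one top layer per task
            [nn.Linear(hidden_dim, out_dim) for _ in range(num_tasks)]
        )

    def forward(self, x, task_id):
        return self.heads[task_id](self.bottom(x))


model = HardSharingNet(in_dim=8, hidden_dim=64, out_dim=1, num_tasks=10)
y_hat = model(torch.randn(32, 8), task_id=3)    # predictions for task 3
```

Unlike this baseline, where per-task capacity lives in separate heads, the learned-context design keeps all weights shared and gives each task only a low-dimensional context vector.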