Rethinking Neural Operations for Diverse Tasks
- URL: http://arxiv.org/abs/2103.15798v1
- Date: Mon, 29 Mar 2021 17:50:39 GMT
- Title: Rethinking Neural Operations for Diverse Tasks
- Authors: Nicholas Roberts and Mikhail Khodak and Tri Dao and Liam Li and
Christopher R\'e and Ameet Talwalkar
- Abstract summary: We introduce a search space of neural operations called XD-Operations that mimic the inductive bias of standard multichannel convolutions.
We show how to transform any backbone network such as LeNet or ResNet into an architecture search space over XD-Operations.
Our approach consistently yields models with lower error than baseline networks and sometimes even lower error than expert-designed domain-specific approaches.
- Score: 22.998431126801254
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: An important goal of neural architecture search (NAS) is to automate-away the
design of neural networks on new tasks in under-explored domains. Motivated by
this broader vision for NAS, we study the problem of enabling users to discover
the right neural operations given data from their specific domain. We introduce
a search space of neural operations called XD-Operations that mimic the
inductive bias of standard multichannel convolutions while being much more
expressive: we prove that XD-operations include many named operations across
several application areas. Starting with any standard backbone network such as
LeNet or ResNet, we show how to transform it into an architecture search space
over XD-operations and how to traverse the space using a simple weight-sharing
scheme. On a diverse set of applications--image classification, solving partial
differential equations (PDEs), and sequence modeling--our approach consistently
yields models with lower error than baseline networks and sometimes even lower
error than expert-designed domain-specific approaches.
Related papers
- Towards Scalable and Versatile Weight Space Learning [51.78426981947659]
This paper introduces the SANE approach to weight-space learning.
Our method extends the idea of hyper-representations towards sequential processing of subsets of neural network weights.
arXiv Detail & Related papers (2024-06-14T13:12:07Z) - Analyzing Populations of Neural Networks via Dynamical Model Embedding [10.455447557943463]
A core challenge in the interpretation of deep neural networks is identifying commonalities between the underlying algorithms implemented by distinct networks trained for the same task.
Motivated by this problem, we introduce DYNAMO, an algorithm that constructs low-dimensional manifold where each point corresponds to a neural network model, and two points are nearby if the corresponding neural networks enact similar high-level computational processes.
DYNAMO takes as input a collection of pre-trained neural networks and outputs a meta-model that emulates the dynamics of the hidden states as well as the outputs of any model in the collection.
arXiv Detail & Related papers (2023-02-27T19:00:05Z) - Permutation Equivariant Neural Functionals [92.0667671999604]
This work studies the design of neural networks that can process the weights or gradients of other neural networks.
We focus on the permutation symmetries that arise in the weights of deep feedforward networks because hidden layer neurons have no inherent order.
In our experiments, we find that permutation equivariant neural functionals are effective on a diverse set of tasks.
arXiv Detail & Related papers (2023-02-27T18:52:38Z) - The Multiple Subnetwork Hypothesis: Enabling Multidomain Learning by
Isolating Task-Specific Subnetworks in Feedforward Neural Networks [0.0]
We identify a methodology and network representational structure which allows a pruned network to employ previously unused weights to learn subsequent tasks.
We show that networks trained using our approaches are able to learn multiple tasks, which may be related or unrelated, in parallel or in sequence without sacrificing performance on any task or exhibiting catastrophic forgetting.
arXiv Detail & Related papers (2022-07-18T15:07:13Z) - NAS-Navigator: Visual Steering for Explainable One-Shot Deep Neural
Network Synthesis [53.106414896248246]
We present a framework that allows analysts to effectively build the solution sub-graph space and guide the network search by injecting their domain knowledge.
Applying this technique in an iterative manner allows analysts to converge to the best performing neural network architecture for a given application.
arXiv Detail & Related papers (2020-09-28T01:48:45Z) - Automated Search for Resource-Efficient Branched Multi-Task Networks [81.48051635183916]
We propose a principled approach, rooted in differentiable neural architecture search, to automatically define branching structures in a multi-task neural network.
We show that our approach consistently finds high-performing branching structures within limited resource budgets.
arXiv Detail & Related papers (2020-08-24T09:49:19Z) - Exploiting Heterogeneity in Operational Neural Networks by Synaptic
Plasticity [87.32169414230822]
Recently proposed network model, Operational Neural Networks (ONNs), can generalize the conventional Convolutional Neural Networks (CNNs)
In this study the focus is drawn on searching the best-possible operator set(s) for the hidden neurons of the network based on the Synaptic Plasticity paradigm that poses the essential learning theory in biological neurons.
Experimental results over highly challenging problems demonstrate that the elite ONNs even with few neurons and layers can achieve a superior learning performance than GIS-based ONNs.
arXiv Detail & Related papers (2020-08-21T19:03:23Z) - Finding Non-Uniform Quantization Schemes using Multi-Task Gaussian
Processes [12.798516310559375]
We show that with significantly lower precision in the last layers we achieve a minimal loss of accuracy with appreciable memory savings.
We test our findings on the CIFAR10 and ImageNet datasets using the VGG, ResNet and GoogLeNet architectures.
arXiv Detail & Related papers (2020-07-15T15:16:18Z) - VINNAS: Variational Inference-based Neural Network Architecture Search [2.685668802278155]
We present a differentiable variational inference-based NAS method for searching sparse convolutional neural networks.
Our method finds diverse network cells, while showing state-of-the-art accuracy with up to almost 2 times fewer non-zero parameters.
arXiv Detail & Related papers (2020-07-12T21:47:35Z) - MTL-NAS: Task-Agnostic Neural Architecture Search towards
General-Purpose Multi-Task Learning [71.90902837008278]
We propose to incorporate neural architecture search (NAS) into general-purpose multi-task learning (GP-MTL)
In order to adapt to different task combinations, we disentangle the GP-MTL networks into single-task backbones.
We also propose a novel single-shot gradient-based search algorithm that closes the performance gap between the searched architectures.
arXiv Detail & Related papers (2020-03-31T09:49:14Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.