The Multiple Subnetwork Hypothesis: Enabling Multidomain Learning by
Isolating Task-Specific Subnetworks in Feedforward Neural Networks
- URL: http://arxiv.org/abs/2207.08821v1
- Date: Mon, 18 Jul 2022 15:07:13 GMT
- Authors: Jacob Renn, Ian Sotnek, Benjamin Harvey, Brian Caffo
- Abstract summary: We identify a methodology and network representational structure which allows a pruned network to employ previously unused weights to learn subsequent tasks.
We show that networks trained using our approaches are able to learn multiple tasks, which may be related or unrelated, in parallel or in sequence without sacrificing performance on any task or exhibiting catastrophic forgetting.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Neural networks have seen an explosion of usage and research in the past
decade, particularly within the domains of computer vision and natural language
processing. However, only recently have advancements in neural networks yielded
performance improvements beyond narrow applications and translated to expanded
multitask models capable of generalizing across multiple data types and
modalities. Simultaneously, it has been shown that neural networks are
overparameterized to a high degree, and pruning techniques have proved capable
of significantly reducing the number of active weights within the network while
largely preserving performance. In this work, we identify a methodology and
network representational structure which allows a pruned network to employ
previously unused weights to learn subsequent tasks. We evaluate these
methodologies on well-known benchmark datasets and show
that networks trained using our approaches are able to learn multiple tasks,
which may be related or unrelated, in parallel or in sequence without
sacrificing performance on any task or exhibiting catastrophic forgetting.
Related papers
- Towards Scalable and Versatile Weight Space Learning [51.78426981947659]
This paper introduces the SANE approach to weight-space learning.
Our method extends the idea of hyper-representations towards sequential processing of subsets of neural network weights.
arXiv Detail & Related papers (2024-06-14T13:12:07Z)
- Neural Subnetwork Ensembles [2.44755919161855]
This dissertation introduces and formalizes a low-cost framework for constructing Subnetwork Ensembles.
Child networks are formed by sampling, perturbing, and optimizing subnetworks from a trained parent model.
Our findings reveal that this approach can greatly improve training efficiency, parametric utilization, and generalization performance.
arXiv Detail & Related papers (2023-11-23T17:01:16Z)
- Riemannian Residual Neural Networks [58.925132597945634]
We show how to extend the residual neural network (ResNet) to general Riemannian manifolds.
ResNets have become ubiquitous in machine learning due to their beneficial learning properties, excellent empirical results, and easy-to-incorporate nature when building varied neural networks.
arXiv Detail & Related papers (2023-10-16T02:12:32Z)
- Provable Guarantees for Nonlinear Feature Learning in Three-Layer Neural Networks [49.808194368781095]
We show that three-layer neural networks have provably richer feature learning capabilities than two-layer networks.
This work makes progress towards understanding the provable benefit of three-layer neural networks over two-layer networks in the feature learning regime.
arXiv Detail & Related papers (2023-05-11T17:19:30Z)
- Dynamic Neural Network for Multi-Task Learning Searching across Diverse Network Topologies [14.574399133024594]
We present a new MTL framework that searches for optimized structures for multiple tasks with diverse graph topologies.
We design a restricted DAG-based central network with read-in/read-out layers to build topologically diverse task-adaptive structures.
arXiv Detail & Related papers (2023-03-13T05:01:50Z)
- Quasi-orthogonality and intrinsic dimensions as measures of learning and generalisation [55.80128181112308]
We show that the dimensionality and quasi-orthogonality of a neural network's feature space may jointly serve as discriminants of its performance.
Our findings suggest important relationships between the networks' final performance and properties of their randomly initialised feature spaces.
arXiv Detail & Related papers (2022-03-30T21:47:32Z)
- What can linearized neural networks actually say about generalization? [67.83999394554621]
In certain infinitely-wide neural networks, the neural tangent kernel (NTK) theory fully characterizes generalization.
We show that the linear approximations can indeed rank the learning complexity of certain tasks for neural networks.
Our work provides concrete examples of novel deep learning phenomena which can inspire future theoretical research.
arXiv Detail & Related papers (2021-06-12T13:05:11Z)
- Topological Uncertainty: Monitoring trained neural networks through persistence of activation graphs [0.9786690381850356]
In industrial applications, data coming from an open-world setting might widely differ from the benchmark datasets on which a network was trained.
We develop a method to monitor trained neural networks based on the topological properties of their activation graphs.
arXiv Detail & Related papers (2021-05-07T14:16:03Z)
- Graph-Based Neural Network Models with Multiple Self-Supervised Auxiliary Tasks [79.28094304325116]
Graph Convolutional Networks are among the most promising approaches for capturing relationships among structured data points.
We propose three novel self-supervised auxiliary tasks to train graph-based neural network models in a multi-task fashion.
arXiv Detail & Related papers (2020-11-14T11:09:51Z)